Multimedia Databases

KUTM-FID Database – 2019

The food intake database (KUTM-FID) is constructed from tracheal TM (throat microphone) recordings. It contains recordings of 8 subjects, 4 male and 4 female, aged 22 to 29, none of whom has a history of chewing or swallowing abnormalities. Participants did not receive incentives for participation. Intake signals are collected under clean laboratory conditions with the iASUS NT3 TM at a 16 kHz sampling rate. Subjects consume the prepared foods in the same amount and order. A total of 10 different intake tasks are recorded: chewing and swallowing of solids (potato chips, cake, biscuits, stick crackers, peanuts, and chocolate), and swallowing of liquids (water, milk, fizzy drink, and fruit juice). Each subject visits the laboratory 5 times and consumes all 10 food items in each recording session. The average duration of a visit is around 7 min, and the total duration of the database is 276 min.
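
As a quick consistency check on the figures above, the following sketch derives the number of recording sessions and the mean session length from the stated totals. The function name `session_summary` is hypothetical, and the count of task recordings assumes that every subject completed all 10 tasks at every visit, which the description implies but does not state outright.

```python
# Figures taken from the KUTM-FID description; nothing here is measured data.
SUBJECTS = 8
VISITS_PER_SUBJECT = 5
TASKS_PER_VISIT = 10
TOTAL_MINUTES = 276


def session_summary(subjects=SUBJECTS, visits=VISITS_PER_SUBJECT,
                    tasks=TASKS_PER_VISIT, total_minutes=TOTAL_MINUTES):
    """Derive session counts and mean session length from the stated totals."""
    sessions = subjects * visits
    return {
        "sessions": sessions,
        # Assumption: all tasks were recorded at every visit.
        "task_recordings": sessions * tasks,
        "mean_minutes_per_session": total_minutes / sessions,
    }


summary = session_summary()
print(summary)
```

The derived mean of just under 7 minutes per session agrees with the stated average visit duration.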

The KUTM-FID database will be available soon for academic purposes.

JESTKOD Database – 2016

The JESTKOD database consists of dyadic interaction recordings of 10 participants, 4 female and 6 male, aged 20 to 25. Agreement and disagreement interactions of the 5 dyads are collected in 5 sessions, all in Turkish. Each participant interacted with the same partner in both the agreement and disagreement settings and appeared in only one session. Each session contains 19-23 clip recordings of 2-4 minutes; in each clip, the participants pick a topic on which they agree or disagree and engage in a dyadic interaction. The total duration of the recordings is 259 minutes.

The JESTKOD database is available upon request for academic purposes. You may contact E. Erzin.

Sample files from the JESTKOD database:

Audio-Visual Database (MVGL-AVD)

The MVGL audio-visual database has been collected for multimodal speaker identification/verification applications. The audio-visual data have been acquired using a Sony DSR-PD150P video camera at the Multimedia Vision and Graphics Laboratory of Koç University. The database includes 50 subjects, each of whom utters ten repetitions of her/his name as the secret phrase. A set of impostor data is also available, in which each subject utters five different names from the population.

The MVGL-AVD database is available upon request for academic purposes.

Sample images and videos from the MVGL-AVD database:

Story Telling Audio-Visual Database (MVGL-MASAL)

MVGL-MASAL is a gesture-speech database. It includes four recordings of a single subject telling stories in Turkish. Each story is approximately 7 minutes long, and the total duration of the database is 27 minutes and 45 seconds. The audio-visual data is captured synchronously from the stereo camera and sound card. The stereo video includes only upper-body gestures at 30 frames per second, whereas the audio is recorded at a 16 kHz sampling rate with 16 bits per sample.
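
To give a sense of scale, the sketch below turns the stated recording parameters into sample and frame counts. It assumes uncompressed PCM audio on a single channel and counts video frames per camera view; neither the channel count nor the storage format is stated in the description, so both are labeled as assumptions in the code.

```python
# Parameters taken from the MVGL-MASAL description.
DURATION_SECONDS = 27 * 60 + 45   # 27 minutes and 45 seconds
SAMPLE_RATE_HZ = 16_000
BITS_PER_SAMPLE = 16
FRAMES_PER_SECOND = 30

# Assumption: mono audio stored as uncompressed PCM (not stated in the source).
AUDIO_CHANNELS = 1

audio_samples = DURATION_SECONDS * SAMPLE_RATE_HZ * AUDIO_CHANNELS
audio_bytes = audio_samples * BITS_PER_SAMPLE // 8

# Frame count for one camera view; a stereo rig would record this per view.
video_frames_per_view = DURATION_SECONDS * FRAMES_PER_SECOND

print(f"audio samples: {audio_samples}")
print(f"raw audio size: {audio_bytes / 1_000_000:.2f} MB")
print(f"video frames per view: {video_frames_per_view}")
```

Under these assumptions the raw audio is on the order of tens of megabytes, so the video stream dominates storage.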