New Corpus Released for Child Speech Recognition

Researchers have released the GaMMA corpus, a new scientific dataset for studying child speech. The corpus contains detailed recordings of Danish children in both quiet and noisy environments, with synchronized gaze, speech, and motion data. The resource is designed to help train and benchmark automatic speech recognition (ASR) models on naturalistic, child-specific speech in classroom-like conditions.

- The GaMMA corpus is the result of a collaboration between the Technical University of Denmark (DTU) and the University of Copenhagen, involving researchers from hearing science, computer science, and linguistics.
- The dataset contains recordings from 88 Danish children between the ages of five and seven.
- Each child participated in four recording sessions, resulting in a longitudinal corpus that can be used to study developmental changes in speech.
- The total corpus size is approximately 29.5 hours of annotated data, making it a significant resource for training and testing speech recognition models.
- To simulate a classroom environment, the data was collected in a controlled setting with carefully calibrated noise, including multi-talker babble from other children, played through a surrounding array of 64 loudspeakers.
- The multimodal nature of the corpus is a key feature: it includes synchronized binaural audio, high-resolution video of facial expressions and gestures, and eye-gaze data.
- The researchers used a variety of tasks to elicit both scripted and spontaneous speech, including picture naming, sentence repetition, and conversation with an adult.
- Beyond speech recognition, the detailed multimodal data is intended to support research in child language acquisition, human-computer interaction, and the development of hearing assistive devices.
