Paper title
The SLT 2021 children speech recognition challenge: Open datasets, rules and baselines
Paper authors
Paper abstract
Automatic speech recognition (ASR) has advanced significantly with the use of deep learning and big data. However, improving robustness, including achieving equally good performance across diverse speakers and accents, remains a challenging problem. In particular, the performance of children speech recognition (CSR) still lags behind for two reasons: 1) the speech and language characteristics of children's voices differ substantially from those of adults, and 2) no sizable open dataset of children's speech is yet available to the research community. To address these problems, we launch the Children Speech Recognition Challenge (CSRC) as a flagship satellite event of the IEEE SLT 2021 workshop. The challenge will release about 400 hours of Mandarin speech data to registered teams, set up two challenge tracks, and provide a common testbed for benchmarking CSR performance. In this paper, we introduce the datasets, rules, evaluation method, and baselines.