Title
Adaptive Multi-Corpora Language Model Training for Speech Recognition
Authors
Abstract
Neural network language model (NNLM) plays an essential role in automatic speech recognition (ASR) systems, especially in adaptation tasks when text-only data is available. In practice, an NNLM is typically trained on a combination of data sampled from multiple corpora. Thus, the data sampling strategy is important to the adaptation performance. Most existing works focus on designing static sampling strategies. However, each corpus may show varying impacts at different NNLM training stages. In this paper, we introduce a novel adaptive multi-corpora training algorithm that dynamically learns and adjusts the sampling probability of each corpus along the training process. The algorithm is robust to corpora sizes and domain relevance. Compared with static sampling strategy baselines, the proposed approach yields remarkable improvement by achieving up to relative 7% and 9% word error rate (WER) reductions on in-domain and out-of-domain adaptation tasks, respectively.
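The core idea, re-weighting corpora dynamically as training progresses, can be sketched generically. The snippet below is a minimal illustration only, not the paper's algorithm: the abstract does not specify the update rule, so the softmax-over-validation-loss heuristic, temperature parameter, and function names here are all assumptions for illustration.

```python
import math
import random


def update_sampling_probs(dev_losses, temperature=1.0):
    """Map per-corpus validation losses to sampling probabilities via softmax.

    Hypothetical heuristic (not the paper's method): corpora with higher
    current validation loss receive more probability mass, so the sampler
    shifts toward data the model currently handles worst.
    """
    weights = [math.exp(loss / temperature) for loss in dev_losses]
    total = sum(weights)
    return [w / total for w in weights]


def sample_corpus(probs, rng=random):
    """Draw one corpus index according to the current probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

In an actual adaptive training loop, one would recompute `update_sampling_probs` periodically (e.g. every few hundred steps) from held-out losses per corpus, then draw each minibatch's source corpus with `sample_corpus`, so the mixture tracks each corpus's usefulness at that training stage.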