Paper Title

Investigation of End-To-End Speaker-Attributed ASR for Continuous Multi-Talker Recordings

Paper Authors

Naoyuki Kanda, Xuankai Chang, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka

Paper Abstract

Recently, an end-to-end (E2E) speaker-attributed automatic speech recognition (SA-ASR) model was proposed as a joint model of speaker counting, speech recognition and speaker identification for monaural overlapped speech. It showed promising results for simulated speech mixtures consisting of various numbers of speakers. However, the model required prior knowledge of speaker profiles to perform speaker identification, which significantly limited the application of the model. In this paper, we extend the prior work by addressing the case where no speaker profile is available. Specifically, we perform speaker counting and clustering by using the internal speaker representations of the E2E SA-ASR model to diarize the utterances of the speakers whose profiles are missing from the speaker inventory. We also propose a simple modification to the reference labels of the E2E SA-ASR training which helps handle continuous multi-talker recordings well. We conduct a comprehensive investigation of the original E2E SA-ASR and the proposed method on the monaural LibriCSS dataset. Compared to the original E2E SA-ASR with relevant speaker profiles, the proposed method achieves a close performance without any prior speaker knowledge. We also show that the source-target attention in the E2E SA-ASR model provides information about the start and end times of the hypotheses.
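To make the profile-free diarization idea in the abstract concrete, below is a minimal sketch (not the paper's implementation) of clustering a model's internal per-utterance speaker representations to group utterances by speaker and implicitly count speakers. The random embeddings stand in for the E2E SA-ASR speaker query vectors, and the distance threshold is a hypothetical knob that would be tuned on development data.

```python
# Sketch: speaker counting + clustering over internal speaker embeddings.
# Assumptions: embeddings are stand-ins for the E2E SA-ASR internal speaker
# representations; the 1.0 threshold is hypothetical, not from the paper.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
# One 128-dim embedding per recognized utterance (placeholder data).
embeddings = rng.normal(size=(10, 128))
# L2-normalize so Euclidean distance tracks cosine similarity.
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

# n_clusters=None with a distance_threshold lets the clustering decide the
# number of clusters itself, i.e., it performs speaker counting implicitly.
clustering = AgglomerativeClustering(
    n_clusters=None,
    distance_threshold=1.0,  # hypothetical value; tune on dev data
    metric="euclidean",
    linkage="average",
)
speaker_labels = clustering.fit_predict(embeddings)
print(speaker_labels)  # one pseudo-speaker ID per utterance
```

Each resulting cluster ID plays the role of a speaker profile that was missing from the inventory; utterances sharing an ID are attributed to the same (anonymous) speaker.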
