Paper title

The effect of speech pathology on automatic speaker verification -- a large-scale study

Authors

Soroosh Tayebi Arasteh, Tobias Weise, Maria Schuster, Elmar Noeth, Andreas Maier, Seung Hee Yang

Abstract

One of the primary hurdles in data-driven speech processing is access to reliable pathological speech data. While public datasets appear to offer a solution, they carry an inherent risk of unintentionally exposing patient health information through re-identification attacks. Using a comprehensive real-world pathological speech corpus with more than 3,800 test subjects spanning various age groups and speech disorders, we employed a deep-learning-driven automatic speaker verification (ASV) approach. This yielded a mean equal error rate (EER) of 0.89% with a standard deviation of 0.06%, outstripping traditional benchmarks. Our assessments demonstrate that pathological speech overall faces a higher risk of privacy breaches than healthy speech. Specifically, adults with dysphonia are at heightened risk of re-identification, whereas conditions like dysarthria yield results comparable to those of healthy speakers. Crucially, speech intelligibility does not influence the ASV system's performance metrics. In pediatric cases, particularly those with cleft lip and palate, the recording environment plays a decisive role in re-identification. Merging data across pathological types led to a marked decrease in EER, suggesting the potential benefits of pathological diversity in ASV, accompanied by a logarithmic boost in ASV effectiveness. In essence, this research sheds light on the dynamics between pathological speech and speaker verification, emphasizing its crucial role in safeguarding patient confidentiality in our increasingly digitized healthcare era.
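
The abstract reports its headline result as an equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. The sketch below is only an illustration of that metric, not the authors' evaluation code: the function names, the toy similarity scores, and the per-run EER values are all hypothetical, and it assumes that a higher score means the two recordings are more likely to come from the same speaker.

```python
from statistics import mean, stdev


def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by scanning every observed score as a threshold.

    genuine_scores: scores of same-speaker trials (hypothetical input).
    impostor_scores: scores of different-speaker trials (hypothetical input).
    Returns (eer, threshold) at the threshold where FAR and FRR are closest.
    """
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best = None
    for t in thresholds:
        # False acceptance rate: impostor trials accepted (score >= t).
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        # False rejection rate: genuine trials rejected (score < t).
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


def summarize_runs(eers):
    """Mean and sample standard deviation of EERs from repeated runs."""
    return mean(eers), stdev(eers)


# Toy scores, invented purely for illustration.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2%} at threshold {threshold}")

# Hypothetical per-run EERs, summarized as mean and standard deviation.
run_mean, run_std = summarize_runs([0.0085, 0.0095, 0.0087])
print(f"mean EER = {run_mean:.2%}, std = {run_std:.2%}")
```

In this reading, a lower EER means speakers are easier to verify, which in the re-identification setting corresponds to a higher privacy risk for the speakers in the corpus.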
