Paper Title

Optimizing voice conversion network with cycle consistency loss of speaker identity

Authors

Hongqiang Du, Xiaohai Tian, Lei Xie, Haizhou Li

Abstract

We propose a novel training scheme that optimizes a voice conversion network with a speaker identity loss function. The scheme minimizes not only the frame-level spectral loss but also a speaker identity loss. We introduce a cycle consistency loss that constrains the converted speech to keep the same speaker identity as the reference speech at the utterance level. Although the proposed training scheme is applicable to any voice conversion network, in this paper we formulate the study under the average model voice conversion framework. Experiments conducted on the CMU-ARCTIC and CSTR-VCTK corpora confirm that the proposed method outperforms the baseline methods in terms of speaker similarity.
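
The combined objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the mean-pooling `toy_speaker_embedding`, the cosine-distance form of the identity loss, the squared-error spectral loss, and the `weight` parameter are all assumptions made for illustration, since the abstract does not specify them. In practice a trained speaker encoder would replace the toy embedding.

```python
import math


def spectral_loss(converted, target):
    """Frame-level loss: mean squared error over all frames and feature bins."""
    total, count = 0.0, 0
    for c_frame, t_frame in zip(converted, target):
        for c, t in zip(c_frame, t_frame):
            total += (c - t) ** 2
            count += 1
    return total / count


def toy_speaker_embedding(frames):
    """Hypothetical stand-in for a speaker encoder: average the frames
    into a single utterance-level vector."""
    n, dim = len(frames), len(frames[0])
    return [sum(frame[d] for frame in frames) / n for d in range(dim)]


def speaker_identity_loss(converted, reference, embed=toy_speaker_embedding):
    """Utterance-level loss: 1 - cosine similarity between the embeddings of
    the converted and reference utterances (assumes non-zero embeddings)."""
    a, b = embed(converted), embed(reference)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def total_loss(converted, target, reference, weight=1.0):
    """Spectral loss plus the weighted speaker identity loss (weight is assumed)."""
    return spectral_loss(converted, target) + weight * speaker_identity_loss(
        converted, reference
    )
```

Here `converted`, `target`, and `reference` are lists of feature frames (each a list of floats). The spectral term compares the converted and target utterances frame by frame, while the identity term compares one embedding per utterance, so the reference utterance does not need to be time-aligned with the converted one.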
