Improving GANs for Speech Enhancement

Paper Title

Improving GANs for Speech Enhancement

Authors

Huy Phan, Ian V. McLoughlin, Lam Pham, Oliver Y. Chén, Philipp Koch, Maarten De Vos, Alfred Mertins

Abstract

Generative adversarial networks (GAN) have recently been shown to be efficient for speech enhancement. However, most, if not all, existing speech enhancement GANs (SEGAN) make use of a single generator to perform one-stage enhancement mapping. In this work, we propose to use multiple generators that are chained to perform multi-stage enhancement mapping, which gradually refines the noisy input signals in a stage-wise fashion. Furthermore, we study two scenarios: (1) the generators share their parameters and (2) the generators' parameters are independent. The former constrains the generators to learn a common mapping that is iteratively applied at all enhancement stages and results in a small model footprint. On the contrary, the latter allows the generators to flexibly learn different enhancement mappings at different stages of the network at the cost of an increased model size. We demonstrate that the proposed multi-stage enhancement approach outperforms the one-stage SEGAN baseline, where the independent generators lead to more favorable results than the tied generators. The source code is available at http://github.com/pquochuy/idsegan.
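
The chaining idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy smoothing filter below is a hypothetical stand-in for a trained generator, and the names `make_toy_generator`, `chain_enhance`, `tied_generators`, and `independent_generators` are assumptions introduced only for this example. The sketch shows only the structural difference between the two scenarios: tied generators reuse one parameter set at every stage, while independent generators hold one parameter set per stage.

```python
from typing import Callable, List, Sequence

Signal = List[float]
Generator = Callable[[Signal], Signal]


def make_toy_generator(strength: float) -> Generator:
    """Hypothetical stand-in for a trained generator: a 3-tap smoother.

    `strength` in [0, 1] blends each sample with the mean of its neighbours.
    A real speech enhancement generator is a learned network, not this filter.
    """
    def generate(x: Signal) -> Signal:
        out = []
        for i, sample in enumerate(x):
            left = x[max(i - 1, 0)]              # clamp at the left edge
            right = x[min(i + 1, len(x) - 1)]    # clamp at the right edge
            local_mean = (left + sample + right) / 3.0
            out.append((1.0 - strength) * sample + strength * local_mean)
        return out
    return generate


def chain_enhance(noisy: Signal, generators: Sequence[Generator]) -> List[Signal]:
    """Apply the generators in order and return the output of every stage.

    Each stage receives the previous stage's output, so the signal is
    refined stage by stage.
    """
    outputs = []
    current = noisy
    for generator in generators:
        current = generator(current)
        outputs.append(current)
    return outputs


def tied_generators(num_stages: int, strength: float) -> List[Generator]:
    """Scenario (1): one shared generator applied at every stage."""
    shared = make_toy_generator(strength)
    return [shared] * num_stages


def independent_generators(strengths: Sequence[float]) -> List[Generator]:
    """Scenario (2): a separate generator, with its own parameter, per stage."""
    return [make_toy_generator(s) for s in strengths]


noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
tied = tied_generators(num_stages=2, strength=0.5)
independent = independent_generators([0.5, 0.25])

tied_outputs = chain_enhance(noisy, tied)
independent_outputs = chain_enhance(noisy, independent)
```

In this sketch the tied list stores a single parameter no matter how many stages are chained, which mirrors the small model footprint mentioned in the abstract. The independent list grows with the number of stages.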
