MIMO Speech Compression and Enhancement Based on Convolutional Denoising Autoencoder

Paper Title

MIMO Speech Compression and Enhancement Based on Convolutional Denoising Autoencoder

Authors

You-Jin Li, Syu-Siang Wang, Yu Tsao, Borching Su

Abstract

For speech-related applications in IoT environments, identifying effective methods to handle interfering noise and compress the amount of transmitted data is essential to achieving high-quality services. In this study, we propose a novel multi-input multi-output speech compression and enhancement (MIMO-SCE) system based on a convolutional denoising autoencoder (CDAE) model to simultaneously improve speech quality and reduce the dimensions of transmission data. Compared with conventional single-channel and multi-input single-output systems, MIMO systems can be employed in applications in which multiple acoustic signals need to be handled. We investigated two CDAE models, a fully convolutional network (FCN) and a Sinc FCN, as the core models in MIMO systems. The experimental results confirm that the proposed MIMO-SCE framework effectively improves speech quality and intelligibility while reducing the amount of recording data by a factor of 7 for transmission.
