Paper Title

MarbleNet: Deep 1D Time-Channel Separable Convolutional Neural Network for Voice Activity Detection

Authors

Fei Jia, Somshubra Majumdar, Boris Ginsburg

Abstract

We present MarbleNet, an end-to-end neural network for Voice Activity Detection (VAD). MarbleNet is a deep residual network composed of blocks of 1D time-channel separable convolution, batch normalization, ReLU, and dropout layers. Compared to a state-of-the-art VAD model, MarbleNet achieves similar performance with roughly one-tenth the parameter cost. We further conduct extensive ablation studies on different training methods and parameter choices to study the robustness of MarbleNet in real-world VAD tasks.
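
To give a feel for why separable convolutions keep the parameter count low, here is a minimal sketch that compares weight counts for a standard 1D convolution and for a depthwise-plus-pointwise pair, which is the usual way a time-channel separable convolution is factored. The layer sizes (`C_IN`, `C_OUT`, `KERNEL`) are hypothetical values chosen for illustration, not taken from the paper, biases are ignored, and the resulting ratio is unrelated to the one-tenth figure quoted in the abstract, which compares whole models.

```python
def standard_conv1d_params(c_in: int, c_out: int, kernel: int) -> int:
    """Weight count of a standard 1D convolution (bias ignored)."""
    return c_in * c_out * kernel


def separable_conv1d_params(c_in: int, c_out: int, kernel: int) -> int:
    """Weight count of a depthwise + pointwise convolution pair (bias ignored)."""
    depthwise = c_in * kernel   # one length-`kernel` filter per input channel, along time
    pointwise = c_in * c_out    # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes for illustration only; not taken from the paper.
C_IN, C_OUT, KERNEL = 32, 32, 11

standard = standard_conv1d_params(C_IN, C_OUT, KERNEL)
separable = separable_conv1d_params(C_IN, C_OUT, KERNEL)
print(f"standard:  {standard} weights")
print(f"separable: {separable} weights")
print(f"ratio:     {standard / separable:.1f}x")
```

The saving grows with kernel size, since the kernel length multiplies only the depthwise term rather than the full channel-by-channel product.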
