Paper title
Surgical Mask Detection with Convolutional Neural Networks and Data Augmentations on Spectrograms
Paper authors
Abstract
In many fields of research, labeled datasets are hard to acquire. Data augmentation promises to overcome this lack of training data in the context of neural network engineering and classification tasks. The idea is to reduce model overfitting to the feature distribution of a small, under-descriptive training dataset. We evaluate such data augmentation techniques to gain insight into the performance boost they provide for several convolutional neural networks on mel-spectrogram representations of audio data. We show the impact of data augmentation on the binary classification task of surgical mask detection in samples of human voice (ComParE Challenge 2020). We also consider four different architectures to account for augmentation robustness. Results show that most of the baselines given by ComParE are outperformed.
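To make the idea of augmenting spectrograms concrete, the following is a minimal sketch of a few generic augmentations applied to a mel-spectrogram stored as a 2-D array of shape (mel bins, frames). The abstract does not name the specific augmentations the paper evaluates, so the choice of operations (circular time shift, additive Gaussian noise, band masking), all function names, and all parameter values here are assumptions made purely for illustration. A random array stands in for a real spectrogram.

```python
import numpy as np


def time_shift(spec, max_shift, rng):
    """Circularly shift the spectrogram along the time axis (axis 1)."""
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(spec, shift, axis=1)


def add_noise(spec, std, rng):
    """Add zero-mean Gaussian noise to every time-frequency bin."""
    return spec + rng.normal(0.0, std, size=spec.shape)


def mask_band(spec, max_width, axis, rng):
    """Overwrite one random band with the spectrogram's minimum value.

    axis=0 masks a band of mel bins, axis=1 masks a band of frames.
    """
    out = spec.copy()
    width = int(rng.integers(0, max_width + 1))
    start = int(rng.integers(0, spec.shape[axis] - width + 1))
    index = [slice(None), slice(None)]
    index[axis] = slice(start, start + width)
    out[tuple(index)] = spec.min()
    return out


def augment(spec, rng):
    """Chain the illustrative augmentations (parameter values are arbitrary)."""
    spec = time_shift(spec, max_shift=8, rng=rng)
    spec = add_noise(spec, std=0.05, rng=rng)
    spec = mask_band(spec, max_width=6, axis=0, rng=rng)
    spec = mask_band(spec, max_width=10, axis=1, rng=rng)
    return spec


rng = np.random.default_rng(0)
mel = rng.random((64, 100))  # stand-in for a (mel bins, frames) spectrogram
augmented = augment(mel, rng)
print(augmented.shape)
```

In a training loop, a function like `augment` would typically be applied on the fly to each training sample only, so that validation and test spectrograms stay unmodified.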