Paper Title

Iterative Label Improvement: Robust Training by Confidence Based Filtering and Dataset Partitioning

Paper Authors

Christian Haase-Schütz, Rainer Stal, Heinz Hertlein, Bernhard Sick

Paper Abstract

State-of-the-art, high-capacity deep neural networks not only require large amounts of labelled training data, but are also highly susceptible to label errors in this data, typically resulting in large efforts and costs and therefore limiting the applicability of deep learning. To alleviate this issue, we propose a novel meta training and labelling scheme that is able to use inexpensive unlabelled data by taking advantage of the generalization power of deep neural networks. We show experimentally that by relying solely on one network architecture and our proposed scheme of iterative training and prediction steps, both label quality and resulting model accuracy can be improved significantly. Our method achieves state-of-the-art results, while being architecture agnostic and therefore broadly applicable. Compared to other methods dealing with erroneous labels, our approach neither requires another network to be trained, nor does it necessarily need an additional, highly accurate reference label set. Instead of removing samples from a labelled set, our technique uses additional sensor data without the need for manual labelling. Furthermore, our approach can be used for semi-supervised learning.
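
The abstract outlines an iterative loop: partition the (possibly noisy) dataset, train on one part, predict on the rest, and accept only high-confidence predictions as improved labels. Below is a minimal, hypothetical sketch of such a loop in Python; the function name, the confidence threshold, the number of rounds and partitions, and the use of scikit-learn's LogisticRegression as a cheap stand-in for a deep network are all illustrative assumptions, not the authors' exact procedure.

```python
# Sketch of an iterative label improvement loop with confidence-based
# filtering and dataset partitioning. All hyperparameters and the choice
# of classifier are illustrative assumptions, not the paper's settings.
import numpy as np
from sklearn.linear_model import LogisticRegression

def iterative_label_improvement(X, y_noisy, n_rounds=5, n_parts=2,
                                conf_threshold=0.9, seed=0):
    """Iteratively relabel samples whose predicted label is high-confidence.

    X:        feature matrix, shape (n_samples, n_features)
    y_noisy:  possibly erroneous integer labels, shape (n_samples,)
    """
    rng = np.random.default_rng(seed)
    y = y_noisy.copy()
    # Dataset partitioning: fixed random split into disjoint index sets.
    parts = np.array_split(rng.permutation(len(X)), n_parts)
    for _ in range(n_rounds):
        for i, held_out in enumerate(parts):
            # Train on every partition except the held-out one.
            train_idx = np.concatenate(
                [p for j, p in enumerate(parts) if j != i])
            model = LogisticRegression(max_iter=1000)
            model.fit(X[train_idx], y[train_idx])
            # Predict labels and confidences for the held-out partition.
            proba = model.predict_proba(X[held_out])
            pred = model.classes_[proba.argmax(axis=1)]
            conf = proba.max(axis=1)
            # Confidence-based filtering: only overwrite labels where the
            # model is sufficiently certain.
            accept = conf >= conf_threshold
            y[held_out[accept]] = pred[accept]
    return y
```

In this sketch, the partitioning ensures that a sample's label is only ever rewritten by a model that did not train on that sample, which keeps the loop from simply confirming its own noisy labels; the confidence threshold controls how aggressively labels are replaced.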
