Paper Title

DoubleMatch: Improving Semi-Supervised Learning with Self-Supervision

Authors

Erik Wallin, Lennart Svensson, Fredrik Kahl, Lars Hammarstrand

Abstract

Following the success of supervised learning, semi-supervised learning (SSL) is now becoming increasingly popular. SSL is a family of methods that, in addition to a labeled training set, also use a sizable collection of unlabeled data for fitting a model. Most of the recent successful SSL methods are based on pseudo-labeling approaches: letting confident model predictions act as training labels. While these methods have shown impressive results on many benchmark datasets, a drawback of this approach is that not all unlabeled data are used during training. We propose a new SSL algorithm, DoubleMatch, which combines the pseudo-labeling technique with a self-supervised loss, enabling the model to utilize all unlabeled data in the training process. We show that this method achieves state-of-the-art accuracies on multiple benchmark datasets while also reducing training times compared to existing SSL methods. Code is available at https://github.com/walline/doublematch.
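The abstract describes combining a confidence-masked pseudo-label loss (which uses only confidently predicted unlabeled samples) with a self-supervised loss applied to all unlabeled samples. Below is a minimal numpy sketch of such a combined unlabeled objective, assuming a FixMatch-style setup with weakly and strongly augmented views; the function and parameter names (`tau`, `w_self`, etc.) are illustrative and not the paper's exact notation.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def doublematch_unlabeled_loss(logits_weak, logits_strong,
                               feat_weak, feat_strong,
                               tau=0.95, w_self=1.0):
    """Sketch of a combined unlabeled loss: confidence-masked
    pseudo-label cross-entropy plus a self-supervised feature
    similarity term over ALL unlabeled samples (hypothetical names)."""
    # Pseudo-labels come from the weakly augmented view.
    p_weak = softmax(logits_weak)
    pseudo = p_weak.argmax(axis=1)
    conf = p_weak.max(axis=1)
    mask = conf >= tau  # only confident predictions contribute

    # Cross-entropy of the strong view against the pseudo-labels.
    p_strong = softmax(logits_strong)
    ce = -np.log(p_strong[np.arange(len(pseudo)), pseudo] + 1e-12)
    pseudo_loss = (mask * ce).mean()

    # Self-supervised term: cosine-similarity mismatch between the
    # weak-view and strong-view features, applied to every sample,
    # confident or not -- so no unlabeled data is wasted.
    fw = feat_weak / np.linalg.norm(feat_weak, axis=1, keepdims=True)
    fs = feat_strong / np.linalg.norm(feat_strong, axis=1, keepdims=True)
    self_loss = (1.0 - (fw * fs).sum(axis=1)).mean()

    return pseudo_loss + w_self * self_loss
```

In this sketch, only samples above the confidence threshold `tau` feed the pseudo-label term, while the similarity term trains on the whole unlabeled batch, which is the intuition the abstract gives for utilizing all unlabeled data.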
