Paper Title
MUSCLE: Strengthening Semi-Supervised Learning Via Concurrent Unsupervised Learning Using Mutual Information Maximization
Paper Authors
Paper Abstract
Deep neural networks are powerful, massively parameterized machine learning models that have been shown to perform well in supervised learning tasks. However, very large amounts of labeled data are usually needed to train deep neural networks. Several semi-supervised learning approaches have been proposed to train neural networks using a small amount of labeled data together with a large amount of unlabeled data. The performance of these semi-supervised methods degrades significantly as the amount of labeled data decreases. We introduce Mutual-information-based Unsupervised & Semi-supervised Concurrent LEarning (MUSCLE), a hybrid learning approach that uses mutual information to combine unsupervised and semi-supervised learning. MUSCLE can be used as a stand-alone training scheme for neural networks, and can also be incorporated into other learning approaches. We show that the proposed hybrid model outperforms the state of the art on several standard benchmarks, including CIFAR-10, CIFAR-100, and Mini-Imagenet. Furthermore, the performance gain consistently increases with the reduction in the amount of labeled data, as well as in the presence of bias. We also show that MUSCLE has the potential to boost classification performance when used in the fine-tuning phase for a model pre-trained only on unlabeled data.
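The abstract does not spell out the exact objective. As a rough illustration of how a mutual-information term over unlabeled data can be trained concurrently with a supervised loss, the following is a minimal PyTorch-style sketch assuming an IIC-style mutual-information maximization between class predictions for two augmented views of the same unlabeled image. The function names, the weighting factor lambda_mi, and the view-pairing scheme are illustrative assumptions, not the paper's definition.

```python
# A minimal sketch (not the authors' code) of a combined objective in the spirit
# of MUSCLE: supervised cross-entropy on labeled data plus an IIC-style
# mutual-information term between soft class assignments for two augmented
# views of the same unlabeled images. All names here are illustrative.
import torch
import torch.nn.functional as F

def mutual_information_loss(p1, p2, eps=1e-8):
    """Negative mutual information between two soft class assignments.

    p1, p2: (batch, num_classes) softmax outputs for two views of the
    same unlabeled images.
    """
    # Empirical joint distribution over class pairs, symmetrized.
    joint = p1.t() @ p2 / p1.size(0)            # (C, C)
    joint = (joint + joint.t()) / 2
    joint = joint.clamp(min=eps)
    # Marginal distributions.
    pi = joint.sum(dim=1, keepdim=True)          # (C, 1)
    pj = joint.sum(dim=0, keepdim=True)          # (1, C)
    # I(Z1; Z2) = sum_ij P_ij (log P_ij - log P_i - log P_j); negated so that
    # minimizing the loss maximizes mutual information.
    mi = (joint * (joint.log() - pi.log() - pj.log())).sum()
    return -mi

def muscle_style_loss(model, x_lab, y_lab, x_unlab_v1, x_unlab_v2, lambda_mi=1.0):
    """Hypothetical combined objective: supervised CE + unsupervised MI term."""
    ce = F.cross_entropy(model(x_lab), y_lab)
    p1 = F.softmax(model(x_unlab_v1), dim=1)
    p2 = F.softmax(model(x_unlab_v2), dim=1)
    return ce + lambda_mi * mutual_information_loss(p1, p2)
```

Because the mutual-information term needs no labels, the same objective can also be used with lambda-weighted scheduling during fine-tuning of a model pre-trained on unlabeled data, which is the setting the abstract mentions last; the exact weighting and scheduling are assumptions here.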