Paper Title

Generalized Supervised Contrastive Learning

Authors

Jaewon Kim, Hyukjong Lee, Jooyoung Chang, Sang Min Park

Abstract


With the recent promising results of contrastive learning in the self-supervised learning paradigm, supervised contrastive learning has successfully extended these contrastive approaches to supervised contexts, outperforming cross-entropy on various datasets. However, supervised contrastive learning inherently employs label information in a binary form--either positive or negative--using a one-hot target vector. This structure struggles to adapt to methods that exploit label information as a probability distribution, such as CutMix and knowledge distillation. In this paper, we introduce a generalized supervised contrastive loss, which measures cross-entropy between label similarity and latent similarity. This concept enhances the capabilities of supervised contrastive loss by fully utilizing the label distribution and enabling the adaptation of various existing techniques for training modern neural networks. Leveraging this generalized supervised contrastive loss, we construct a tailored framework: the Generalized Supervised Contrastive Learning (GenSCL). Compared to existing contrastive learning frameworks, GenSCL incorporates additional enhancements, including advanced image-based regularization techniques and an arbitrary teacher classifier. When applied to ResNet50 with the Momentum Contrast technique, GenSCL achieves a top-1 accuracy of 77.3% on ImageNet, a 4.1% relative improvement over traditional supervised contrastive learning. Moreover, our method establishes new state-of-the-art accuracies of 98.2% and 87.0% on CIFAR10 and CIFAR100 respectively when applied to ResNet50, marking the highest reported figures for this architecture.
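The core idea of the abstract — a cross-entropy between a label-similarity distribution and a latent-similarity distribution — can be sketched as follows. This is a minimal NumPy illustration under our own assumptions, not the authors' implementation; the function names and the exact masking/normalization choices are ours. Because the targets are label distributions rather than one-hot indicators, soft labels (e.g. from CutMix mixing or a teacher classifier) drop in directly.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gen_supcon_loss(z, y, tau=0.1):
    """Sketch of a generalized supervised contrastive loss.

    z   : (N, D) L2-normalized embeddings
    y   : (N, C) label distributions (one-hot or soft, e.g. from
          CutMix or a teacher classifier)
    tau : temperature
    """
    n = z.shape[0]
    mask = ~np.eye(n, dtype=bool)              # exclude self-contrast

    # Latent-similarity distribution over the other samples.
    sim = z @ z.T / tau
    p_latent = softmax(np.where(mask, sim, -np.inf), axis=1)

    # Label-similarity distribution over the other samples.
    lab = np.where(mask, y @ y.T, 0.0)
    p_label = lab / np.clip(lab.sum(axis=1, keepdims=True), 1e-12, None)

    # Cross-entropy between the two distributions, averaged over anchors.
    return -(p_label * np.log(np.clip(p_latent, 1e-12, None))).sum(1).mean()
```

With one-hot `y`, `p_label` puts uniform mass on same-class samples and the loss reduces to the supervised contrastive objective; with soft `y`, mass is spread in proportion to label overlap.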
