Paper Title

Feature diversity in self-supervised learning

Paper Authors

Malviya, Pranshu; Sudhakar, Arjun Vaithilingam

Paper Abstract

Many studies on scaling laws consider basic factors such as model size, model shape, dataset size, and compute power. These factors are easily tunable and represent the fundamental elements of any machine learning setup. However, researchers have also employed more complex factors to estimate test error and generalization performance with high predictability; these factors are generally specific to a domain or application. For example, feature diversity was used primarily to promote syn-to-real transfer by Chen et al. (2021). With numerous scaling factors defined in previous work, it is worth investigating how these factors affect overall generalization performance in the context of self-supervised learning with CNN models. How do individual factors such as depth, width, or the number of training epochs with early stopping promote generalization? For example, does higher feature diversity still lead to higher accuracy in settings more complex than syn-to-real transfer? How do these factors depend on each other? We found that the last layer is the most diversified throughout training. However, while the model's test error decreases with increasing epochs, its diversity drops. We also found that diversity is directly related to model width.
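
The abstract studies how diverse a layer's learned features are and how that diversity varies with width, depth, and training epochs; the actual metric follows Chen et al. (2021) and is not reproduced here. The sketch below is a minimal illustration of one plausible proxy, assuming diversity is approximated by the mean pairwise cosine distance between a layer's channel activation maps. The toy CNN, the `channel_diversity` function, and the random input batch are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of a per-layer feature-diversity probe for a CNN.
# Assumption: diversity of a layer ~ mean pairwise cosine distance between
# its channel activation patterns on a batch of inputs (not the paper's metric).
import torch
import torch.nn as nn
import torch.nn.functional as F


def channel_diversity(features: torch.Tensor) -> float:
    """Mean pairwise cosine distance between channel activation maps.

    features: tensor of shape (batch, channels, H, W).
    """
    c = features.shape[1]
    # Flatten each channel into one vector over (batch, spatial) positions.
    vecs = features.permute(1, 0, 2, 3).reshape(c, -1)
    vecs = F.normalize(vecs, dim=1)
    sim = vecs @ vecs.t()                      # (c, c) cosine similarities
    off_diag = sim - torch.eye(c)              # zero out self-similarities
    mean_sim = off_diag.sum() / (c * (c - 1))  # average over distinct pairs
    return float(1.0 - mean_sim)


def make_cnn(width: int = 32) -> nn.Sequential:
    """Toy CNN whose width is easy to vary, mirroring the width sweeps in the abstract."""
    return nn.Sequential(
        nn.Conv2d(3, width, 3, padding=1), nn.ReLU(),
        nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
        nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
    )


if __name__ == "__main__":
    torch.manual_seed(0)
    model = make_cnn(width=32)
    x = torch.randn(8, 3, 32, 32)              # stand-in batch of images
    out = x
    for i, layer in enumerate(model):
        out = layer(out)
        if isinstance(layer, nn.ReLU):
            print(f"layer {i}: diversity ~ {channel_diversity(out):.3f}")
```

Repeating this probe across checkpoints, widths, and depths would mirror the comparisons described in the abstract (e.g., whether the last layer stays the most diverse and whether diversity grows with width), though the actual study relies on the diversity definition of Chen et al. (2021) rather than this proxy.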
