Paper Title
Measuring Overfitting in Convolutional Neural Networks using Adversarial Perturbations and Label Noise
Paper Authors
Paper Abstract
Although numerous methods to reduce overfitting in convolutional neural networks (CNNs) exist, it is still not clear how to confidently measure the degree of overfitting. A metric reflecting the overfitting level would, however, be extremely helpful for comparing different architectures and for evaluating the various techniques that tackle overfitting. Motivated by the fact that overfitted neural networks tend to memorize noise in the training data rather than generalize to unseen data, we examine how the training accuracy changes in the presence of increasing data perturbations and study the connection to overfitting. While previous work focused on label noise only, we examine a spectrum of techniques for injecting noise into the training data, including adversarial perturbations and input corruptions. Based on this, we define two new metrics that can confidently distinguish between correct and overfitted models. For the evaluation, we derive a pool of models whose overfitting behavior is known beforehand. To test the effect of various factors, we introduce several anti-overfitting measures into architectures based on VGG and ResNet and study their impact, including regularization techniques, training set size, and the number of parameters. Finally, we assess the applicability of the proposed metrics by measuring the overfitting degree of several CNN architectures outside of our model pool.
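The label-noise side of the noise-injection idea can be illustrated with a minimal sketch. The function name and interface below are hypothetical (the abstract does not specify the paper's exact procedure, nor its adversarial-perturbation or input-corruption variants); the sketch only shows how a controlled fraction of training labels might be reassigned before measuring how training accuracy degrades:

```python
import random


def inject_label_noise(labels, noise_rate, num_classes, seed=0):
    """Reassign a fraction `noise_rate` of labels to a different random class.

    Hypothetical illustration of label-noise injection; the paper's
    actual noise-injection procedure is not detailed in the abstract.
    """
    rng = random.Random(seed)
    noisy = list(labels)
    n_flip = int(round(noise_rate * len(labels)))
    for i in rng.sample(range(len(labels)), n_flip):
        # Pick a class different from the current label, uniformly at random.
        choices = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(choices)
    return noisy


# Sweep increasing noise levels, as one would when tracking how quickly
# a model's training accuracy adapts to (i.e. memorizes) the noise.
clean = [i % 10 for i in range(1000)]
for rate in (0.0, 0.1, 0.2, 0.4):
    noisy = inject_label_noise(clean, rate, num_classes=10)
    flipped = sum(a != b for a, b in zip(clean, noisy))
    print(f"noise_rate={rate:.1f}: {flipped} labels flipped")
```

A model that quickly reaches high training accuracy even at large `noise_rate` is memorizing rather than generalizing, which is the behavior the proposed metrics aim to quantify.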