Paper Title
The Pitfalls of Simplicity Bias in Neural Networks
Paper Authors
Paper Abstract
Several works have proposed Simplicity Bias (SB)---the tendency of standard training procedures such as Stochastic Gradient Descent (SGD) to find simple models---to justify why neural networks generalize well [Arpit et al. 2017, Nakkiran et al. 2019, Soudry et al. 2018]. However, the precise notion of simplicity remains vague. Furthermore, previous settings that use SB to theoretically justify why neural networks generalize well do not simultaneously capture the non-robustness of neural networks---a widely observed phenomenon in practice [Goodfellow et al. 2014, Jo and Bengio 2017]. We attempt to reconcile SB and the superior standard generalization of neural networks with the non-robustness observed in practice by designing datasets that (a) incorporate a precise notion of simplicity, (b) comprise multiple predictive features with varying levels of simplicity, and (c) capture the non-robustness of neural networks trained on real data. Through theory and empirics on these datasets, we make four observations: (i) SB of SGD and variants can be extreme: neural networks can exclusively rely on the simplest feature and remain invariant to all predictive complex features. (ii) The extreme aspect of SB could explain why seemingly benign distribution shifts and small adversarial perturbations significantly degrade model performance. (iii) Contrary to conventional wisdom, SB can also hurt generalization on the same data distribution, as SB persists even when the simplest feature has less predictive power than the more complex features. (iv) Common approaches to improve generalization and robustness---ensembles and adversarial training---can fail in mitigating SB and its pitfalls. Given the role of SB in training neural networks, we hope that the proposed datasets and methods serve as an effective testbed to evaluate novel algorithmic approaches aimed at avoiding the pitfalls of SB.
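To make the dataset design in (a)-(c) concrete, the following is a minimal sketch, assuming PyTorch, of one way to build a toy dataset that concatenates a 1-D linearly separable "simple" feature with a non-linear XOR-style "complex" feature, both fully predictive of the label, train a small MLP with SGD, and then probe which feature the network relies on by randomizing each feature at test time. The construction, function names (e.g. make_simple_plus_complex), and hyperparameters are illustrative assumptions and are not the specific datasets, architectures, or settings used in the paper.

# Hypothetical probe for extreme simplicity bias (not the paper's exact construction).
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_simple_plus_complex(n, noise=0.05):
    """Each input has a 1-D linearly separable 'simple' coordinate and a 2-D
    XOR-style 'complex' block; both fully determine the binary label."""
    y = torch.randint(0, 2, (n,))
    sign = y.float() * 2 - 1
    simple = sign.unsqueeze(1) + noise * torch.randn(n, 1)        # simple: linear in 1 coordinate
    s = torch.randint(0, 2, (n, 1)).float() * 2 - 1               # random auxiliary sign
    xor_block = torch.cat([s, s * sign.unsqueeze(1)], dim=1)      # label = sign of the product (XOR-like)
    xor_block = xor_block + noise * torch.randn(n, 2)
    return torch.cat([simple, xor_block], dim=1), y

x_tr, y_tr = make_simple_plus_complex(20000)
x_te, y_te = make_simple_plus_complex(5000)

model = nn.Sequential(nn.Linear(3, 100), nn.ReLU(), nn.Linear(100, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(50):
    perm = torch.randperm(len(x_tr))
    for i in range(0, len(x_tr), 256):
        idx = perm[i:i + 256]
        opt.zero_grad()
        loss_fn(model(x_tr[idx]), y_tr[idx]).backward()
        opt.step()

def accuracy(x, y):
    with torch.no_grad():
        return (model(x).argmax(1) == y).float().mean().item()

# Randomize one feature at a time at test time: a large accuracy drop when the simple
# coordinate is shuffled, but not when the complex block is, would indicate that the
# network relies (almost) exclusively on the simplest feature.
x_rand_simple = x_te.clone()
x_rand_simple[:, 0] = x_te[torch.randperm(len(x_te)), 0]
x_rand_complex = x_te.clone()
x_rand_complex[:, 1:] = x_te[torch.randperm(len(x_te)), 1:]
print("clean accuracy:", accuracy(x_te, y_te))
print("simple feature randomized:", accuracy(x_rand_simple, y_te))
print("complex feature randomized:", accuracy(x_rand_complex, y_te))

The same randomization probe can be reused under an assumed distribution shift or perturbation confined to the simple coordinate, which is one way to connect observation (i) to the degradation described in observation (ii).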