Paper Title
Overparameterized ReLU Neural Networks Learn the Simplest Models: Neural Isometry and Exact Recovery
Paper Authors
Paper Abstract
The practice of deep learning has shown that neural networks generalize remarkably well even with an extreme number of learned parameters. This appears to contradict traditional statistical wisdom, in which a trade-off between model complexity and fit to the data is essential. We aim to address this discrepancy by adopting a convex optimization and sparse recovery perspective. We consider the training and generalization properties of two-layer ReLU networks with standard weight decay regularization. Under certain regularity assumptions on the data, we show that ReLU networks with an arbitrary number of parameters learn only simple models that explain the data. This is analogous to the recovery of the sparsest linear model in compressed sensing. For ReLU networks and their variants with skip connections or normalization layers, we present isometry conditions that ensure the exact recovery of planted neurons. For randomly generated data, we show the existence of a phase transition in recovering planted neural network models, which is easy to describe: whenever the ratio between the number of samples and the dimension exceeds a numerical threshold, the recovery succeeds with high probability; otherwise, it fails with high probability. Surprisingly, ReLU networks learn simple and sparse models that generalize well even when the labels are noisy. The phase transition phenomenon is confirmed through numerical experiments.
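To make the planted-model recovery experiment described above concrete, the following is a minimal sketch (not the paper's code; it assumes PyTorch, and the network width, step count, learning rate, and weight-decay strength are illustrative choices). It generates labels from a single planted ReLU neuron, trains an overparameterized two-layer ReLU network with weight decay, and reports the relative test error; sweeping the ratio n/d and recording the fraction of successful trials reproduces the kind of phase transition the abstract refers to.

```python
# Illustrative sketch of the planted single-neuron recovery experiment.
# Assumptions: Gaussian data, one planted ReLU neuron, training with Adam
# and weight decay as a stand-in for weight-decay-regularized training.
import numpy as np
import torch
import torch.nn as nn

def recovery_trial(n, d, width=100, beta=1e-4, steps=5000, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d)) / np.sqrt(d)
    w_star = rng.standard_normal(d)
    w_star /= np.linalg.norm(w_star)
    y = np.maximum(X @ w_star, 0.0)          # labels from one planted ReLU neuron

    Xt = torch.tensor(X, dtype=torch.float32)
    yt = torch.tensor(y, dtype=torch.float32).unsqueeze(1)

    # Overparameterized two-layer ReLU network (width >> 1 hidden neurons).
    model = nn.Sequential(nn.Linear(d, width, bias=False),
                          nn.ReLU(),
                          nn.Linear(width, 1, bias=False))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=beta)
    for _ in range(steps):
        opt.zero_grad()
        loss = 0.5 * ((model(Xt) - yt) ** 2).sum()
        loss.backward()
        opt.step()

    # Evaluate on fresh samples from the same planted model.
    X_test = rng.standard_normal((1000, d)) / np.sqrt(d)
    y_test = np.maximum(X_test @ w_star, 0.0)
    with torch.no_grad():
        pred = model(torch.tensor(X_test, dtype=torch.float32)).squeeze(1).numpy()
    return np.mean((pred - y_test) ** 2) / np.mean(y_test ** 2)

if __name__ == "__main__":
    # Small relative test error indicates the planted model was recovered;
    # repeating over seeds and ratios n/d traces out the phase transition.
    for ratio in [1, 2, 4, 8]:
        err = recovery_trial(n=ratio * 20, d=20)
        print(f"n/d = {ratio}: relative test error = {err:.3e}")
```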