Paper Title
Optimizing Loss Functions Through Multivariate Taylor Polynomial Parameterization
Paper Authors
Paper Abstract
Metalearning of deep neural network (DNN) architectures and hyperparameters has become an increasingly important area of research. Loss functions are a type of metaknowledge that is crucial to the effective training of DNNs; however, their potential role in metalearning has not yet been fully explored. Whereas early work focused on genetic programming (GP) over tree representations, this paper proposes continuous CMA-ES optimization of multivariate Taylor polynomial parameterizations. This approach, TaylorGLO, makes it possible to represent and search useful loss functions more effectively. On the MNIST, CIFAR-10, and SVHN benchmark tasks, TaylorGLO finds new loss functions that outperform functions previously discovered through GP, as well as the standard cross-entropy loss, in fewer generations. These functions serve to regularize the learning task by discouraging overfitting to the labels, which is particularly useful in tasks where limited training data is available. The results thus demonstrate that loss function optimization is a productive new avenue for metalearning.
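To make the idea concrete, the sketch below shows one plausible way to parameterize a loss function as a third-order bivariate Taylor polynomial in the target and the prediction, with the expansion center and the coefficients forming a flat parameter vector that an evolutionary optimizer such as CMA-ES could search over. The exact encoding (degree, parameter layout, per-class handling) used by TaylorGLO is not specified in the abstract, so this layout is an illustrative assumption only.

```python
import numpy as np

def taylor_loss(y_true, y_pred, theta):
    """Loss defined by a third-order bivariate Taylor polynomial.

    theta packs an expansion center (c0, c1) followed by nine
    coefficients, one per monomial of total degree 1..3. This
    particular layout is a hypothetical example, not the exact
    TaylorGLO encoding.
    """
    c0, c1 = theta[0], theta[1]   # expansion center
    a = theta[2:]                 # polynomial coefficients
    u = y_true - c0               # deviation of the target from the center
    v = y_pred - c1               # deviation of the prediction
    # All monomials of (u, v) with total degree 1 through 3.
    terms = [u, v, u * u, u * v, v * v,
             u ** 3, u * u * v, u * v * v, v ** 3]
    per_element = sum(ai * t for ai, t in zip(a, terms))
    # Average over classes to get a scalar loss value.
    return float(np.mean(per_element))

# Example: evaluate one candidate parameter vector on a one-hot target.
rng = np.random.default_rng(0)
theta = rng.standard_normal(11)   # 2 center values + 9 coefficients
y_true = np.array([1.0, 0.0, 0.0])
y_pred = np.array([0.7, 0.2, 0.1])
loss = taylor_loss(y_true, y_pred, theta)
```

Because the loss is a fixed-length real vector, a continuous black-box optimizer can propose candidate vectors, train a network briefly with each candidate loss, and use validation accuracy as the fitness signal, which is the outer loop the abstract attributes to CMA-ES.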