Paper Title

Second-Order Guarantees in Centralized, Federated and Decentralized Nonconvex Optimization

Paper Authors

Stefan Vlaski, Ali H. Sayed

Paper Abstract

Rapid advances in data collection and processing capabilities have allowed for the use of increasingly complex models that give rise to nonconvex optimization problems. These formulations, however, can be arbitrarily difficult to solve in general, in the sense that even simply verifying that a given point is a local minimum can be NP-hard [1]. Still, some relatively simple algorithms have been shown to lead to surprisingly good empirical results in many contexts of interest. Perhaps the most prominent example is the success of the backpropagation algorithm for training neural networks. Several recent works have pursued rigorous analytical justification for this phenomenon by studying the structure of the nonconvex optimization problems and establishing that simple algorithms, such as gradient descent and its variations, perform well in converging towards local minima and avoiding saddle-points. A key insight in these analyses is that gradient perturbations play a critical role in allowing local descent algorithms to efficiently distinguish desirable from undesirable stationary points and escape from the latter. In this article, we cover recent results on second-order guarantees for stochastic first-order optimization algorithms in centralized, federated, and decentralized architectures.
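The escape mechanism highlighted in the abstract can be illustrated with a small numerical experiment. The following is a minimal sketch, not taken from the paper: the test function, step size, noise level, and function names are illustrative assumptions. It shows how small gradient perturbations allow gradient descent to move away from a strict saddle point, whereas the unperturbed iterates remain stuck there.

```python
# Minimal sketch (illustrative, not from the article): plain vs. perturbed gradient
# descent on f(x, y) = x^4/4 - x^2/2 + y^2/2, which has a strict saddle at the
# origin and local minima at (±1, 0).
import numpy as np

def grad(w):
    """Gradient of f(x, y) = x^4/4 - x^2/2 + y^2/2."""
    x, y = w
    return np.array([x**3 - x, y])

def descend(w0, noise_std, steps=500, lr=0.1, seed=0):
    """Run gradient descent from w0, optionally adding Gaussian gradient noise."""
    rng = np.random.default_rng(seed)
    w = np.array(w0, dtype=float)
    for _ in range(steps):
        g = grad(w)
        if noise_std > 0:
            g = g + noise_std * rng.standard_normal(2)  # gradient perturbation
        w = w - lr * g
    return w

w_saddle = [0.0, 0.0]  # start exactly at the strict saddle point
print("plain GD:     ", descend(w_saddle, noise_std=0.0))   # stays at the saddle
print("perturbed GD: ", descend(w_saddle, noise_std=0.01))  # drifts toward (±1, 0)
```

With no noise, the gradient at the origin is exactly zero and the iterates never move; with a small perturbation, the negative curvature direction is eventually excited and the iterates settle near one of the local minima, which is the intuition behind the second-order guarantees surveyed in the article.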
