Paper Title

On Distributed Adaptive Optimization with Gradient Compression

Paper Authors

Xiaoyun Li, Belhal Karimi, Ping Li

Paper Abstract

We study COMP-AMS, a distributed optimization framework based on gradient averaging and the adaptive AMSGrad algorithm. Gradient compression with error feedback is applied to reduce the communication cost in the gradient transmission process. Our convergence analysis of COMP-AMS shows that such a compressed gradient averaging strategy yields the same convergence rate as standard AMSGrad, and also exhibits a linear speedup effect w.r.t. the number of local workers. Compared with recently proposed protocols on distributed adaptive methods, COMP-AMS is simple and convenient. Numerical experiments are conducted to justify the theoretical findings, and demonstrate that the proposed method can achieve the same test accuracy as the full-gradient AMSGrad with substantial communication savings. With its simplicity and efficiency, COMP-AMS can serve as a useful distributed training framework for adaptive gradient methods.
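
To make the structure described in the abstract concrete, below is a minimal NumPy sketch of one synchronous round: each worker compresses its error-corrected gradient, the server averages the compressed gradients, and a standard AMSGrad step is applied. Top-k compression is assumed here as the compressor for illustration; the function and parameter names (`topk_compress`, `comp_ams_step`, `k`, etc.) are hypothetical and not the paper's exact implementation.

```python
import numpy as np

def topk_compress(v, k):
    """Keep the k largest-magnitude entries of v, zero the rest.
    (Top-k is one common compressor; assumed here for illustration.)"""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def comp_ams_step(workers_grads, errors, state, lr=1e-3, k=10,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """One round of compressed gradient averaging + an AMSGrad update.

    workers_grads: list of local stochastic gradients g_i (1-D arrays)
    errors: per-worker error-feedback accumulators e_i
    state: dict with AMSGrad moments 'm', 'v', 'v_hat' and parameters 'x'
    """
    n = len(workers_grads)
    compressed = []
    for i, g in enumerate(workers_grads):
        # Error feedback: compress the gradient plus the residual error,
        # then store what the compressor dropped for the next round.
        corrected = g + errors[i]
        c = topk_compress(corrected, k)
        errors[i] = corrected - c
        compressed.append(c)

    # Server averages the compressed gradients from all workers.
    g_avg = sum(compressed) / n

    # Standard AMSGrad update on the averaged gradient.
    state["m"] = beta1 * state["m"] + (1 - beta1) * g_avg
    state["v"] = beta2 * state["v"] + (1 - beta2) * g_avg ** 2
    state["v_hat"] = np.maximum(state["v_hat"], state["v"])
    state["x"] -= lr * state["m"] / (np.sqrt(state["v_hat"]) + eps)
    return state, errors
```

The error-feedback accumulator keeps whatever the compressor discards and reinjects it in the next round, so gradient information is delayed rather than permanently lost; this is the standard error-feedback mechanism the abstract refers to.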
