Paper Title

Gradient Coding with Dynamic Clustering for Straggler Mitigation

Paper Authors

Baturalp Buyukates, Emre Ozfatura, Sennur Ulukus, Deniz Gunduz

Abstract

In distributed synchronous gradient descent (GD), the main performance bottleneck for the per-iteration completion time is caused by the slowest, or straggling, workers. To speed up GD iterations in the presence of stragglers, coded distributed computation techniques are implemented by assigning redundant computations to workers. In this paper, we propose a novel gradient coding (GC) scheme that utilizes dynamic clustering, denoted by GC-DC, to speed up the gradient calculation. Under time-correlated straggling behavior, GC-DC aims at regulating the number of straggling workers in each cluster based on the straggler behavior in the previous iteration. We numerically show that GC-DC provides significant improvements in the average completion time (of each iteration) with no increase in the communication load compared to the original GC scheme.
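
The idea of regulating the number of stragglers per cluster can be illustrated with a small sketch. This is not the authors' GC-DC algorithm: the function name `balance_stragglers` and its inputs are hypothetical, and the sketch assumes any worker may join any cluster, ignoring whatever data-placement constraints the actual scheme has to respect. It only shows how the previous iteration's straggler observations could be used to spread likely stragglers evenly over equal-size clusters.

```python
from typing import List


def balance_stragglers(was_straggler: List[bool], num_clusters: int) -> List[List[int]]:
    """Spread last iteration's stragglers evenly over equal-size clusters.

    Hypothetical helper for illustration only: it assumes every worker can
    be placed in every cluster, which a real scheme may not allow.
    """
    n = len(was_straggler)
    if num_clusters <= 0 or n % num_clusters != 0:
        raise ValueError("worker count must be a positive multiple of num_clusters")

    # Put the workers that straggled last iteration first. Python's sort is
    # stable, so workers keep their index order within each group.
    order = sorted(range(n), key=lambda w: not was_straggler[w])

    # Deal workers out round-robin. Because stragglers come first, the
    # straggler counts of any two clusters differ by at most one.
    clusters: List[List[int]] = [[] for _ in range(num_clusters)]
    for position, worker in enumerate(order):
        clusters[position % num_clusters].append(worker)
    return clusters


if __name__ == "__main__":
    previous = [True, True, False, False, True, False]
    print(balance_stragglers(previous, num_clusters=3))
```

With six workers, three of which straggled in the previous iteration, each of the three clusters ends up with exactly one of them. Under time-correlated straggling, those workers are the ones most likely to be slow again, so no single cluster is expected to hold a disproportionate share of slow workers.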
