Paper Title

Data-heterogeneity-aware Mixing for Decentralized Learning

Paper Authors

Yatin Dandi, Anastasia Koloskova, Martin Jaggi, Sebastian U. Stich

Paper Abstract

Decentralized learning provides an effective framework to train machine learning models with data distributed over arbitrary communication graphs. However, most existing approaches to decentralized learning disregard the interaction between data heterogeneity and graph topology. In this paper, we characterize how convergence depends on the relationship between the mixing weights of the graph and the data heterogeneity across nodes. We propose a metric that quantifies the ability of a graph to mix the current gradients. We further prove that this metric controls the convergence rate, particularly in settings where the heterogeneity across nodes dominates the stochasticity between updates for a given node. Motivated by our analysis, we propose an approach that periodically and efficiently optimizes the metric using standard convex constrained optimization and sketching techniques. Through comprehensive experiments on standard computer vision and NLP benchmarks, we show that our approach improves test performance across a wide range of tasks.
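
The abstract only gestures at the method, so below is a minimal, self-contained sketch of the kind of pipeline it describes: a candidate mixing-error metric on the current gradients, a Gaussian sketch to compress the gradients, and a convex program (via cvxpy) that re-optimizes the gossip weights over the fixed graph. Every function name, the exact objective, the doubly-stochastic constraint set, and the sketch dimension are illustrative assumptions; the paper's actual metric and solver setup may differ.

import numpy as np
import cvxpy as cp

def gradient_mixing_error(W, G):
    # One plausible form of the metric: squared distance between the
    # locally mixed gradients W @ G and the global average gradient.
    # (An assumption, not the paper's exact definition.)
    n = G.shape[0]
    g_bar = G.mean(axis=0, keepdims=True)
    return float(np.sum((W @ G - np.ones((n, 1)) @ g_bar) ** 2))

def sketch_gradients(G, k, rng):
    # Gaussian (Johnson-Lindenstrauss) sketch: project d-dimensional
    # gradients down to k dimensions so the optimization stays cheap.
    d = G.shape[1]
    S = rng.standard_normal((d, k)) / np.sqrt(k)
    return G @ S

def optimize_mixing_weights(G, adjacency, sketch_dim=32, seed=0):
    # Re-optimize the mixing matrix over the fixed communication graph:
    # minimize the mixing error of the sketched current gradients subject
    # to W being symmetric, nonnegative, doubly stochastic, and supported
    # on the graph's edges. This constraint set is an illustrative guess
    # at the "standard convex constrained optimization" the abstract cites.
    n = G.shape[0]
    Gs = sketch_gradients(G, sketch_dim, np.random.default_rng(seed))
    g_bar = Gs.mean(axis=0, keepdims=True)
    W = cp.Variable((n, n), symmetric=True)
    objective = cp.Minimize(cp.sum_squares(W @ Gs - np.ones((n, 1)) @ g_bar))
    constraints = [W >= 0, W @ np.ones(n) == np.ones(n)]
    # Respect the topology: weights only on existing edges and self-loops.
    constraints += [W[i, j] == 0
                    for i in range(n) for j in range(i + 1, n)
                    if not adjacency[i, j]]
    cp.Problem(objective, constraints).solve()
    return W.value

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, d = 8, 500
    # Ring topology with self-loops.
    adjacency = np.eye(n, dtype=bool)
    for i in range(n):
        j = (i + 1) % n
        adjacency[i, j] = adjacency[j, i] = True
    # Heterogeneous data: each node's gradients share a node-specific offset.
    G = rng.standard_normal((n, d)) + 5.0 * rng.standard_normal((n, 1))
    W_uniform = adjacency / adjacency.sum(axis=1, keepdims=True)
    W_opt = optimize_mixing_weights(G, adjacency)
    print("uniform gossip mixing error:  ", gradient_mixing_error(W_uniform, G))
    print("optimized gossip mixing error:", gradient_mixing_error(W_opt, G))

In a training loop, one would call optimize_mixing_weights every so often (the "periodic" part of the abstract) and gossip with the returned weights until the next re-optimization; how often to re-solve and how aggressively to sketch are tunables the abstract leaves open.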
