Paper Title
Asymptotic Network Independence and Step-Size for A Distributed Subgradient Method
Paper Authors
Paper Abstract
We consider whether distributed subgradient methods can achieve a linear speedup over a centralized subgradient method. While one might hope that a distributed network of $n$ nodes, which can compute $n$ times as many subgradients in parallel as a single node, would as a result be $n$ times faster, existing bounds for distributed optimization methods are often consistent with a slowdown rather than a speedup compared to a single node. We show that a distributed subgradient method has this "linear speedup" property when using a class of square-summable-but-not-summable step-sizes, which includes $1/t^{\beta}$ for $\beta \in (1/2,1)$; for such step-sizes, we show that, after a transient period whose length depends on the spectral gap of the network, the method achieves a performance guarantee that does not depend on the network or the number of nodes. We also show that the same method can fail to have this "asymptotic network independence" property under the optimally decaying step-size $1/\sqrt{t}$ and, as a consequence, can fail to provide a linear speedup over a single node using the $1/\sqrt{t}$ step-size.
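To make the iteration behind the abstract concrete, below is a minimal NumPy sketch of the standard distributed subgradient update $x_i(t+1) = \sum_j W_{ij} x_j(t) - \alpha(t)\, g_i(t)$ with step-size $\alpha(t) = 1/t^{\beta}$, the square-summable-but-not-summable schedule the abstract refers to when $\beta \in (1/2,1)$. The function name `distributed_subgradient`, the doubly stochastic mixing matrix `W`, and the ring-network example with $\ell_1$ losses are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def distributed_subgradient(W, subgradients, x0, T, beta=0.75):
    """Standard distributed subgradient iteration (a sketch):
        x_i(t+1) = sum_j W_ij x_j(t) - alpha(t) * g_i(t),
    with alpha(t) = 1 / t**beta, square-summable-but-not-summable
    for beta in (1/2, 1).

    W            : (n, n) doubly stochastic mixing matrix of the network
    subgradients : list of n callables; subgradients[i](x) returns a
                   subgradient of the local objective f_i at x
    x0           : (n, d) initial iterates, one row per node
    T            : number of iterations
    """
    x = x0.copy()
    n = x.shape[0]
    for t in range(1, T + 1):
        alpha = 1.0 / t**beta                      # step-size 1/t^beta
        g = np.array([subgradients[i](x[i]) for i in range(n)])
        x = W @ x - alpha * g                      # consensus + subgradient step
    return x

# Illustrative example (an assumption, not from the paper): n nodes on a
# ring network minimizing sum_i |x - a_i|, whose optimum is the median.
if __name__ == "__main__":
    n, d, T = 10, 1, 5000
    rng = np.random.default_rng(0)
    a = rng.normal(size=(n, d))
    # Lazy random-walk weights on a ring: doubly stochastic by construction.
    W = np.zeros((n, n))
    for i in range(n):
        W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.25
        W[i, i] = 0.5
    # sign(x - a_i) is a subgradient of |x - a_i|.
    subgrads = [lambda x, ai=a[i]: np.sign(x - ai) for i in range(n)]
    x = distributed_subgradient(W, subgrads, np.zeros((n, d)), T, beta=0.75)
    print("node average:", x.mean(axis=0), " optimum (median):", np.median(a, axis=0))
```

With `beta=0.5` the same loop would run the $1/\sqrt{t}$ schedule under which, per the abstract, asymptotic network independence can fail; the spectral gap of `W` governs how long the transient period lasts before the network-independent guarantee takes hold for $\beta \in (1/2,1)$.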