Paper Title
FedAT: A High-Performance and Communication-Efficient Federated Learning System with Asynchronous Tiers
Paper Authors
Paper Abstract
Federated learning (FL) involves training a model over massive distributed devices while keeping the training data localized. This form of collaborative learning exposes new tradeoffs among model convergence speed, model accuracy, balance across clients, and communication cost, with new challenges including: (1) the straggler problem, where clients lag due to heterogeneity in data or in computing and network resources, and (2) the communication bottleneck, where a large number of clients communicate their local updates to a central server and saturate it. Many existing FL methods focus on optimizing along only one dimension of this tradeoff space. Existing solutions use asynchronous model updating or tiering-based synchronous mechanisms to tackle the straggler problem. However, asynchronous methods can easily create a network communication bottleneck, while tiering may introduce bias, since tiering favors faster tiers with shorter response latencies. To address these issues, we present FedAT, a novel Federated learning method with Asynchronous Tiers under Non-i.i.d. data. FedAT synergistically combines synchronous intra-tier training and asynchronous cross-tier training. By bridging synchronous and asynchronous training through tiering, FedAT minimizes the straggler effect while improving convergence speed and test accuracy. FedAT uses a straggler-aware, weighted aggregation heuristic to steer and balance the training for further accuracy improvement. FedAT compresses both uplink and downlink communications using an efficient, polyline-encoding-based compression algorithm, thereby minimizing the communication cost. Results show that FedAT improves prediction performance by up to 21.09% and reduces communication cost by up to 8.5x, compared with state-of-the-art FL methods.
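Two of the mechanisms named in the abstract can be made concrete with short sketches. First, the straggler-aware, weighted cross-tier aggregation: the idea is that slower tiers push updates to the server less often, so giving them a larger weight in the global average counters the bias toward fast tiers. Below is a minimal sketch of that idea; the mirrored-count weighting and the names `aggregate_cross_tier` and `tier_update_counts` are assumptions made for illustration, not a verbatim transcription of the paper's heuristic.

```python
import numpy as np

def aggregate_cross_tier(tier_models, tier_update_counts):
    """Weighted cross-tier aggregation that favors slower tiers.

    tier_models: list of 1-D np.ndarray, the latest model from each tier.
    tier_update_counts: list of int, how many updates each tier has pushed.

    Assumption for this sketch: tier k receives the update count of its
    "mirror" tier in the frequency ranking, so the least-frequently
    updating (slowest) tier gets the largest weight.
    """
    counts = np.asarray(tier_update_counts, dtype=float)
    if counts.sum() == 0:                  # before any tier reports, average uniformly
        counts[:] = 1.0
    order = np.argsort(counts)             # tiers from fewest to most updates
    weights = np.empty_like(counts)
    weights[order] = counts[order[::-1]]   # mirror: fewest-updated tier gets largest count
    weights /= weights.sum()
    return sum(w * m for w, m in zip(weights, tier_models))
```

Second, the abstract describes the uplink/downlink compression as polyline-encoding based. The sketch below follows the classic Google polyline scheme (quantize, delta-encode, zigzag-encode the sign, then pack 5-bit chunks into printable ASCII), applied here to a flat weight vector; the 1e5 quantization factor is an illustrative choice, so the compression is lossy up to that precision.

```python
def polyline_encode(values, precision=1e5):
    """Polyline-style lossy compression of a float vector into ASCII."""
    out, prev = [], 0
    for v in values:
        q = int(round(v * precision))      # quantize to fixed precision
        d, prev = q - prev, q              # delta-encode consecutive values
        d = d << 1 if d >= 0 else ~(d << 1)  # zigzag: fold sign into low bit
        while d >= 0x20:                   # emit 5-bit chunks, continuation bit set
            out.append(chr((0x20 | (d & 0x1F)) + 63))
            d >>= 5
        out.append(chr(d + 63))            # final chunk, continuation bit clear
    return ''.join(out)

def polyline_decode(encoded, precision=1e5):
    """Inverse of polyline_encode."""
    values, idx, prev = [], 0, 0
    while idx < len(encoded):
        shift = result = 0
        while True:                        # reassemble 5-bit chunks
            b = ord(encoded[idx]) - 63
            idx += 1
            result |= (b & 0x1F) << shift
            shift += 5
            if b < 0x20:
                break
        prev += ~(result >> 1) if result & 1 else result >> 1  # undo zigzag + delta
        values.append(prev / precision)
    return values

# Round trip is exact up to the quantization precision:
w = [0.12345, -0.98765, 0.5]
dec = polyline_decode(polyline_encode(w))
assert all(abs(a - b) < 1e-5 for a, b in zip(w, dec))
```

Delta-plus-variable-length encoding is effective here because consecutive quantized weights tend to be close in magnitude, so most deltas fit in one or two 5-bit chunks rather than a full 32-bit float.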