论文标题
改进了使用中间值的线性依赖性编码分布式计算的计算通信权衡取舍
Improved Computation-Communication Trade-Off for Coded Distributed Computing using Linear Dependence of Intermediate Values
论文作者
论文摘要
在大规模分布的计算系统中,通信开销是主要的瓶颈之一。在Map-Shuff-Reduce框架(这是主要的分布式计算框架之一)中,可以通过增加每个服务器的计算负载来减少服务器之间的通信负载,也就是说,计算负载和通信负载之间存在权衡。最近,已经显示,编码的分布式计算(CDC)通过让服务器编码其中间计算结果来改善这种权衡关系。原始的CDC方案在服务器计算的功能上不假定任何特殊结构。但是,在实际问题中,这些功能通常具有某些结构,并且使用该结构可以进一步改善权衡关系。在本文中,我们提出了一种新方案,该方案通过利用中间计算结果的线性依赖性结构进一步改善了权衡关系。在MAP阶段计算的中间值可以被视为$ \ Mathbb {F} _ {2} $上的向量。在某些应用中,这些中间值具有线性依赖性,在这种情况下,每个服务器都足以发送线性子空间和线性组合系数的基础。结果,在某些应用中,提出的方法改善了最著名的计算通信间接费用。
In large scale distributed computing systems, communication overhead is one of the major bottlenecks. In the map-shuffle-reduce framework, which is one of the major distributed computing frameworks, the communication load among servers can be reduced by increasing the computation load of each server, that is, there is a trade-off between computation load and communication load. Recently, it has been shown that coded distributed computing (CDC) improves this trade-off relationship by letting servers encode their intermediate computation results. The original CDC scheme does not assume any special structures on the functions that servers compute. However, in actual problems, these functions often have some structures, and the trade-off relation may be further improved by using that structures. In this paper, we propose a new scheme that further improves the trade-off relationship by utilizing the linear dependency structure of the intermediate computation results. The intermediate values computed in the map phase can be considered as vectors on $\mathbb{F}_{2}$. In some applications, these intermediate values have a linear dependency and in such cases, it is sufficient for each server to send a basis of the linear subspace and linear combination coefficients. As a result, the proposed approach improves over the best-known computation-communication overhead trade-off in some applications.