Paper title
HetVis: A Visual Analysis Approach for Identifying Data Heterogeneity in Horizontal Federated Learning
Authors
Abstract
Horizontal federated learning (HFL) enables distributed clients to train a shared model while keeping their data private. Data heterogeneity among clients is one of the major concerns in training high-quality HFL models. However, security constraints and the complexity of deep learning models make it challenging to investigate data heterogeneity across clients. To address this issue, we conducted a requirement analysis and developed HetVis, a visual analytics tool that lets participating clients explore data heterogeneity. We identify data heterogeneity by comparing the prediction behaviors of the global federated model and a stand-alone model trained on local data. A context-aware clustering of the inconsistent records then provides a summary of the data heterogeneity. Combined with the proposed comparison techniques, we develop a novel set of visualizations to identify heterogeneity issues in HFL. We designed three case studies to show how HetVis can assist client analysts in understanding different types of heterogeneity issues. Expert reviews and a comparative study demonstrate the effectiveness of HetVis.
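The comparison step described in the abstract can be illustrated with a minimal sketch. It assumes that each model's predictions on the local records are available as plain lists of class labels. The function names `inconsistent_records` and `summarize_disagreements` are hypothetical, and the toy predictions are made up. Grouping by the pair (global label, local label) is a simple stand-in for a summary; it is not the context-aware clustering used in HetVis, which the abstract does not specify.

```python
from collections import Counter


def inconsistent_records(global_preds, local_preds):
    """Return the indices of records on which the two models disagree."""
    if len(global_preds) != len(local_preds):
        raise ValueError("prediction lists must have the same length")
    return [
        i
        for i, (g, l) in enumerate(zip(global_preds, local_preds))
        if g != l
    ]


def summarize_disagreements(global_preds, local_preds):
    """Count disagreements per (global label, local label) pair.

    Stand-in summary only: the paper's context-aware clustering is not
    reproduced here.
    """
    indices = inconsistent_records(global_preds, local_preds)
    return Counter((global_preds[i], local_preds[i]) for i in indices)


# Toy predictions (made up for illustration) on six local records.
global_preds = [0, 1, 1, 2, 0, 2]  # global federated model
local_preds = [0, 1, 2, 2, 1, 1]   # stand-alone model trained on local data

indices = inconsistent_records(global_preds, local_preds)
summary = summarize_disagreements(global_preds, local_preds)
print(indices)
print(dict(summary))
```

In this sketch, the records on which the two models disagree are the candidates an analyst would inspect for heterogeneity; a real pipeline would replace the label-pair grouping with a clustering over record features and model outputs.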