论文标题
与隐私的近乎最佳相关性聚类
Near-Optimal Correlation Clustering with Privacy
论文作者
论文摘要
相关聚类是无监督学习的核心问题,其应用程序涵盖了社区检测,重复检测,自动标签等等。在相关聚类问题中,一个人接收到输入一组节点,对于每个节点,一个共聚类偏好列表,目标是输出聚类,以最大程度地减少与指定节点的偏好的分歧。在本文中,我们引入了一种简单且在计算上有效的算法,用于与可证明的隐私保证的相关聚类问题。我们的近似保证金比先前的工作中显示的要强,并且是对数因素的最佳选择。
Correlation clustering is a central problem in unsupervised learning, with applications spanning community detection, duplicate detection, automated labelling and many more. In the correlation clustering problem one receives as input a set of nodes and for each node a list of co-clustering preferences, and the goal is to output a clustering that minimizes the disagreement with the specified nodes' preferences. In this paper, we introduce a simple and computationally efficient algorithm for the correlation clustering problem with provable privacy guarantees. Our approximation guarantees are stronger than those shown in prior work and are optimal up to logarithmic factors.