论文标题

iflipper:标签为个人公平而翻转

iFlipper: Label Flipping for Individual Fairness

论文作者

Zhang, Hantian, Tae, Ki Hyun, Park, Jaeyoung, Chu, Xu, Whang, Steven Euijong

论文摘要

随着机器学习的普遍性,减轻培训数据中存在的任何不公平性变得至关重要。在公平的各种概念中,本文的重点是众所周知的个人公平,该公平规定应该对类似的人进行类似的对待。虽然在训练模型(进行过程中)时可以提高个人公平性,但我们认为,在模型培训(预处理)之前修复数据是一个更基本的解决方案。特别是,我们表明标签翻转是一种有效的预处理技术,可改善个人公平性。我们的系统IFLIPPER解决了限制了个人公平性违规行为的最小翻转标签的优化问题,当培训数据中的两个类似示例具有不同的标签时,发生违规情况。我们首先证明问题是NP-HARD。然后,我们提出了一种近似的线性编程算法,并提供理论保证其结果与标签翻转的数量有关的结果与最佳解决方案有多近。我们还提出了使线性编程解决方案更加最佳的技术,而不会超过违规限制。实际数据集上的实验表明,Iflipper在看不见的测试集上的个人公平和准确性方面明显优于其他预处理基线。此外,IFLIPPER可以与处理中的技术结合使用,以获得更好的结果。

As machine learning becomes prevalent, mitigating any unfairness present in the training data becomes critical. Among the various notions of fairness, this paper focuses on the well-known individual fairness, which states that similar individuals should be treated similarly. While individual fairness can be improved when training a model (in-processing), we contend that fixing the data before model training (pre-processing) is a more fundamental solution. In particular, we show that label flipping is an effective pre-processing technique for improving individual fairness. Our system iFlipper solves the optimization problem of minimally flipping labels given a limit to the individual fairness violations, where a violation occurs when two similar examples in the training data have different labels. We first prove that the problem is NP-hard. We then propose an approximate linear programming algorithm and provide theoretical guarantees on how close its result is to the optimal solution in terms of the number of label flips. We also propose techniques for making the linear programming solution more optimal without exceeding the violations limit. Experiments on real datasets show that iFlipper significantly outperforms other pre-processing baselines in terms of individual fairness and accuracy on unseen test sets. In addition, iFlipper can be combined with in-processing techniques for even better results.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源