Paper Title
Making Fair ML Software using Trustworthy Explanation
Paper Authors
Paper Abstract
Machine learning software is used in many applications (finance, hiring, admissions, criminal justice) with huge social impact. But sometimes the behavior of this software is biased, and it discriminates based on sensitive attributes such as sex and race. Prior work concentrated on finding and mitigating bias in ML models. A recent trend is using instance-based, model-agnostic explanation methods such as LIME to find bias in model predictions. Our work concentrates on finding the shortcomings of current bias measures and explanation methods. We show how our proposed method, based on K-nearest neighbors, can overcome those shortcomings and find the underlying bias of black-box models. Our results are more trustworthy and helpful for practitioners. Finally, we describe our future framework, which combines explanation and planning to build fair software.
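To make the K-nearest-neighbors idea from the abstract concrete, the sketch below probes a black-box classifier by checking whether instances that are close in the non-sensitive feature space but differ in the sensitive attribute receive different predictions. This is a minimal sketch under our own assumptions: the function knn_bias_probe and its flip-and-compare scoring are illustrative, not the paper's exact algorithm.

# Minimal sketch of a KNN-based bias probe for a black-box model.
# Assumption: the flagging rule below is illustrative, not the paper's method.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_bias_probe(model, X, sensitive_col, k=5):
    """Return the fraction of instances whose near neighbors (in the
    non-sensitive feature space) differ in the sensitive attribute and
    receive a different prediction from the black-box `model`."""
    # Find neighbors using only the non-sensitive features.
    nonsensitive = np.delete(X, sensitive_col, axis=1)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(nonsensitive)
    _, idx = nn.kneighbors(nonsensitive)   # idx[:, 0] is the point itself
    preds = model.predict(X)               # black-box access: predictions only

    flagged = 0
    for i, neighbors in enumerate(idx[:, 1:]):  # skip the self-match
        # Flag instance i if a similar individual with a different
        # sensitive-attribute value gets a different prediction.
        if any(X[i, sensitive_col] != X[j, sensitive_col]
               and preds[i] != preds[j] for j in neighbors):
            flagged += 1
    return flagged / len(X)

A higher returned fraction suggests the model's decisions track the sensitive attribute among otherwise-similar individuals, which is the kind of underlying bias the abstract describes finding in black-box models.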