On Distribution Shift in Learning-based Bug Detectors

Paper Title

On Distribution Shift in Learning-based Bug Detectors

Authors

Jingxuan He, Luca Beurer-Kellner, Martin Vechev

Abstract

Deep learning has recently achieved initial success in program analysis tasks such as bug detection. Lacking real bugs, most existing works construct training and test data by injecting synthetic bugs into correct programs. Despite achieving high test accuracy (e.g., 90%), the resulting bug detectors are found to be surprisingly unusable in practice, i.e., <10% precision when used to scan real software repositories. In this work, we argue that this massive performance difference is caused by a distribution shift, i.e., a fundamental mismatch between the real bug distribution and the synthetic bug distribution used to train and evaluate the detectors. To address this key challenge, we propose to train a bug detector in two phases, first on a synthetic bug distribution to adapt the model to the bug detection domain, and then on a real bug distribution to drive the model towards the real distribution. During these two phases, we leverage a multi-task hierarchy, focal loss, and contrastive learning to further boost performance. We evaluate our approach extensively on three widely studied bug types, for which we construct new datasets carefully designed to capture the real bug distribution. The results demonstrate that our approach is practically effective and successfully mitigates the distribution shift: our learned detectors are highly performant on both our test set and the latest version of open source repositories. Our code, datasets, and models are publicly available at https://github.com/eth-sri/learning-real-bug-detector.
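
The abstract names focal loss as one of the training ingredients but does not give its form. As a rough illustration only, the sketch below implements the standard binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), which is commonly used when one class (here, buggy code) is rare. The function name `focal_loss` and the default values of `gamma` and `alpha` are assumptions made for this example and are not taken from the paper or its repository.

```python
import math


def focal_loss(prob_buggy: float, is_buggy: bool,
               gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Standard binary focal loss for a single prediction.

    prob_buggy: predicted probability that the sample is buggy.
    is_buggy:   ground-truth label.
    gamma, alpha: illustrative defaults (assumed, not the paper's settings).
    """
    # p_t is the probability the model assigns to the true class.
    p_t = prob_buggy if is_buggy else 1.0 - prob_buggy
    # alpha_t weights the positive (buggy) and negative classes differently.
    alpha_t = alpha if is_buggy else 1.0 - alpha
    # Clamp to avoid log(0) on extreme predictions.
    p_t = min(max(p_t, 1e-12), 1.0)
    # The (1 - p_t)^gamma factor shrinks the loss of well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # A confident correct prediction contributes far less than a confident wrong one.
    print(focal_loss(0.9, True))
    print(focal_loss(0.1, True))
```

With `gamma = 0` the modulating factor disappears and the expression reduces to class-weighted cross-entropy, so `gamma` controls how strongly training focuses on hard examples.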
