Paper Title

Active Bayesian Assessment for Black-Box Classifiers

Paper Authors

Disi Ji, Robert L. Logan IV, Padhraic Smyth, Mark Steyvers

Paper Abstract

Recent advances in machine learning have led to increased deployment of black-box classifiers across a wide variety of applications. In many such situations there is a critical need to both reliably assess the performance of these pre-trained models and to perform this assessment in a label-efficient manner (given that labels may be scarce and costly to collect). In this paper, we introduce an active Bayesian approach for assessment of classifier performance to satisfy the desiderata of both reliability and label-efficiency. We begin by developing inference strategies to quantify uncertainty for common assessment metrics such as accuracy, misclassification cost, and calibration error. We then propose a general framework for active Bayesian assessment using inferred uncertainty to guide efficient selection of instances for labeling, enabling better performance assessment with fewer labels. We demonstrate significant gains from our proposed active Bayesian approach via a series of systematic empirical experiments assessing the performance of modern neural classifiers (e.g., ResNet and BERT) on several standard image and text classification datasets.
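The abstract describes two technical ingredients: Bayesian posteriors over assessment metrics (such as per-confidence-bin accuracy) and active, uncertainty-guided selection of which instances to label. The sketch below is a minimal illustration of that idea under simplified assumptions, not the paper's exact algorithm: it places Beta-Bernoulli posteriors on the accuracy of each confidence bin of a simulated classifier and uses Thompson sampling to decide which bin to query next. The pool, bin count, and simulated correctness labels are all hypothetical.

```python
# Minimal sketch (not the authors' implementation): Bayesian assessment of
# classifier accuracy with a Beta-Bernoulli model per confidence bin, using
# Thompson sampling to pick which bin to query for a human label next.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unlabeled pool: model confidences and hidden correctness
# indicators. In practice `correct` is revealed only when a label is purchased.
n_pool = 5000
confidence = rng.uniform(0.5, 1.0, size=n_pool)
correct = rng.binomial(1, confidence)           # simulate a roughly calibrated model

n_bins = 10
bins = np.minimum((confidence * n_bins).astype(int), n_bins - 1)

# Beta(1, 1) prior on the accuracy of each confidence bin.
alpha = np.ones(n_bins)
beta = np.ones(n_bins)
unlabeled = [list(np.where(bins == g)[0]) for g in range(n_bins)]

label_budget = 500
for _ in range(label_budget):
    # Thompson sampling: draw a plausible accuracy for each bin from its
    # posterior and query the bin whose draw suggests the most error-prone
    # region (one simple notion of "most informative" for assessment).
    theta = rng.beta(alpha, beta)
    order = np.argsort(theta)                   # lowest sampled accuracy first
    g = next(g for g in order if unlabeled[g])  # skip bins with no items left
    i = unlabeled[g].pop()

    # "Purchase" the label and update that bin's posterior counts.
    alpha[g] += correct[i]
    beta[g] += 1 - correct[i]

# Posterior mean accuracy per bin, weighted by bin frequency, gives an
# estimate of overall accuracy from far fewer labels than the full pool.
post_mean = alpha / (alpha + beta)
weights = np.bincount(bins, minlength=n_bins) / n_pool
print("Estimated overall accuracy:", float(post_mean @ weights))
```

The same posterior machinery extends to other metrics mentioned in the abstract (misclassification cost, calibration error) by changing what is aggregated over bins; the choice of which bin to sample would then target the uncertainty of that metric instead.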
