Paper Title
Prediction Confidence from Neighbors
Paper Authors
Paper Abstract
The inability of Machine Learning (ML) models to successfully extrapolate correct predictions from out-of-distribution (OoD) samples is a major hindrance to the application of ML in critical applications. Until the generalization ability of ML methods is improved, it is necessary to keep humans in the loop. The need for human supervision can only be reduced if it is possible to determine a level of confidence in predictions, which can be used either to ask for human assistance or to abstain from making predictions. We show that feature space distance is a meaningful measure that can provide confidence in predictions. The distance between unseen samples and nearby training samples proves to be correlated with the prediction error of unseen samples. Depending on the acceptable degree of error, predictions can either be trusted or rejected based on the distance to training samples. Additionally, a novelty threshold can be used to decide whether a sample is worth adding to the training set. This enables earlier and safer deployment of models in critical applications and is vital for deploying models under ever-changing conditions.
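The abstract describes using the distance from an unseen sample to nearby training samples in feature space as a confidence signal. The following is a minimal sketch of that idea, not the authors' implementation: it assumes a precomputed feature representation of the training set and hypothetical `reject_threshold` and `novelty_threshold` values, which in practice would be chosen based on the acceptable degree of error.

```python
# Sketch: neighbor distance in feature space as a prediction-confidence signal.
# The feature vectors and both thresholds are assumptions for illustration only.
import numpy as np
from sklearn.neighbors import NearestNeighbors


def fit_reference(train_features: np.ndarray, k: int = 5) -> NearestNeighbors:
    """Index the training samples' feature vectors for nearest-neighbor lookup."""
    return NearestNeighbors(n_neighbors=k).fit(train_features)


def neighbor_distance(index: NearestNeighbors, features: np.ndarray) -> np.ndarray:
    """Mean distance from each unseen sample to its k nearest training samples."""
    dists, _ = index.kneighbors(features)
    return dists.mean(axis=1)


def triage(distance: float, reject_threshold: float, novelty_threshold: float) -> str:
    """Trust, reject, or flag a sample as novel based on its neighbor distance."""
    if distance <= reject_threshold:
        return "trust prediction"
    if distance >= novelty_threshold:
        return "novel: candidate for adding to the training set"
    return "reject / ask for human assistance"
```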