Paper title
Locally optimal detection of stochastic targeted universal adversarial perturbations
Paper authors
Paper abstract
Deep learning image classifiers are known to be vulnerable to small adversarial perturbations of input images. In this paper, we derive the locally optimal generalized likelihood ratio test (LO-GLRT) based detector for detecting stochastic targeted universal adversarial perturbations (UAPs) of the classifier inputs. We also describe a supervised training method to learn the detector's parameters, and demonstrate better performance of the detector compared to other detection methods on several popular image classification datasets.