Locally optimal detection of stochastic targeted universal adversarial perturbations

Paper title

Locally optimal detection of stochastic targeted universal adversarial perturbations

Authors

Amish Goel, Pierre Moulin

Abstract

Deep learning image classifiers are known to be vulnerable to small adversarial perturbations of input images. In this paper, we derive the locally optimal generalized likelihood ratio test (LO-GLRT) based detector for detecting stochastic targeted universal adversarial perturbations (UAPs) of the classifier inputs. We also describe a supervised training method to learn the detector's parameters, and demonstrate better performance of the detector compared to other detection methods on several popular image classification datasets.
