Paper Title
Double Targeted Universal Adversarial Perturbations
Paper Authors
Paper Abstract
Despite their impressive performance, deep neural networks (DNNs) are widely known to be vulnerable to adversarial attacks, which makes it challenging to deploy them in security-sensitive applications such as autonomous driving. Image-dependent perturbations can fool a network for one specific image, while universal adversarial perturbations can fool a network for samples from all classes without selection. We introduce the double targeted universal adversarial perturbation (DT-UAP) to bridge the gap between instance-discriminative image-dependent perturbations and generic universal perturbations. This universal perturbation pushes samples from one targeted source class to a targeted sink class, while having only a limited adversarial effect on the other, non-targeted source classes, so as to avoid raising suspicion. Since it targets a source class and a sink class simultaneously, we term it the double targeted attack (DTA). DTA gives an attacker the freedom to perform precise attacks on a DNN model while raising little suspicion. We show the effectiveness of the proposed DTA algorithm on a wide range of datasets and also demonstrate its potential as a physical attack.
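To make the stated objective concrete, below is a minimal PyTorch sketch of one plausible way to optimize such a perturbation: a joint loss that pushes targeted source-class samples toward the sink class while anchoring non-targeted samples to their original labels. The function name `train_dt_uap`, the loss weight `lam`, the L-infinity budget `eps`, and the 224x224 input size are all illustrative assumptions; this is not the paper's actual algorithm.

```python
# Minimal sketch of a double-targeted UAP objective (illustrative
# assumptions, not the paper's exact method): one shared perturbation
# `delta` is trained so that source-class samples are classified as the
# sink class, while non-targeted samples keep their original labels.
import torch
import torch.nn.functional as F

def train_dt_uap(model, loader, source_cls, sink_cls,
                 eps=10 / 255, epochs=5, lr=0.01, lam=1.0):
    # `model` is a frozen classifier; `loader` yields (images, labels)
    # with pixel values in [0, 1]. Input size 224x224 is an assumption.
    delta = torch.zeros(1, 3, 224, 224, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    model.eval()
    for _ in range(epochs):
        for x, y in loader:
            logits = model(torch.clamp(x + delta, 0.0, 1.0))
            src = y == source_cls
            loss = torch.zeros(())
            if src.any():
                # push targeted source-class samples toward the sink class
                sink = torch.full((int(src.sum()),), sink_cls,
                                  dtype=torch.long)
                loss = loss + F.cross_entropy(logits[src], sink)
            if (~src).any():
                # keep non-targeted samples on their original labels
                loss = loss + lam * F.cross_entropy(logits[~src], y[~src])
            opt.zero_grad()
            loss.backward()
            opt.step()
            with torch.no_grad():
                delta.clamp_(-eps, eps)  # enforce an L_inf norm budget
    return delta.detach()
```

The weight `lam` trades off attack strength on the targeted source class against stealth on the non-targeted classes, mirroring the "limited adversarial effect on other non-targeted source classes" constraint described in the abstract.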