Paper Title
Defending Adversarial Examples via DNN Bottleneck Reinforcement
Paper Authors
Paper Abstract
This paper presents a DNN bottleneck reinforcement scheme to alleviate the vulnerability of deep neural networks (DNNs) against adversarial attacks. A typical DNN classifier encodes the input image into a compressed latent representation that is more suitable for inference. This information bottleneck trades off the image-specific structure against the class-specific information in an image. By reinforcing the former while maintaining the latter, any redundant information, be it adversarial or not, should be removed from the latent representation. Hence, this paper proposes to jointly train an auto-encoder (AE) that shares its encoding weights with the visual classifier. To reinforce the information bottleneck, we introduce a multi-scale low-pass objective and multi-scale high-frequency communication for better frequency steering in the network. Unlike existing approaches, our scheme is the first reforming defense built into the classifier itself: it keeps the classifier structure untouched, appends no pre-processing head, and is trained with clean images only. Extensive experiments on MNIST, CIFAR-10 and ImageNet demonstrate that our method defends strongly against various adversarial attacks.
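The abstract does not spell out the training procedure; the following is a minimal PyTorch-style sketch of the joint-training idea it describes, i.e. a single encoder shared by a classification head and a decoder, with the reconstruction branch supervised by low-pass targets at several scales. All module names, layer sizes, and the choice of Gaussian blur as the low-pass operator are illustrative assumptions rather than the authors' exact architecture, and the multi-scale high-frequency communication paths are omitted.

```python
# Sketch (assumptions): a shared encoder feeds both a classifier head and a
# decoder; the decoder is trained against multi-scale low-pass (blurred) targets.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.transforms.functional as TF

class SharedEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
    def forward(self, x):
        return self.conv(x)                 # latent feature map (B, 64, H/4, W/4)

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )
    def forward(self, z):
        return self.deconv(z)               # reconstructed image in [0, 1]

class ClassifierHead(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.fc = nn.Linear(64, num_classes)
    def forward(self, z):
        return self.fc(z.mean(dim=(2, 3)))  # global average pool + linear

encoder, decoder, head = SharedEncoder(), Decoder(), ClassifierHead()
params = list(encoder.parameters()) + list(decoder.parameters()) + list(head.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)

def multiscale_lowpass_loss(recon, image, sigmas=(1.0, 2.0, 4.0)):
    # Compare the reconstruction against Gaussian-blurred (low-pass) versions of
    # the clean input at several scales -- an assumed stand-in for the paper's
    # multi-scale low-pass objective.
    loss = 0.0
    for sigma in sigmas:
        k = int(2 * round(3 * sigma) + 1)   # odd kernel size covering ~3 sigma
        target = TF.gaussian_blur(image, kernel_size=k, sigma=sigma)
        loss = loss + F.mse_loss(recon, target)
    return loss / len(sigmas)

def train_step(image, label, alpha=1.0):
    # Joint objective: classification loss on the shared latent plus the
    # low-pass reconstruction loss, trained on clean images only.
    z = encoder(image)
    logits = head(z)
    recon = decoder(z)
    loss = F.cross_entropy(logits, label) + alpha * multiscale_lowpass_loss(recon, image)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this reading, the weight alpha balances how strongly the reconstruction branch shapes the shared encoder; its value, like the rest of the sketch, is a placeholder rather than a setting reported in the paper.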