Universal Adversarial Directions

Paper Title

Universal Adversarial Directions

Authors

Choi, Ching Lam; Farnia, Farzan

Abstract

Despite their great success in image recognition tasks, deep neural networks (DNNs) have been observed to be susceptible to universal adversarial perturbations (UAPs) which perturb all input samples with a single perturbation vector. However, UAPs often struggle in transferring across DNN architectures and lead to challenging optimization problems. In this work, we study the transferability of UAPs by analyzing equilibrium in the universal adversarial example game between the classifier and UAP adversary players. We show that under mild assumptions the universal adversarial example game lacks a pure Nash equilibrium, indicating UAPs' suboptimal transferability across DNN classifiers. To address this issue, we propose Universal Adversarial Directions (UADs) which only fix a universal direction for adversarial perturbations and allow the perturbations' magnitude to be chosen freely across samples. We prove that the UAD adversarial example game can possess a Nash equilibrium with a pure UAD strategy, implying the potential transferability of UADs. We also connect the UAD optimization problem to the well-known principal component analysis (PCA) and develop an efficient PCA-based algorithm for optimizing UADs. We evaluate UADs over multiple benchmark image datasets. Our numerical results show the superior transferability of UADs over standard gradient-based UAPs.
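The idea of fixing one shared direction while letting the magnitude vary per sample can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a first-order (linearized) loss and a quadratic penalty `lam` on each sample's step size, under which the best shared unit direction is the leading principal direction of the per-sample loss gradients. The function names, the `lam` parameter, and the toy gradients are all hypothetical.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_direction(grads, iters=200, seed=0):
    """Leading eigenvector of sum_i g_i g_i^T, found by power iteration.

    Assumption: under a linearized loss with a quadratic magnitude penalty,
    this unit vector maximizes sum_i (g_i . d)^2, i.e. it is the first
    (uncentered) principal direction of the gradients.
    """
    rng = random.Random(seed)
    dim = len(grads[0])
    d = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for _ in range(iters):
        # Apply (G^T G) to d without forming the matrix: sum_i (g_i . d) g_i
        new = [0.0] * dim
        for g in grads:
            c = dot(g, d)
            for j in range(dim):
                new[j] += c * g[j]
        norm = math.sqrt(dot(new, new))
        if norm == 0.0:
            break  # degenerate case: all gradients orthogonal to d
        d = [v / norm for v in new]
    return d


def uad_perturbations(grads, direction, lam=1.0):
    """Per-sample perturbations t_i * direction with a shared direction.

    Assumption: t_i = (g_i . d) / lam maximizes t * (g_i . d) - lam/2 * t^2,
    so the sign and size of the step differ across samples.
    """
    perturbations = []
    for g in grads:
        t = dot(g, direction) / lam
        perturbations.append([t * dj for dj in direction])
    return perturbations


# Toy per-sample loss gradients (hypothetical values, mostly along axis 0).
grads = [[3.0, 0.1], [-2.0, 0.2], [4.0, -0.1]]
d = top_direction(grads)
deltas = uad_perturbations(grads, d, lam=2.0)
for g, delta in zip(grads, deltas):
    print(delta, "linearized loss gain:", dot(g, delta))
```

In contrast, a UAP in this toy setting would add the same vector to every sample. Here the second sample, whose gradient points the opposite way, receives a step with the opposite sign along the same direction.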
