Sparse PCA: Algorithms, Adversarial Perturbations and Certificates

Paper title

Sparse PCA: Algorithms, Adversarial Perturbations and Certificates

Authors

Tommaso d'Orsi, Pravesh K. Kothari, Gleb Novikov, David Steurer

Abstract

We study efficient algorithms for Sparse PCA in standard statistical models (spiked covariance in its Wishart form). Our goal is to achieve optimal recovery guarantees while being resilient to small perturbations. Despite a long history of prior works, including explicit studies of perturbation resilience, the best known algorithmic guarantees for Sparse PCA are fragile and break down under small adversarial perturbations. We observe a basic connection between perturbation resilience and certifying algorithms, that is, algorithms based on certificates of upper bounds on sparse eigenvalues of random matrices. In contrast to other techniques, such certifying algorithms, including the brute-force maximum likelihood estimator, are automatically robust against small adversarial perturbations. We use this connection to obtain the first polynomial-time algorithms for this problem that are resilient against additive adversarial perturbations, by obtaining new efficient certificates for upper bounds on sparse eigenvalues of random matrices. Our algorithms are based either on basic semidefinite programming or on its low-degree sum-of-squares strengthening, depending on the parameter regime. Their guarantees either match or approach the best known guarantees of fragile algorithms in terms of the sparsity of the unknown vector, the number of samples and the ambient dimension. To complement our algorithmic results, we prove rigorous lower bounds matching the gap between fragile and robust polynomial-time algorithms in a natural computational model based on low-degree polynomials (closely related to the pseudo-calibration technique for sum-of-squares lower bounds) that is known to capture the best known guarantees for related statistical estimation problems. The combination of these results provides formal evidence of an inherent price to pay to achieve robustness.
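
To make the terms "spiked covariance in its Wishart form", "sparse eigenvalue" and "brute-force estimator" concrete, here is a minimal pure-Python sketch. It is not the paper's semidefinite-programming or sum-of-squares algorithm. It assumes the common single-spike formulation Y = sqrt(beta) * u v^T + W with Gaussian u and W and a k-sparse unit vector v, and it computes the k-sparse eigenvalue of the empirical covariance by exhaustive search over supports, which takes exponential time. All function and parameter names (sample_spiked_wishart, empirical_covariance, top_eigenvalue, sparse_eigenvalue, beta, seed) are hypothetical and chosen for illustration only.

```python
import itertools
import math
import random


def sample_spiked_wishart(n, d, k, beta, seed=0):
    """Draw Y = sqrt(beta) * u v^T + W (assumed model), with v a k-sparse unit vector."""
    rng = random.Random(seed)
    support = rng.sample(range(d), k)
    v = [0.0] * d
    for i in support:
        v[i] = rng.choice([-1.0, 1.0]) / math.sqrt(k)
    rows = []
    for _ in range(n):
        u_t = rng.gauss(0.0, 1.0)  # spike coefficient for this sample
        rows.append([math.sqrt(beta) * u_t * v[j] + rng.gauss(0.0, 1.0)
                     for j in range(d)])
    return rows, v


def empirical_covariance(rows):
    """Return Y^T Y / n as a list of lists (symmetric, positive semidefinite)."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(d)]
            for i in range(d)]


def top_eigenvalue(matrix, iterations=500, seed=0):
    """Approximate the largest eigenvalue of a symmetric PSD matrix by power iteration."""
    size = len(matrix)
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(size)]
    for _ in range(iterations):
        y = [sum(matrix[i][j] * x[j] for j in range(size)) for i in range(size)]
        norm = math.sqrt(sum(t * t for t in y))
        if norm == 0.0:
            # The matrix maps x to zero; with a random start this
            # essentially only happens for the zero matrix.
            return 0.0
        x = [t / norm for t in y]
    mx = [sum(matrix[i][j] * x[j] for j in range(size)) for i in range(size)]
    return sum(a * b for a, b in zip(x, mx))  # Rayleigh quotient of the unit vector x


def sparse_eigenvalue(matrix, k):
    """Brute force: maximise x^T M x over k-sparse unit vectors x.

    Tries every support of size k, so the cost grows like (d choose k).
    """
    best_value, best_support = -math.inf, None
    for support in itertools.combinations(range(len(matrix)), k):
        sub = [[matrix[i][j] for j in support] for i in support]
        value = top_eigenvalue(sub)
        if value > best_value:
            best_value, best_support = value, support
    return best_value, best_support


if __name__ == "__main__":
    rows, v = sample_spiked_wishart(n=200, d=8, k=2, beta=4.0, seed=1)
    value, support = sparse_eigenvalue(empirical_covariance(rows), k=2)
    true_support = tuple(i for i, t in enumerate(v) if t != 0.0)
    print("estimated support:", support, "true support:", true_support)
    print("k-sparse eigenvalue of the empirical covariance:", value)
```

The exhaustive search stands in for the brute-force estimator mentioned in the abstract. An efficient certificate, by contrast, would have to upper-bound the quantity returned by sparse_eigenvalue without enumerating supports.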
