Paper Title

Private Estimation with Public Data

Paper Authors

Alex Bie, Gautam Kamath, Vikrant Singhal

Paper Abstract

We initiate the study of differentially private (DP) estimation with access to a small amount of public data. For private estimation of d-dimensional Gaussians, we assume that the public data comes from a Gaussian that may have vanishing similarity in total variation distance with the underlying Gaussian of the private data. We show that under the constraints of pure or concentrated DP, d+1 public data samples are sufficient to remove any dependence on the range parameters of the private data distribution from the private sample complexity, which is known to be otherwise necessary without public data. For separated Gaussian mixtures, we assume that the underlying public and private distributions are the same, and we consider two settings: (1) when given a dimension-independent amount of public data, the private sample complexity can be improved polynomially in terms of the number of mixture components, and any dependence on the range parameters of the distribution can be removed in the approximate DP case; (2) when given an amount of public data linear in the dimension, the private sample complexity can be made independent of range parameters even under concentrated DP, and additional improvements can be made to the overall sample complexity.
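To make the core idea concrete, here is a minimal illustrative sketch (not the paper's actual algorithm) of how a handful of public samples can remove the dependence on a priori range parameters: the public samples give a coarse center, private points are clipped to a ball around that center, and Gaussian-mechanism noise is calibrated to the clipping radius rather than to any global range bound. The function name, the radius heuristic, and the unit-covariance assumption are all hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_mean_with_public_data(private_x, public_x, eps=1.0, delta=1e-5,
                             radius_mult=4.0):
    """Illustrative sketch: DP mean estimation aided by a few public samples.

    The public samples provide a coarse center, so the DP noise scale
    depends only on the clipping radius, not on a range bound for the mean.
    Assumes roughly unit covariance (a simplifying assumption, not from
    the paper).
    """
    # Coarse, non-private center computed from the few public samples.
    center = public_x.mean(axis=0)

    # Clip private points to a ball of radius r around the public center.
    r = radius_mult * np.sqrt(private_x.shape[1])
    diffs = private_x - center
    norms = np.linalg.norm(diffs, axis=1, keepdims=True)
    clipped = center + diffs * np.minimum(1.0, r / np.maximum(norms, 1e-12))

    # Gaussian mechanism: after clipping, the L2 sensitivity of the
    # empirical mean is 2r/n (swapping one point moves it by at most 2r/n).
    n = private_x.shape[0]
    sensitivity = 2.0 * r / n
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return clipped.mean(axis=0) + rng.normal(0.0, sigma, size=private_x.shape[1])

# Demo: d-dimensional Gaussian whose mean is far from the origin,
# with d + 1 public samples (matching the count from the abstract).
d, n = 5, 20000
mu = np.full(d, 100.0)  # no range bound on the mean is used anywhere
private_x = rng.normal(mu, 1.0, size=(n, d))
public_x = rng.normal(mu, 1.0, size=(d + 1, d))
est = dp_mean_with_public_data(private_x, public_x)
```

Without public data, a pure- or concentrated-DP estimator would need the noise (or a localization step) to scale with an assumed range for the mean; here the public center replaces that assumption, which is the phenomenon the paper's sample-complexity bounds formalize.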
