Paper Title

DPVIm: Differentially Private Variational Inference Improved

Paper Authors

Joonas Jälkö, Lukas Prediger, Antti Honkela, Samuel Kaski

Paper Abstract

Differentially private (DP) release of multidimensional statistics typically considers an aggregate sensitivity, e.g. the vector norm of a high-dimensional vector. However, different dimensions of that vector may have widely different magnitudes, so the DP perturbation affects the signal disproportionately across dimensions. We observe this problem in the gradient release of the DP-SGD algorithm when using it for variational inference (VI), where it manifests as poor convergence and high variance in the outputs for certain variational parameters, and make the following contributions: (i) We mathematically isolate the cause of the difference in magnitudes between gradient parts corresponding to different variational parameters. Using this as prior knowledge, we establish a link between the gradients of the variational parameters and propose an efficient yet simple fix that yields a less noisy gradient estimator, which we call $\textit{aligned}$ gradients. This approach allows us to obtain the updates for the covariance parameter of a Gaussian posterior approximation without a privacy cost. We compare this to alternative approaches for scaling the gradients using analytically derived preconditioning, e.g. natural gradients. (ii) We suggest using iterate averaging over the DP parameter traces recovered during training to reduce the DP-induced noise in parameter estimates at no additional cost in privacy. Finally, (iii) to accurately capture the additional uncertainty that DP introduces into the model parameters, we infer the DP-induced noise from the parameter traces and include it in the learned posteriors to make them $\textit{noise aware}$. We demonstrate the efficacy of our proposed improvements through various experiments on real data.
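The mechanism behind contributions (i) and (ii) can be sketched in a few lines. For a mean-field Gaussian approximation with the reparameterization z = m + s ⊙ η, the chain rule links the gradient with respect to the (log-)scale to the gradient with respect to the mean, scaled elementwise by η; so only the mean gradient needs to be privatized, and deriving the scale gradient from it, like averaging the iterate trace, is post-processing with no privacy cost. The sketch below is a minimal illustration under these assumptions, not the paper's implementation; the names `dp_vi_step_aligned`, `per_example_grad_z`, and `iterate_average` are hypothetical.

```python
import numpy as np

def dp_vi_step_aligned(per_example_grad_z, batch, m, log_s,
                       clip_norm, noise_mult, lr, rng):
    """One DP-SGD step for mean-field Gaussian VI with 'aligned' gradients.

    Sketch only: per_example_grad_z(x, z) is assumed to return the
    gradient of example x's loss (negative log-likelihood, with the
    prior folded in) with respect to z.
    """
    s = np.exp(log_s)
    eta = rng.standard_normal(m.shape)   # shared reparameterization draw
    z = m + s * eta                      # z = m + s * eta, eta ~ N(0, I)

    # Per-example gradients w.r.t. the mean m (equal to gradients w.r.t. z).
    grads = np.stack([per_example_grad_z(x, z) for x in batch])

    # Standard DP-SGD privatization: clip per-example norms, add Gaussian noise.
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    clipped = grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    noise = rng.normal(0.0, noise_mult * clip_norm, size=m.shape)
    g_m = (clipped.sum(axis=0) + noise) / len(batch)

    # 'Aligned' scale gradient: reuse the already-privatized mean gradient
    # (post-processing, no extra privacy cost). The entropy term of the
    # ELBO is data-independent and contributes an exact -1 per coordinate.
    g_log_s = g_m * eta * s - 1.0

    return m - lr * g_m, log_s - lr * g_log_s

def iterate_average(trace, burn_in=0):
    """Contribution (ii): average the privatized parameter trace after
    burn-in. Operates only on released iterates, so no privacy cost."""
    return np.asarray(trace)[burn_in:].mean(axis=0)
```

A typical loop would call `dp_vi_step_aligned` once per batch while appending `(m, log_s)` to a trace, then report `iterate_average(trace, burn_in)` as the final estimate; the noise-aware posterior of contribution (iii) would additionally widen the reported variances by the noise level estimated from that same trace.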
