Paper Title
Cryptanalytic Extraction of Neural Network Models
Paper Authors
Paper Abstract
We argue that the machine learning problem of model extraction is actually a cryptanalytic problem in disguise, and should be studied as such. Given oracle access to a neural network, we introduce a differential attack that can efficiently steal the parameters of the remote model up to floating point precision. Our attack relies on the fact that ReLU neural networks are piecewise linear functions, and thus queries at the critical points reveal information about the model parameters. We evaluate our attack on multiple neural network models and extract models that are 2^20 times more precise and require 100x fewer queries than prior work. For example, we extract a 100,000 parameter neural network trained on the MNIST digit recognition task with 2^21.5 queries in under an hour, such that the extracted model agrees with the oracle on all inputs up to a worst-case error of 2^-25, or a model with 4,000 parameters in 2^18.5 queries with worst-case error of 2^-40.4. Code is available at https://github.com/google-research/cryptanalytic-model-extraction.
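The abstract's core observation, that a ReLU network is piecewise linear and so its critical points leak parameter information, can be sketched in a few lines. The following is a minimal illustration only, not the paper's actual attack: it uses a hypothetical single-neuron "remote" model as the oracle, finds a critical point by binary search along a segment, and recovers the hidden weights up to a scalar from the jump in the finite-difference gradient across that point.

```python
import numpy as np

# Hypothetical "remote" model standing in for the oracle: one ReLU
# neuron f(x) = a * relu(w.x + b). The paper targets full networks;
# this only illustrates the critical-point idea.
w = np.array([1.5, -2.0])
b = 0.3
a = 0.7

def oracle(x):
    """Black-box query access: returns f(x) only, never w, b, or a."""
    return a * max(w @ x + b, 0.0)

def find_critical_point(x0, x1, tol=1e-8, lin_tol=1e-10):
    """Binary search along the segment x0 -> x1 for a point where f
    stops being locally linear, i.e. where the ReLU flips sign."""
    def is_linear(lo, hi):
        mid = (lo + hi) / 2
        # On a single linear piece, f(mid) equals the endpoint average.
        return abs(oracle(mid) - (oracle(lo) + oracle(hi)) / 2) < lin_tol
    lo, hi = x0, x1
    while np.linalg.norm(hi - lo) > tol:
        mid = (lo + hi) / 2
        if is_linear(lo, mid):
            lo = mid  # left half is linear, so the kink is in the right half
        else:
            hi = mid
    return (lo + hi) / 2

def gradient(x, eps=1e-5):
    """Central finite-difference gradient of the oracle at x."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (oracle(x + e) - oracle(x - e)) / (2 * eps)
    return g

# Endpoints chosen so the neuron is active at x0 and inactive at x1.
x0, x1 = np.array([-1.0, -1.0]), np.array([1.0, 1.0])
x_c = find_critical_point(x0, x1)  # at x_c, w @ x_c + b is ~0

# The gradient jump across the critical point equals a * w, so the
# hidden weight vector is recovered up to the unknown scalar a.
dg = gradient(x_c + 0.1 * (x0 - x1)) - gradient(x_c + 0.1 * (x1 - x0))
print(dg / dg[1])  # proportional to w, normalized by its last entry
```

Each binary-search step costs a constant number of oracle queries, which is why critical-point searches dominate the query counts (2^18.5 to 2^21.5) reported above; scaling this idea to deep networks, and disentangling which neuron each critical point belongs to, is the substance of the paper.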