Paper title
A simple approach for quantizing neural networks
Paper authors
Paper abstract
In this short note, we propose a new method for quantizing the weights of a fully trained neural network. A simple deterministic pre-processing step allows us to quantize network layers via memoryless scalar quantization while preserving the network performance on given training data. On the one hand, the computational complexity of this pre-processing slightly exceeds that of state-of-the-art algorithms in the literature. On the other hand, our approach does not require any hyper-parameter tuning and, in contrast to previous methods, allows a straightforward analysis. We provide rigorous theoretical guarantees in the case of quantizing single network layers and show that the relative error decays with the number of parameters in the network if the training data behaves well, e.g., if it is sampled from suitable random distributions. The developed method also readily allows the quantization of deep networks by consecutive application to single layers.
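To make the memoryless scalar quantization (MSQ) component concrete, the following is a minimal NumPy sketch: each weight of a single layer is rounded independently to the nearest element of a fixed finite alphabet, and the relative error of the quantized layer is measured on training data, mirroring the error metric the abstract refers to. The uniform alphabet construction, the Gaussian data, and the function name `msq` are illustrative assumptions; the deterministic pre-processing step the note actually proposes is not specified in the abstract and is therefore omitted here.

```python
import numpy as np

def msq(weights, alphabet):
    """Memoryless scalar quantization: round each weight independently
    to the nearest element of a fixed finite alphabet."""
    # Broadcast (..., 1) against (A,) to find the nearest alphabet entry.
    idx = np.argmin(np.abs(weights[..., None] - alphabet), axis=-1)
    return alphabet[idx]

# Hypothetical setup: one fully connected layer W applied to data X.
rng = np.random.default_rng(0)
X = rng.standard_normal((512, 100))   # training data, rows are samples
W = rng.standard_normal((100, 50))    # trained layer weights

# Illustrative uniform alphabet with 2^b + 1 levels spanning the weight
# range (this replaces, not reproduces, the paper's pre-processing).
b = 4
delta = np.abs(W).max() / (2 ** (b - 1))
alphabet = delta * np.arange(-(2 ** (b - 1)), 2 ** (b - 1) + 1)

Q = msq(W, alphabet)

# Relative error of the quantized layer on the training data.
rel_err = np.linalg.norm(X @ W - X @ Q) / np.linalg.norm(X @ W)
print(f"relative error on training data: {rel_err:.3f}")
```

For a deep network, the same routine would be applied layer by layer, quantizing each layer before moving to the next, as the abstract's closing sentence suggests.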