Paper Title

Understanding Entropy Coding With Asymmetric Numeral Systems (ANS): a Statistician's Perspective

Author

Bamler, Robert

Abstract

Entropy coding is the backbone of data compression. Novel machine-learning-based compression methods often use a new entropy coder called Asymmetric Numeral Systems (ANS) [Duda et al., 2015], which achieves bitrates very close to optimal and simplifies advanced compression techniques such as bits-back coding [Townsend et al., 2019]. However, researchers with a background in machine learning often struggle to understand how ANS works, which prevents them from exploiting its full versatility. This paper is meant as an educational resource that makes ANS more approachable by presenting it from a new perspective: that of latent variable models and the so-called bits-back trick. We guide the reader step by step to a complete implementation of ANS in the Python programming language, which we then generalize for more advanced use cases. We also present and empirically evaluate an open-source library of various entropy coders designed for both research and production use. Related teaching videos and problem sets are available online.
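
To give a rough feel for what an ANS coder does, here is a minimal sketch in Python. It is not the implementation developed in the paper: the function names (`build_cdf`, `ans_encode`, `ans_decode`) are hypothetical, and the whole compressed message is kept in a single arbitrary-precision integer, whereas practical coders stream fixed-size words in and out of a bounded state. The sketch assumes the model is given as integer frequencies that sum to `2**precision`.

```python
from bisect import bisect_right
from itertools import accumulate


def build_cdf(freqs):
    """Cumulative frequencies: cdf[s] is the total frequency of symbols < s."""
    return [0] + list(accumulate(freqs))


def ans_encode(symbols, freqs, precision):
    """Encode `symbols` into one big integer (illustrative, non-streaming)."""
    assert sum(freqs) == 1 << precision
    cdf = build_cdf(freqs)
    state = 0
    # ANS behaves like a stack, so we encode in reverse order to let the
    # decoder recover the symbols in their original order.
    for s in reversed(symbols):
        f = freqs[s]
        assert f > 0, "cannot encode a symbol with zero frequency"
        state = ((state // f) << precision) + cdf[s] + (state % f)
    return state


def ans_decode(state, count, freqs, precision):
    """Decode `count` symbols from `state`, inverting `ans_encode`."""
    cdf = build_cdf(freqs)
    mask = (1 << precision) - 1
    symbols = []
    for _ in range(count):
        slot = state & mask                # low bits identify the symbol
        s = bisect_right(cdf, slot) - 1    # cdf[s] <= slot < cdf[s + 1]
        symbols.append(s)
        state = freqs[s] * (state >> precision) + (slot - cdf[s])
    return symbols


# Usage: four symbols with probabilities 8/16, 4/16, 3/16, 1/16.
freqs = [8, 4, 3, 1]
message = [0, 1, 2, 1, 0, 0, 3]
compressed = ans_encode(message, freqs, precision=4)
assert ans_decode(compressed, len(message), freqs, precision=4) == message
```

Each encoding step grows the state by roughly a factor of `2**precision / freqs[s]`, so a probable symbol adds fewer bits than an improbable one. This is the sense in which the bitrate tracks the information content of the message.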
