Hierarchical Residual Learning Based Vector Quantized Variational Autoencoder for Image Reconstruction and Generation

Paper title

Hierarchical Residual Learning Based Vector Quantized Variational Autoencoder for Image Reconstruction and Generation

Authors

Mohammad Adiban, Kalin Stefanov, Sabato Marco Siniscalchi, Giampiero Salvi

Abstract

We propose a multi-layer variational autoencoder method, which we call HR-VQVAE, that learns hierarchical discrete representations of the data. Using a novel objective function, each layer in HR-VQVAE learns a discrete representation of the residual from the previous layers through a vector quantized encoder. Furthermore, the representations at each layer are hierarchically linked to those at previous layers. We evaluate our method on the tasks of image reconstruction and generation. Experimental results demonstrate that the discrete representations learned by HR-VQVAE enable the decoder to reconstruct high-quality images with less distortion than the baseline methods, namely VQVAE and VQVAE-2. HR-VQVAE can also generate high-quality and diverse images, outperforming state-of-the-art generative models, which further verifies the efficiency of the learned representations. The hierarchical nature of HR-VQVAE i) reduces the decoding search time, making the method particularly suitable for high-load tasks, and ii) allows the codebook size to be increased without incurring the codebook collapse problem.
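
The idea of quantizing a residual with hierarchically linked codebooks can be illustrated with a small sketch. This is not the authors' implementation: the two-layer setup, the hand-written toy codebooks, the linking scheme (each first-layer codeword owns one second-layer codebook), and the names nearest and hierarchical_residual_quantize are all assumptions made for illustration.

```python
# Illustrative sketch only: toy codebooks and an assumed linking scheme,
# not the model or objective described in the paper.

def nearest(codebook, vec):
    """Return (index, codeword) of the codeword closest to vec (squared L2)."""
    best = min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )
    return best, codebook[best]


def hierarchical_residual_quantize(vec, top_codebook, child_codebooks):
    """Quantize vec with two layers.

    Layer 1 picks the nearest codeword in top_codebook. Layer 2 quantizes
    the remaining residual, searching only the child codebook linked to
    the layer-1 index (hypothetical linking, assumed for this sketch).
    """
    i1, q1 = nearest(top_codebook, vec)
    residual = [v - q for v, q in zip(vec, q1)]
    i2, q2 = nearest(child_codebooks[i1], residual)
    recon = [a + b for a, b in zip(q1, q2)]
    return (i1, i2), recon


# Hypothetical toy codebooks in 2-D.
top_codebook = [[0.0, 0.0], [4.0, 4.0]]
child_codebooks = [
    [[-1.0, -1.0], [1.0, 1.0]],  # refinements linked to top codeword 0
    [[-1.0, 1.0], [1.0, -1.0]],  # refinements linked to top codeword 1
]

codes, recon = hierarchical_residual_quantize(
    [4.8, 3.1], top_codebook, child_codebooks
)
print(codes, recon)
```

Under this assumed scheme, with m codewords per codebook and n layers, the encoder compares against roughly n·m codewords rather than all m^n combinations, which is one way a hierarchical link between layers could shorten the search.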
