Paper Title

On Finite-Sample Identifiability of Contrastive Learning-Based Nonlinear Independent Component Analysis

Paper Authors

Qi Lyu, Xiao Fu

Paper Abstract

Nonlinear independent component analysis (nICA) aims at recovering statistically independent latent components that are mixed by unknown nonlinear functions. Central to nICA is the identifiability of the latent components, which had been elusive until very recently. Specifically, Hyvärinen et al. have shown that the nonlinearly mixed latent components are identifiable (up to often inconsequential ambiguities) under a generalized contrastive learning (GCL) formulation, given that the latent components are independent conditioned on a certain auxiliary variable. The GCL-based identifiability of nICA is elegant, and it establishes interesting connections between nICA and popular unsupervised/self-supervised learning paradigms in representation learning, causal learning, and factor disentanglement. However, existing identifiability analyses of nICA all build upon an unlimited-sample assumption and the use of ideal universal function learners, which creates a non-negligible gap between theory and practice. Closing the gap is a nontrivial challenge, as there is no established "textbook" routine for the finite-sample analysis of such unsupervised problems. This work puts forth a finite-sample identifiability analysis of GCL-based nICA. Our analytical framework judiciously combines the properties of the GCL loss function, statistical generalization analysis, and numerical differentiation. Our framework also takes the learned function's approximation error into consideration, and reveals an intuitive trade-off between the complexity and expressiveness of the employed function learner. Numerical experiments are used to validate the theorems.
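To make the GCL formulation concrete, below is a minimal sketch of the training objective it implies, assuming PyTorch. The names `Demixer`, `Regressor`, and `gcl_step`, along with all network sizes, are illustrative choices and not the authors' implementation. GCL trains a logistic-regression classifier to distinguish true pairs (x, u) from pairs whose auxiliary variable u has been shuffled, with the regression function constrained to the additive form r(x, u) = Σ_i ψ_i(f_i(x), u); the inner network f then serves as the demixer whose outputs, per the identifiability theory, recover the latent components up to per-component ambiguities.

```python
# Minimal sketch of GCL-based nICA training (assumes PyTorch).
# Class/function names and architecture sizes are hypothetical.
import torch
import torch.nn as nn

latent_dim, obs_dim, aux_dim = 3, 3, 1

class Demixer(nn.Module):
    """Learned function f: x -> z. Under the identifiability theory,
    z recovers the latent components up to per-component transforms."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, latent_dim),
        )

    def forward(self, x):
        return self.net(x)

class Regressor(nn.Module):
    """Scores a pair (x, u) with the GCL additive constraint
    r(x, u) = sum_i psi_i(f_i(x), u)."""
    def __init__(self, demixer):
        super().__init__()
        self.demixer = demixer
        # one small psi-network per latent component
        self.psi = nn.ModuleList(
            nn.Sequential(nn.Linear(1 + aux_dim, 32), nn.ReLU(), nn.Linear(32, 1))
            for _ in range(latent_dim)
        )

    def forward(self, x, u):
        z = self.demixer(x)
        scores = [p(torch.cat([z[:, i:i + 1], u], dim=1))
                  for i, p in enumerate(self.psi)]
        return torch.stack(scores, dim=0).sum(dim=0).squeeze(-1)

def gcl_step(model, x, u, optimizer):
    """One logistic-regression step: real pairs (x, u) labeled 1 vs.
    pairs with a shuffled auxiliary variable labeled 0."""
    u_perm = u[torch.randperm(u.shape[0])]  # mismatched (contrastive) pairs
    logits = torch.cat([model(x, u), model(x, u_perm)])
    labels = torch.cat([torch.ones(x.shape[0]), torch.zeros(x.shape[0])])
    loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In the paper's finite-sample analysis, the quantities this sketch exposes, namely the GCL logistic loss, the generalization gap of the learned regressor over finite samples, and (numerical) differentiation of the learned f, are the ingredients combined to bound how far the recovered components can deviate from identifiability.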
