Paper Title
Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees
Paper Authors
Paper Abstract
Graph representation learning is a ubiquitous task in machine learning whose goal is to embed each vertex into a low-dimensional vector space. We consider bipartite graphs and formalize their representation learning as a statistical estimation problem for the parameters of a semiparametric exponential family distribution. The bipartite graph is assumed to be generated by a semiparametric exponential family distribution whose parametric component is given by the proximity of the outputs of two one-layer neural networks, while the nonparametric (nuisance) component is the base measure. The neural networks take high-dimensional features as inputs and output embedding vectors. In this setting, the representation learning problem is equivalent to recovering the weight matrices. The main challenges of estimation arise from the nonlinearity of the activation functions and the nonparametric nuisance component of the distribution. To overcome these challenges, we propose a pseudo-likelihood objective based on the rank-order decomposition technique and focus on its local geometry. We show that the proposed objective is strongly convex in a neighborhood around the ground truth, so that a gradient descent-based method achieves a linear convergence rate. Moreover, we prove that the sample complexity of the problem is linear in the dimension (up to logarithmic factors), which is consistent with parametric Gaussian models. Our estimator, however, is robust to any model misspecification within the exponential family, which we validate in extensive experiments.
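As a concrete reading of the setup described above (a minimal sketch with assumed notation: the edge weights $A_{ij}$, features $x_i, y_j$, weight matrices $W_1, W_2$, activation $\sigma$, and the choice of an inner product as the "proximity" are illustrative and not necessarily the paper's exact formulation), the generative model can be written as a semiparametric exponential family over edge weights,

\[
p\bigl(A_{ij} \mid x_i, y_j\bigr) \;\propto\; f(A_{ij})\,\exp\!\bigl(A_{ij}\,\Theta_{ij}\bigr),
\qquad
\Theta_{ij} \;=\; \bigl\langle \sigma(W_1 x_i),\, \sigma(W_2 y_j) \bigr\rangle,
\]

where $\sigma(W_1 x_i)$ and $\sigma(W_2 y_j)$ are the embedding vectors produced by the two one-layer networks from the high-dimensional features $x_i$ and $y_j$, and the base measure $f$ is the unknown nonparametric nuisance component. Under this reading, representation learning reduces to estimating $W_1$ and $W_2$ without knowledge of $f$, which the abstract approaches through a rank-order-based pseudo-likelihood that is locally strongly convex around the ground truth.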