论文标题
计数数据的张量分解,以利用随机和确定性优化
Tensor Decompositions for Count Data that Leverage Stochastic and Deterministic Optimization
论文作者
论文摘要
将低级矩阵分解延伸到多路阵列或张量的兴趣越来越大。一个基本的低量张量分解是规范多核分解(CPD)。将低级别的非负CPD模型拟合到泊松分布的计数数据的挑战是特别感兴趣的。几种流行的算法使用局部搜索方法来近似Poisson CPD模型的最大似然估计器(MLE)。这项工作提出了两种新算法,这些算法扩展了Poisson CPD的最新本地方法。混合GCP-CPAPR将广义的规范分解(GCP)与随机优化和CP交流泊松回归(CPAPR)(一种确定性算法)相结合,以增加单独使用的MLE收敛到MLE的可能性。使用SVDROP重新启动的CPAPR使用基于CPD模型展开的单数值的启发式措施,以识别与优化问题可行域内的不是MLE和重新启动的优化器的收敛,从而在使用多启动策略时降低了整体计算问题。我们提供的经验证据表明我们的方法在融合泊松CPD MLE方面表现优于现有方法。
There is growing interest to extend low-rank matrix decompositions to multi-way arrays, or tensors. One fundamental low-rank tensor decomposition is the canonical polyadic decomposition (CPD). The challenge of fitting a low-rank, nonnegative CPD model to Poisson-distributed count data is of particular interest. Several popular algorithms use local search methods to approximate the maximum likelihood estimator (MLE) of the Poisson CPD model. This work presents two new algorithms that extend state-of-the-art local methods for Poisson CPD. Hybrid GCP-CPAPR combines Generalized Canonical Decomposition (GCP) with stochastic optimization and CP Alternating Poisson Regression (CPAPR), a deterministic algorithm, to increase the probability of converging to the MLE over either method used alone. Restarted CPAPR with SVDrop uses a heuristic based on the singular values of the CPD model unfoldings to identify convergence toward optimizers that are not the MLE and restarts within the feasible domain of the optimization problem, thus reducing overall computational cost when using a multi-start strategy. We provide empirical evidence that indicates our approaches outperform existing methods with respect to converging to the Poisson CPD MLE.