Paper title
Canonical Correlation Analysis in high dimensions with structured regularization
Authors
Abstract
Canonical correlation analysis (CCA) is a technique for measuring the association between two multivariate data matrices. Regularized canonical correlation analysis (RCCA), a modification that imposes an $\ell_2$ penalty on the CCA coefficients, is widely used in applications with high-dimensional data. One limitation of this regularization is that it ignores any structure in the data and treats all features equally, which can be ill-suited to some applications. In this paper we introduce several approaches to regularizing CCA that take the underlying data structure into account. In particular, the proposed group regularized canonical correlation analysis (GRCCA) is useful when the variables are correlated in groups. We illustrate some computational strategies that avoid excessive computation with regularized CCA in high dimensions. We demonstrate these methods on our motivating application from neuroscience, as well as on a small simulation example.
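For readers unfamiliar with the $\ell_2$-penalized CCA that the abstract takes as its starting point, the following is a minimal sketch of the generic ridge formulation: the penalties are added to the diagonals of the two sample covariance matrices, and the canonical pairs are read off a singular value decomposition of the whitened cross-covariance. The function names `rcca` and `inv_sqrt` and the parameters `lam_x` and `lam_y` are illustrative assumptions, not taken from the paper. The sketch forms the full covariance matrices directly, so it does not represent the paper's computational strategies for high dimensions, nor the group-structured GRCCA penalty.

```python
import numpy as np

def inv_sqrt(mat):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    # Scale each eigenvector column by 1/sqrt(eigenvalue), then rotate back.
    return (vecs / np.sqrt(vals)) @ vecs.T

def rcca(X, Y, lam_x, lam_y):
    """Ridge-regularized CCA (illustrative sketch; names are hypothetical).

    X: (n, p) array, Y: (n, q) array, lam_x, lam_y > 0 are the l2 penalties.
    Returns coefficient matrices A (p, k), B (q, k) and the k = min(p, q)
    regularized canonical correlations in decreasing order.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Penalized covariance matrices: the l2 penalty shifts the diagonal.
    Sxx = Xc.T @ Xc / n + lam_x * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / n + lam_y * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / n
    Wx = inv_sqrt(Sxx)
    Wy = inv_sqrt(Syy)
    # Singular vectors of the whitened cross-covariance give the solution.
    U, s, Vt = np.linalg.svd(Wx @ Sxy @ Wy, full_matrices=False)
    return Wx @ U, Wy @ Vt.T, s

# Toy high-dimensional data: more features than observations.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 50))
Y = rng.standard_normal((20, 30))
A, B, s = rcca(X, Y, lam_x=0.1, lam_y=0.1)
```

With both penalties strictly positive, the shifted covariance matrices are invertible even when the number of features exceeds the number of observations, which is the situation in which unregularized CCA breaks down. Every feature receives the same penalty here, which is exactly the structure-blind behaviour the abstract identifies as a limitation.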