Paper Title

Generalized Kernel Regularized Least Squares

Authors

Qing Chang and Max Goplerud

Abstract

Kernel Regularized Least Squares (KRLS) is a popular method for flexibly estimating models that may have complex relationships between variables. However, its usefulness to many researchers is limited for two reasons. First, existing approaches are inflexible and do not allow KRLS to be combined with theoretically-motivated extensions such as random effects, unregularized fixed effects, or non-Gaussian outcomes. Second, estimation is extremely computationally intensive for even modestly sized datasets. Our paper addresses both concerns by introducing generalized KRLS (gKRLS). We note that KRLS can be re-formulated as a hierarchical model thereby allowing easy inference and modular model construction where KRLS can be used alongside random effects, splines, and unregularized fixed effects. Computationally, we also implement random sketching to dramatically accelerate estimation while incurring a limited penalty in estimation quality. We demonstrate that gKRLS can be fit on datasets with tens of thousands of observations in under one minute. Further, state-of-the-art techniques that require fitting the model over a dozen times (e.g. meta-learners) can be estimated quickly.
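To make the random-sketching idea concrete, the sketch below is a minimal illustration (not the authors' gKRLS implementation, which is released as an R package) of sub-sampling-based sketched KRLS with a Gaussian kernel: instead of solving the full n-by-n kernel system, we pick m random "sketch" points and solve an m-by-m penalized system. The function names, the bandwidth default, and the small ridge jitter are all illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth):
    """Pairwise Gaussian kernel matrix between rows of A and rows of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / bandwidth)

def sketched_krls(X, y, lam=1.0, m=50, bandwidth=None, seed=0):
    """Fit KRLS using a random sub-sampling sketch of m kernel columns.

    Solves  min_c ||y - K_nm c||^2 + lam * c' K_mm c,
    so only an m x m system is factorized instead of n x n.
    (Illustrative sketch; bandwidth default and jitter are assumptions.)
    """
    n, d = X.shape
    if bandwidth is None:
        bandwidth = d  # a common default: bandwidth tied to #features
    rng = np.random.default_rng(seed)
    idx = rng.choice(n, size=m, replace=False)    # sketch: random subsample
    K_nm = gaussian_kernel(X, X[idx], bandwidth)  # n x m sketched kernel
    K_mm = K_nm[idx]                              # m x m kernel among sketch points
    # Small jitter on the diagonal keeps the system numerically stable.
    c = np.linalg.solve(K_nm.T @ K_nm + lam * K_mm + 1e-8 * np.eye(m),
                        K_nm.T @ y)
    return idx, c, bandwidth

def krls_predict(X_new, X, idx, c, bandwidth):
    """Predict at new points using the fitted sketched coefficients."""
    return gaussian_kernel(X_new, X[idx], bandwidth) @ c
```

The cost of the solve drops from O(n^3) to roughly O(n m^2), which is why sketching makes KRLS feasible on datasets with tens of thousands of observations.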
