Transfer Learning with Large-Scale Quantile Regression

Paper Title

Transfer Learning with Large-Scale Quantile Regression

Authors

Jun Jin, Jun Yan, Robert H. Aseltine, Kun Chen

Abstract

Quantile regression is increasingly encountered in modern big data applications due to its robustness and flexibility. We consider the scenario of learning the conditional quantiles of a specific target population when the available data may go beyond the target and be supplemented from other sources that possibly share similarities with the target. A crucial question is how to properly distinguish and utilize useful information from other sources to improve the quantile estimation and inference at the target. We develop transfer learning methods for high-dimensional quantile regression by detecting informative sources whose models are similar to the target and utilizing them to improve the target model. We show that under reasonable conditions, the detection of the informative sources based on sample splitting is consistent. Compared to the naive estimator with only the target data, the transfer learning estimator achieves a much lower error rate as a function of the sample sizes, the signal-to-noise ratios, and the similarity measures among the target and the source models. Extensive simulation studies demonstrate the superiority of our proposed approach. We apply our methods to tackle the problem of detecting hard-landing risk for flight safety and show the benefits and insights gained from transfer learning of three different types of airplanes: Boeing 737, Airbus A320, and Airbus A380.
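To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It shows the check (pinball) loss that defines a quantile fit, and a toy version of detecting informative sources by sample splitting. Everything here is an illustrative assumption: the function names `pinball_loss` and `detect_informative`, the tolerance `tol`, and above all the use of intercept-only models (a plain sample quantile) in place of the high-dimensional penalized regression the paper studies.

```python
import numpy as np

def pinball_loss(y, pred, tau):
    """Average check (pinball) loss of prediction `pred` at quantile level `tau`."""
    r = np.asarray(y, dtype=float) - pred
    return float(np.mean(np.maximum(tau * r, (tau - 1.0) * r)))

def detect_informative(target, sources, tau=0.5, tol=0.05, seed=0):
    """Toy source detection by sample splitting (hypothetical helper).

    Fit on one half of the target, evaluate on the other half. A source is
    flagged informative if pooling it with the fitting half does not increase
    the held-out pinball loss by more than `tol` (an assumed threshold).
    """
    rng = np.random.default_rng(seed)
    target = np.asarray(target, dtype=float)
    idx = rng.permutation(len(target))
    half = len(target) // 2
    fit_part, eval_part = target[idx[:half]], target[idx[half:]]

    # Baseline: intercept-only quantile "model" fitted on target data alone.
    base = pinball_loss(eval_part, np.quantile(fit_part, tau), tau)

    flags = []
    for src in sources:
        pooled = np.concatenate([fit_part, np.asarray(src, dtype=float)])
        loss = pinball_loss(eval_part, np.quantile(pooled, tau), tau)
        flags.append(bool(loss <= base + tol))
    return flags

# Synthetic data (assumed for illustration): one similar and one shifted source.
rng = np.random.default_rng(42)
target = rng.normal(0.0, 1.0, 200)
similar = rng.normal(0.0, 1.0, 2000)
shifted = rng.normal(5.0, 1.0, 2000)
flags = detect_informative(target, [similar, shifted])
```

Holding out half of the target keeps the comparison honest: a source is judged only on target data that played no part in the fit, which is the role sample splitting plays in the detection step described above.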
