Paper Title
DDG-DA: Data Distribution Generation for Predictable Concept Drift Adaptation
Paper Authors
Paper Abstract
In many real-world scenarios, we deal with streaming data that is collected sequentially over time. Because the environment is non-stationary, the distribution of the streaming data may change in unpredictable ways, a phenomenon known as concept drift. To handle concept drift, previous methods first detect when/where the drift happens and then adapt models to fit the distribution of the latest data. However, in many cases some underlying factors of the environment's evolution are predictable, which makes it possible to model the future concept drift trend of the streaming data; such cases have not been fully explored in previous work. In this paper, we propose a novel method, DDG-DA, that can effectively forecast the evolution of the data distribution and improve the performance of models. Specifically, we first train a predictor to estimate the future data distribution, then leverage it to generate training samples, and finally train models on the generated data. We conduct experiments on three real-world tasks (forecasting stock price trend, electricity load, and solar irradiance) and obtain significant improvements on multiple widely used models.
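The three steps named in the abstract (forecast the future distribution, generate training samples, train on the generated data) can be illustrated with a toy sketch. This is not the authors' implementation. Everything in it is an assumption made for illustration: the synthetic stream whose slope repeats every four periods, the use of a per-period least-squares slope as a one-number summary of the distribution, the seasonal-naive forecaster, and the similarity-weighted resampling of historical periods. All function and variable names are hypothetical.

```python
import random


def make_period(slope, n, rng):
    """One period of toy stream data: y = slope * x + small Gaussian noise."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, slope * x + rng.gauss(0.0, 0.05)) for x in xs]


def fit_slope(samples):
    """Least-squares slope of a line through the origin."""
    num = sum(x * y for x, y in samples)
    den = sum(x * x for x, _ in samples)
    return num / den


def mse(slope, samples):
    """Mean squared error of the model y = slope * x on the given samples."""
    return sum((y - slope * x) ** 2 for x, y in samples) / len(samples)


rng = random.Random(0)
CYCLE = 4
true_slopes = [1.0, 2.0, 3.0, 2.0]  # assumed drift pattern, repeats every CYCLE periods
history = [make_period(true_slopes[t % CYCLE], 200, rng) for t in range(12)]
future = make_period(true_slopes[12 % CYCLE], 200, rng)  # unseen next period

# Step 1 (assumption): summarize each past period by its fitted slope and
# forecast the next period's summary with a seasonal-naive rule, i.e. reuse
# the summary observed one full cycle earlier.
stats = [fit_slope(period) for period in history]
predicted_stat = stats[-CYCLE]

# Step 2 (assumption): generate training samples by resampling history,
# weighting each period by how close its summary is to the forecast.
weights = [1.0 / (1e-3 + abs(s - predicted_stat)) for s in stats]
period_ids = rng.choices(range(len(history)), weights=weights, k=500)
generated = [rng.choice(history[i]) for i in period_ids]

# Step 3: train the model on the generated data.
adapted_slope = fit_slope(generated)

# Baseline: fit only the most recent period, as a "latest data" method would.
baseline_slope = fit_slope(history[-1])

adapted_mse = mse(adapted_slope, future)
baseline_mse = mse(baseline_slope, future)
print("adapted MSE on next period: ", adapted_mse)
print("baseline MSE on next period:", baseline_mse)
```

In this toy setting the drift is predictable by construction, so the model trained on the forecast-guided samples fits the next period better than the one trained only on the latest period. The sketch says nothing about how large the gain is on the paper's real-world tasks.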