Paper title
A point process model for rare event detection
Paper authors
Paper abstract
Detecting rare events, defined as events that have a high impact but a low probability of occurring, is a challenge in a number of domains, including the meteorological, environmental, financial and economic domains. The use of machine learning to detect such events is becoming increasingly popular, since machine learning methods offer an effective and scalable solution when compared to traditional signature-based detection methods. In this work, we begin by undertaking exploratory data analysis, and present techniques that can be used in a framework for employing machine learning methods for rare event detection. Strategies to deal with class imbalance, including the selection of performance metrics, are also discussed. Despite the popularity of conventional machine learning classifiers, we believe their performance could be further improved, since they are agnostic to the natural order over time in which the events occur. Stochastic processes, on the other hand, model sequences of events by exploiting their temporal structure, such as clustering and dependence between the different types of events. We develop a model for classification based on Hawkes processes and apply it to a dataset of e-commerce transactions, obtaining not only better predictive performance but also inferences regarding the temporal dynamics of the data.
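To illustrate the kind of temporal structure a Hawkes process exploits, the following is a minimal sketch of the conditional intensity of a univariate Hawkes process with an exponential kernel, lambda(t) = mu + sum over past events t_i < t of alpha * beta * exp(-beta * (t - t_i)). The exponential kernel, the function name `hawkes_intensity`, and the parameter names `mu`, `alpha` and `beta` are illustrative assumptions; the abstract does not specify the form of the model developed in the paper.

```python
import math


def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Conditional intensity of a univariate Hawkes process at time t.

    Assumed (hypothetical) parameterisation with an exponential kernel:
      mu    - baseline rate of events
      alpha - expected number of follow-on events triggered by one event
      beta  - decay rate of the excitation
    """
    # Each past event raises the intensity, and the effect decays over time.
    excitation = sum(
        alpha * beta * math.exp(-beta * (t - s))
        for s in event_times
        if s < t  # only events strictly before t contribute
    )
    return mu + excitation


if __name__ == "__main__":
    # Hypothetical event times: a cluster near t = 1, then a long quiet period.
    events = [1.0, 1.2, 1.3]
    for t in (0.5, 1.4, 10.0):
        print(t, hawkes_intensity(t, events, mu=0.1, alpha=0.5, beta=2.0))
```

Under these assumptions the intensity is elevated shortly after a cluster of events and decays back towards the baseline `mu`, which is the clustering behaviour that a classifier agnostic to event order cannot use.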