Paper title
Intrinsic persistent homology via density-based metric learning
Paper authors
Paper abstract
We address the problem of estimating topological features from data in high-dimensional Euclidean spaces under the manifold assumption. Our approach is based on computing the persistent homology of the space of data points endowed with a sample metric known as the Fermat distance. We prove that this metric space converges almost surely to the manifold itself, endowed with an intrinsic metric that accounts for both the geometry of the manifold and the density that produces the sample. This fact implies the convergence of the associated persistence diagrams. Using this intrinsic distance when computing persistent homology has advantageous properties, such as robustness to outliers in the input data and lower sensitivity to the particular embedding of the underlying manifold in the ambient space. We use these ideas to propose and implement a method for pattern recognition and anomaly detection in time series, which is evaluated in applications to real data.
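As an illustration of the sample metric mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the commonly used definition of the sample Fermat distance: the minimum, over paths that hop between sample points, of the sum of Euclidean hop lengths raised to a power p. The function name `fermat_distance_matrix`, the parameter `p`, and the toy point set are hypothetical choices made for this example; the abstract does not specify them.

```python
import math

def fermat_distance_matrix(points, p=2.0):
    """All-pairs sample Fermat distance over a finite point cloud.

    Assumption (not stated in the abstract): the distance between two
    sample points is the minimum over paths through the sample of the
    sum of Euclidean hop lengths, each raised to the power p.
    """
    n = len(points)
    # Cost of a direct hop between every pair: Euclidean distance ** p.
    d = [[math.dist(a, b) ** p for b in points] for a in points]
    # Floyd-Warshall shortest paths; the intermediate index k is outermost.
    for k in range(n):
        for i in range(n):
            dik = d[i][k]
            for j in range(n):
                via = dik + d[k][j]
                if via < d[i][j]:
                    d[i][j] = via
    return d

# Toy example: three collinear points, one unit apart.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
D = fermat_distance_matrix(points, p=2.0)
# The direct hop from the first to the last point costs 2**2 = 4, while
# going through the middle point costs 1 + 1 = 2, so the path through
# the sample is preferred.
print(D[0][2])
```

With p = 1 the construction reduces to the ordinary Euclidean distance; larger values of p penalize long hops, so shortest paths tend to pass through densely sampled regions. The resulting matrix could then be passed to any persistent homology tool that accepts a precomputed distance matrix. The cubic-time loop is chosen here for clarity only and would not scale to large samples.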