论文标题
Pagerank算法的连续限制
A continuum limit for the PageRank algorithm
论文作者
论文摘要
半监督和无监督的机器学习方法通常依靠图来建模数据,从而促使对操作员在图形上的理论特性如何利用在学习问题中的研究。尽管大多数现有文献都集中在无向图上,但在实践中,有向图非常重要,为物理,生物或运输网络提供了模型,以及许多其他应用。在本文中,我们提出了一个新的框架,用于严格研究有向图的学习算法的连续限制。我们使用新框架来研究Pagerank算法,并展示如何将其解释为涉及归一化图Laplacian类型的有向图上的数值方案。我们表明,将相应的连续性限制问题视为网页的数量增长到无穷大,是二阶,可能是退化的椭圆方程,其中包含反应,扩散和对流项。我们证明,数值方案是一致且稳定的,并且计算离散溶液与连续限量PDE解决方案的明确收敛速率。我们将PageRank Vector的稳定性和渐近规则性提供了应用。最后,我们通过数值实验说明了我们的结果,并探索了数据深度的应用。
Semi-supervised and unsupervised machine learning methods often rely on graphs to model data, prompting research on how theoretical properties of operators on graphs are leveraged in learning problems. While most of the existing literature focuses on undirected graphs, directed graphs are very important in practice, giving models for physical, biological, or transportation networks, among many other applications. In this paper, we propose a new framework for rigorously studying continuum limits of learning algorithms on directed graphs. We use the new framework to study the PageRank algorithm, and show how it can be interpreted as a numerical scheme on a directed graph involving a type of normalized graph Laplacian. We show that the corresponding continuum limit problem, which is taken as the number of webpages grows to infinity, is a second-order, possibly degenerate, elliptic equation that contains reaction, diffusion, and advection terms. We prove that the numerical scheme is consistent and stable and compute explicit rates of convergence of the discrete solution to the solution of the continuum limit PDE. We give applications to proving stability and asymptotic regularity of the PageRank vector. Finally, we illustrate our results with numerical experiments and explore an application to data depth.