Paper Title
Combining Label Propagation and Simple Models Out-performs Graph Neural Networks
Paper Authors
Paper Abstract
Graph Neural Networks (GNNs) are the predominant technique for learning over graphs. However, there is relatively little understanding of why GNNs are successful in practice and whether they are necessary for good performance. Here, we show that for many standard transductive node classification benchmarks, we can exceed or match the performance of state-of-the-art GNNs by combining shallow models that ignore the graph structure with two simple post-processing steps that exploit correlation in the label structure: (i) an "error correlation" that spreads residual errors in training data to correct errors in test data and (ii) a "prediction correlation" that smooths the predictions on the test data. We call this overall procedure Correct and Smooth (C&S), and the post-processing steps are implemented via simple modifications to standard label propagation techniques from early graph-based semi-supervised learning methods. Our approach exceeds or nearly matches the performance of state-of-the-art GNNs on a wide variety of benchmarks, with just a small fraction of the parameters and orders of magnitude faster runtime. For instance, we exceed the best known GNN performance on the OGB-Products dataset with 137 times fewer parameters and greater than 100 times less training time. The performance of our methods highlights how directly incorporating label information into the learning algorithm (as was done in traditional techniques) yields easy and substantial performance gains. We can also incorporate our techniques into big GNN models, providing modest gains. Our code for the OGB results is at https://github.com/Chillee/CorrectAndSmooth.
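For readers who want to see what the two post-processing steps look like concretely, here is a minimal NumPy/SciPy sketch of the C&S recipe described in the abstract. It assumes a symmetric adjacency matrix, one-hot training labels, and a dense matrix of soft predictions from a graph-agnostic base model; the function names, the fixed-point iteration count, and the hyperparameters (`alpha1`, `alpha2`, `scale`) are illustrative choices, not the paper's exact configuration. The authors' reference implementation is at the linked repository.

```python
import numpy as np
import scipy.sparse as sp

def normalized_adjacency(adj):
    """Build S = D^{-1/2} A D^{-1/2} from a symmetric sparse adjacency A."""
    deg = np.asarray(adj.sum(axis=1)).ravel().astype(float)
    d_inv_sqrt = np.zeros_like(deg)
    d_inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    D = sp.diags(d_inv_sqrt)
    return D @ adj @ D

def propagate(S, init, alpha, num_iters=50):
    """Standard label propagation: iterate H <- (1 - alpha) * init + alpha * S @ H."""
    H = init.copy()
    for _ in range(num_iters):
        H = (1.0 - alpha) * init + alpha * (S @ H)
    return H

def correct_and_smooth(S, soft_preds, y_onehot, train_idx,
                       alpha1=0.8, alpha2=0.8, scale=1.0):
    """Post-process base predictions with the two steps from the abstract.

    soft_preds: [n, c] soft predictions from a shallow, graph-ignoring model.
    y_onehot:   [n, c] one-hot labels (only rows in train_idx are used).
    """
    # "Correct": residual errors are known on training nodes; spread them
    # over the graph so correlated errors on test nodes get compensated.
    E = np.zeros_like(soft_preds)
    E[train_idx] = y_onehot[train_idx] - soft_preds[train_idx]
    Z = soft_preds + scale * propagate(S, E, alpha1)
    # "Smooth": clamp training nodes to their true labels, then propagate
    # so that predictions on neighboring nodes agree with each other.
    G = Z.copy()
    G[train_idx] = y_onehot[train_idx]
    return propagate(S, G, alpha2)
```

Note that both propagation passes have no learned parameters, which is one way to see where the claimed parameter and runtime savings come from: all graph structure enters only through these sparse matrix products applied on top of a cheap shallow model such as an MLP.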