Paper Title
Relational Attention: Generalizing Transformers for Graph-Structured Tasks
Paper Authors
Paper Abstract
Transformers flexibly operate over sets of real-valued vectors representing task-specific entities and their attributes, where each vector might encode one word-piece token and its position in a sequence, or some piece of information that carries no position at all. But as set processors, transformers are at a disadvantage in reasoning over more general graph-structured data where nodes represent entities and edges represent relations between entities. To address this shortcoming, we generalize transformer attention to consider and update edge vectors in each transformer layer. We evaluate this relational transformer on a diverse array of graph-structured tasks, including the large and challenging CLRS Algorithmic Reasoning Benchmark. There, it dramatically outperforms state-of-the-art graph neural networks expressly designed to reason over graph-structured data. Our analysis demonstrates that these gains are attributable to relational attention's inherent ability to leverage the greater expressivity of graphs over sets.
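The abstract's core idea is an attention mechanism that both reads and writes a vector for every edge, not just for every node. Below is a minimal single-head sketch of that idea, assuming a dense (n × n) tensor of edge vectors, edge contributions to both attention scores and attended values, and a simple linear edge update; the class name, projections, and update rule are illustrative assumptions, not the paper's exact formulation.

```python
# A minimal single-head sketch (not the paper's exact equations) of attention that
# reads and writes a per-edge vector e_ij alongside the usual node vectors.
# Shapes: node features x are (n, d); edge features e are (n, n, d).
import torch
import torch.nn as nn

class RelationalAttentionSketch(nn.Module):
    def __init__(self, d: int):
        super().__init__()
        # Node-side projections, as in standard self-attention.
        self.q, self.k, self.v = (nn.Linear(d, d) for _ in range(3))
        # Edge-side projections: edges contribute to scores and to values (assumption).
        self.k_edge = nn.Linear(d, d)
        self.v_edge = nn.Linear(d, d)
        # Edge update from the two incident nodes and the previous edge state (assumption).
        self.edge_update = nn.Linear(3 * d, d)
        self.scale = d ** -0.5

    def forward(self, x: torch.Tensor, e: torch.Tensor):
        n, d = x.shape
        q, k, v = self.q(x), self.k(x), self.v(x)           # each (n, d)
        ke, ve = self.k_edge(e), self.v_edge(e)              # each (n, n, d)
        # Score for pair (i, j) mixes the node key with the edge key.
        scores = (q.unsqueeze(1) * (k.unsqueeze(0) + ke)).sum(-1) * self.scale  # (n, n)
        attn = scores.softmax(dim=-1)                        # row i attends over sources j
        # Each attended message carries a node value plus an edge value.
        x_new = torch.einsum('ij,ijd->id', attn, v.unsqueeze(0).expand(n, n, d) + ve)
        # Update every edge vector from its endpoints and its previous state.
        pair = torch.cat([x.unsqueeze(1).expand(n, n, d),
                          x.unsqueeze(0).expand(n, n, d), e], dim=-1)
        e_new = self.edge_update(pair)
        return x_new, e_new
```

The dense (n, n, d) edge tensor mirrors fully-connected attention; on a sparse input graph one would typically initialize non-edges to a learned "no relation" vector or mask their scores before the softmax.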