Learning Hierarchical Protein Representations via Complete 3D Graph Networks

Paper Title

Learning Hierarchical Protein Representations via Complete 3D Graph Networks

Authors

Limei Wang, Haoran Liu, Yi Liu, Jerry Kurtin, Shuiwang Ji

Abstract

We consider representation learning for proteins with 3D structures. We build 3D graphs based on protein structures and develop graph networks to learn their representations. Depending on the level of detail that we wish to capture, protein representations can be computed at different levels, e.g., the amino acid, backbone, or all-atom levels. Importantly, there exist hierarchical relations among different levels. In this work, we propose to develop a novel hierarchical graph network, known as ProNet, to capture the relations. Our ProNet is very flexible and can be used to compute protein representations at different levels of granularity. By treating each amino acid as a node in graph modeling as well as harnessing the inherent hierarchies, our ProNet is more effective and efficient than existing methods. We also show that, given a base 3D graph network that is complete, our ProNet representations are also complete at all levels. Experimental results show that ProNet outperforms recent methods on most datasets. In addition, results indicate that different downstream tasks may require representations at different levels. Our code is publicly available as part of the DIG library (https://github.com/divelab/DIG).
