Paper title
Multiscale Graph Neural Networks for Protein Residue Contact Map Prediction
Paper authors
Paper abstract
Machine learning (ML) is revolutionizing protein structural analysis, including the important subproblem of predicting protein residue contact maps, i.e., determining which amino-acid residues are in close spatial proximity given the amino-acid sequence of a protein. Despite recent progress in ML-based protein contact prediction, predicting contacts across a wide range of distances (commonly classified into short-, medium-, and long-range contacts) remains a challenge. Here, we propose a multiscale graph neural network (GNN)-based approach that takes a cue from multiscale physics simulations: a standard pipeline involving a recurrent neural network (RNN) is augmented with three GNNs that refine predictive capability for short-, medium-, and long-range residue contacts, respectively. Test results on the ProteinNet dataset show that the proposed multiscale RNN+GNN approach improves accuracy over the conventional approach for contacts of all ranges, including the most challenging case of long-range contact prediction.
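To make the terms in the abstract concrete, the following is a minimal sketch of how a residue contact map and the short-, medium-, and long-range classes are often computed from known coordinates. It is not the authors' pipeline. The abstract gives no numeric cutoffs, so the 8 Å distance threshold and the sequence-separation bands (6 to 11, 12 to 23, and 24 or more residues) are assumptions based on commonly used conventions, and the function names `contact_map` and `range_masks` are hypothetical.

```python
import numpy as np


def contact_map(coords, threshold=8.0):
    """Return a boolean L x L matrix marking residue pairs closer than `threshold`.

    `coords` is an (L, 3) array with one representative atom per residue.
    The 8.0 angstrom default is an assumed convention, not a value from the abstract.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise difference vectors via broadcasting: shape (L, L, 3).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return dist < threshold


def range_masks(n, short=(6, 11), medium=(12, 23), long_min=24):
    """Return boolean masks selecting pairs by sequence separation |i - j|.

    The band boundaries are assumed conventions, not values from the abstract.
    """
    idx = np.arange(n)
    sep = np.abs(idx[:, None] - idx[None, :])
    return {
        "short": (sep >= short[0]) & (sep <= short[1]),
        "medium": (sep >= medium[0]) & (sep <= medium[1]),
        "long": sep >= long_min,
    }


# Toy example: a fully extended chain with residues spaced 3.8 angstroms apart.
n_residues = 30
toy_coords = np.zeros((n_residues, 3))
toy_coords[:, 0] = 3.8 * np.arange(n_residues)

contacts = contact_map(toy_coords)
masks = range_masks(n_residues)
long_range_contacts = contacts & masks["long"]
```

In this toy chain only near-neighbor residues fall within the cutoff, so every range mask selects zero contacts. A folded protein brings residues that are far apart in sequence close together in space, and those long-range pairs are the ones the abstract describes as the hardest to predict.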