Title
Strongly Incremental Constituency Parsing with Graph Neural Networks
Authors
Abstract
Parsing sentences into syntax trees can benefit downstream applications in NLP. Transition-based parsers build trees by executing actions in a state transition system. They are computationally efficient, and can leverage machine learning to predict actions based on partial trees. However, existing transition-based parsers are predominantly based on the shift-reduce transition system, which does not align with how humans are known to parse sentences. Psycholinguistic research suggests that human parsing is strongly incremental: humans grow a single parse tree by adding exactly one token at each step. In this paper, we propose a novel transition system called attach-juxtapose. It is strongly incremental; it represents a partial sentence using a single tree; each action adds exactly one token into the partial tree. Based on our transition system, we develop a strongly incremental parser. At each step, it encodes the partial tree using a graph neural network and predicts an action. We evaluate our parser on Penn Treebank (PTB) and Chinese Treebank (CTB). On PTB, it outperforms existing parsers trained with only constituency trees; and it performs on par with state-of-the-art parsers that use dependency trees as additional training data. On CTB, our parser establishes a new state of the art. Code is available at https://github.com/princeton-vl/attach-juxtapose-parser.
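The strongly incremental property described in the abstract (a single partial tree, exactly one token added per action) can be illustrated with a small sketch. The abstract does not define the two actions, so the semantics below are an assumption made for illustration: `attach` adds the new token as the rightmost child of a node on the rightmost chain, and `juxtapose` inserts a new parent above such a node and places the token beside it. The names `Node`, `rightmost_chain`, `depth`, `parent_label`, and `new_label` are hypothetical and are not taken from the linked repository; the action predictor (the graph neural network) is omitted entirely.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str  # constituent label, or the word itself for a leaf
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None

    def add_child(self, child: "Node") -> None:
        child.parent = self
        self.children.append(child)


def rightmost_chain(root: Node) -> List[Node]:
    """Internal nodes on the path from the root through rightmost children."""
    chain, node = [], root
    while node.children:
        chain.append(node)
        node = node.children[-1]
    return chain


def _wrap(token: str, parent_label: Optional[str]) -> Node:
    """Build the leaf for `token`, optionally under one new internal node."""
    leaf = Node(token)
    if parent_label is None:
        return leaf
    wrapper = Node(parent_label)
    wrapper.add_child(leaf)
    return wrapper


def attach(root: Optional[Node], depth: int, token: str,
           parent_label: Optional[str] = None) -> Node:
    """Assumed semantics: token becomes the rightmost child of a chain node."""
    subtree = _wrap(token, parent_label)
    if root is None:  # first token: the subtree is the whole partial tree
        return subtree
    rightmost_chain(root)[depth].add_child(subtree)
    return root


def juxtapose(root: Node, depth: int, token: str, new_label: str,
              parent_label: Optional[str] = None) -> Node:
    """Assumed semantics: a new node becomes parent of target and token."""
    target = rightmost_chain(root)[depth]
    old_parent = target.parent
    new_node = Node(new_label)
    if old_parent is not None:
        # The target is always a rightmost child, so replace that slot.
        old_parent.children[-1] = new_node
        new_node.parent = old_parent
    new_node.add_child(target)
    new_node.add_child(_wrap(token, parent_label))
    return new_node if old_parent is None else root


def leaves(node: Node) -> List[str]:
    if not node.children:
        return [node.label]
    return [word for child in node.children for word in leaves(child)]


def bracket(node: Node) -> str:
    if not node.children:
        return node.label
    return "(" + node.label + " " + " ".join(bracket(c) for c in node.children) + ")"


tree = attach(None, 0, "She", parent_label="S")
tree = attach(tree, 0, "reads", parent_label="VP")
tree = attach(tree, 1, "books", parent_label="NP")
tree = juxtapose(tree, 1, "quickly", new_label="VP", parent_label="ADVP")
print(bracket(tree))
```

After every call there is still one tree, and its leaf count has grown by exactly one. A shift-reduce parser, by contrast, keeps a stack of separate subtrees and has reduce actions that consume no token.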