SLK-NER: Exploiting Second-order Lexicon Knowledge for Chinese NER

Paper Title

SLK-NER: Exploiting Second-order Lexicon Knowledge for Chinese NER

Authors

Dou Hu, Lingwei Wei

Abstract

Although character-based models that use a lexicon have achieved promising results on the Chinese named entity recognition (NER) task, some lexical words introduce erroneous information because they are wrongly matched. Existing studies have proposed many strategies for integrating lexicon knowledge. However, they either rely on simple first-order lexicon knowledge, which provides insufficient word information and still suffers from boundary conflicts among matched words, or explore lexicon knowledge with a graph, where higher-order information can introduce negative words that disturb identification. To alleviate these limitations, we present a new insight into the second-order lexicon knowledge (SLK) of each character in a sentence, which provides more lexical word information, including semantic and word boundary features. Based on this, we propose an SLK-based model with a novel strategy for integrating this lexicon knowledge. The proposed model can exploit more discernible lexical word information with the help of global context. Experimental results on three public datasets demonstrate the validity of SLK, and the proposed model outperforms the state-of-the-art comparison methods.

