Ensemble Semi-supervised Entity Alignment via Cycle-teaching

Paper Title

Ensemble Semi-supervised Entity Alignment via Cycle-teaching

Authors

Kexuan Xin, Zequn Sun, Wen Hua, Bing Liu, Wei Hu, Jianfeng Qu, Xiaofang Zhou

Abstract

Entity alignment aims to find identical entities in different knowledge graphs. Although embedding-based entity alignment has recently achieved remarkable progress, insufficient training data remains a critical challenge. Conventional semi-supervised methods also suffer from incorrect entity alignment in the newly proposed training data. To resolve these issues, we design an iterative cycle-teaching framework for semi-supervised entity alignment. The key idea is to train multiple entity alignment models (called aligners) simultaneously and let each aligner iteratively teach its successor the new entity alignment it proposes. We propose a diversity-aware alignment selection method to choose reliable entity alignment for each aligner. We also design a conflict resolution mechanism to resolve alignment conflicts when combining the new alignment of an aligner with that from its teacher. In addition, considering the influence of the cycle-teaching order, we design a strategy to arrange the order that maximizes the overall performance of multiple aligners. The cycle-teaching process can break the limitations of each model's learning capability and reduce the noise in new training data, leading to improved performance. Extensive experiments on benchmark datasets demonstrate the effectiveness of the proposed cycle-teaching framework, which significantly outperforms state-of-the-art models when the training data is insufficient and the new entity alignment is noisy.
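
The teaching ring described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the names `Proposals`, `resolve_conflicts`, and `cycle_teach` are hypothetical, and the "higher confidence wins" rule is an assumed stand-in for the paper's conflict resolution mechanism. The diversity-aware selection and order arrangement steps are not modeled here.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a proposal set maps a source entity
# to (target entity, confidence score).
Proposals = Dict[str, Tuple[str, float]]


def resolve_conflicts(own: Proposals, taught: Proposals) -> Proposals:
    """Merge an aligner's own proposals with those from its teacher.

    Assumption: when both propose a target for the same source entity,
    the pair with the strictly higher confidence is kept. This is a
    placeholder rule, not the mechanism from the paper.
    """
    merged = dict(own)
    for src, (tgt, score) in taught.items():
        if src not in merged or score > merged[src][1]:
            merged[src] = (tgt, score)
    return merged


def cycle_teach(proposals: List[Proposals]) -> List[Proposals]:
    """Run one teaching round over a ring of k aligners.

    Aligner i receives the proposals of its teacher, aligner (i - 1) % k,
    so every aligner teaches exactly one successor.
    """
    k = len(proposals)
    return [
        resolve_conflicts(proposals[i], proposals[(i - 1) % k])
        for i in range(k)
    ]


if __name__ == "__main__":
    ring = [
        {"a": ("x", 0.9)},
        {"a": ("y", 0.6), "b": ("z", 0.8)},
        {"b": ("w", 0.5)},
    ]
    for i, merged in enumerate(cycle_teach(ring)):
        print(i, merged)
```

In a full semi-supervised loop, each aligner would presumably be retrained on its merged set before the next round; the sketch only covers the exchange step and ignores one-to-one constraints on target entities.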
