Gollum: A Gold Standard for Large Scale Multi Source Knowledge Graph Matching

Paper title

Gollum: A Gold Standard for Large Scale Multi Source Knowledge Graph Matching

Authors

Sven Hertling, Heiko Paulheim

Abstract

The number of Knowledge Graphs (KGs) generated with automatic and manual approaches is constantly growing. For an integrated view and usage, an alignment between these KGs is necessary on the schema as well as instance level. While there are approaches that try to tackle this multi source knowledge graph matching problem, large gold standards are missing to evaluate their effectiveness and scalability. We close this gap by presenting Gollum -- a gold standard for large-scale multi source knowledge graph matching with over 275,000 correspondences between 4,149 different KGs. They originate from knowledge graphs derived by applying the DBpedia extraction framework to a large wiki farm. Three variations of the gold standard are made available: (1) a version with all correspondences for evaluating unsupervised matching approaches, and two versions for evaluating supervised matching: (2) one where each KG is contained both in the train and test set, and (3) one where each KG is exclusively contained in the train or the test set.
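The difference between the two supervised variants can be illustrated with a small sketch. It is not the authors' procedure: the `Correspondence` record, both function names, the `test_fraction` parameter, and the choice to drop correspondences that cross the KG partition are assumptions made only for illustration.

```python
import random
from typing import List, NamedTuple, Tuple


class Correspondence(NamedTuple):
    # Hypothetical record: one matched entity pair between two KGs.
    source_kg: str
    source_entity: str
    target_kg: str
    target_entity: str


Split = Tuple[List[Correspondence], List[Correspondence]]


def split_by_correspondence(corrs, test_fraction=0.2, seed=0) -> Split:
    """Variant (2) style: split individual correspondences at random.

    The same KG is allowed to show up in both train and test
    (a plain shuffle permits this but does not guarantee it).
    """
    rng = random.Random(seed)
    shuffled = list(corrs)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def split_by_kg(corrs, test_fraction=0.2, seed=0) -> Split:
    """Variant (3) style: partition the KGs themselves.

    Assumption: a correspondence whose two KGs land on different
    sides of the partition is dropped, so no KG leaks across sets.
    """
    rng = random.Random(seed)
    kgs = sorted({c.source_kg for c in corrs} | {c.target_kg for c in corrs})
    rng.shuffle(kgs)
    test_kgs = set(kgs[: int(len(kgs) * test_fraction)])

    train, test = [], []
    for c in corrs:
        sides = (c.source_kg in test_kgs, c.target_kg in test_kgs)
        if all(sides):
            test.append(c)
        elif not any(sides):
            train.append(c)
    return train, test
```

A KG-level split is the stricter setting: a supervised matcher evaluated this way has seen no correspondence involving any test KG during training.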
