Paper Title

CHEF: A Pilot Chinese Dataset for Evidence-Based Fact-Checking

Authors

Xuming Hu, Zhijiang Guo, Guanyu Wu, Aiwei Liu, Lijie Wen, Philip S. Yu

Abstract

The explosion of misinformation spreading through the media ecosystem calls for automated fact-checking. While misinformation spans both geographic and linguistic boundaries, most work in the field has focused on English. Datasets and tools available in other languages, such as Chinese, are limited. To bridge this gap, we construct CHEF, the first CHinese Evidence-based Fact-checking dataset, consisting of 10K real-world claims. The dataset covers multiple domains, ranging from politics to public health, and provides annotated evidence retrieved from the Internet. Further, we develop established baselines and a novel approach that models evidence retrieval as a latent variable, allowing it to be trained jointly with the veracity prediction model in an end-to-end fashion. Extensive experiments show that CHEF will provide a challenging testbed for the development of fact-checking systems designed to retrieve and reason over non-English claims.
