A Benchmark Arabic Dataset for Commonsense Explanation

Paper title

A Benchmark Arabic Dataset for Commonsense Explanation

Authors

Saja AL-Tawalbeh, Mohammad AL-Smadi

Abstract

Language comprehension and commonsense knowledge validation by machines are challenging tasks that remain under-researched and under-evaluated for Arabic text. In this paper, we present a benchmark Arabic dataset for commonsense explanation. The dataset consists of Arabic sentences that do not make sense, each paired with three choices from which to select the one that explains why the sentence is false. Furthermore, this paper presents baseline results to assist and encourage the future evaluation of research in this field. The dataset is distributed under the Creative Commons CC-BY-SA 4.0 license and can be found on GitHub.
