Paper Title
Language Models of Code are Few-Shot Commonsense Learners
Paper Authors
Aman Madaan, Shuyan Zhou, Uri Alon, Yiming Yang, Graham Neubig
Paper Abstract
We address the general task of structured commonsense reasoning: given a natural language input, the goal is to generate a graph such as an event graph or a reasoning graph. To employ large language models (LMs) for this task, existing approaches "serialize" the output graph as a flat list of nodes and edges. Although feasible, these serialized graphs deviate strongly from the natural language corpora that LMs were pre-trained on, hindering LMs from generating them correctly. In this paper, we show that when we instead frame structured commonsense reasoning tasks as code generation tasks, pre-trained LMs of code are better structured commonsense reasoners than LMs of natural language, even when the downstream task does not involve source code at all. We demonstrate our approach across three diverse structured commonsense reasoning tasks. In all these natural language tasks, we show that, using our approach, a code generation LM (CODEX) outperforms natural language LMs that are fine-tuned on the target task (e.g., T5) as well as other strong LMs such as GPT-3 in the few-shot setting.
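To make the contrast in the abstract concrete, the minimal sketch below renders one small event graph both ways: as the flat node-and-edge serialization used by prior approaches, and as a short Python class of the kind a code LM could be asked to complete in a few-shot prompt. The example graph, the function names, and the class layout are hypothetical illustrations, not taken from the authors' implementation.

# Hypothetical sketch of the two framings the abstract describes.
# The graph, function names, and class layout are illustrative only.

graph = {
    "goal": "bake a cake",
    "steps": ["gather ingredients", "mix batter", "bake in oven"],
    "edges": [("gather ingredients", "mix batter"),
              ("mix batter", "bake in oven")],
}

def serialize_flat(g):
    # Baseline framing: the graph flattened into a node/edge string,
    # the format a natural language LM is asked to emit.
    nodes = "; ".join(g["steps"])
    edges = "; ".join(f"{a} -> {b}" for a, b in g["edges"])
    return f"goal: {g['goal']} | nodes: {nodes} | edges: {edges}"

def render_as_code(g):
    # Code framing: the same graph written as a small Python class,
    # a format far closer to what a code LM was pre-trained on.
    lines = ["class BakeACake:", f"    goal = \"{g['goal']}\""]
    for i, step in enumerate(g["steps"]):
        lines.append(f"    step{i} = \"{step}\"")
    for a, b in g["edges"]:
        lines.append(f"    # \"{a}\" happens before \"{b}\"")
    return "\n".join(lines)

print(serialize_flat(graph))
print()
print(render_as_code(graph))

In the few-shot setting the abstract evaluates, a handful of such code-rendered examples would be concatenated into a prompt and the code LM asked to complete the class for a new input; the Python rendering serves only as an output format, and no generated code is ever executed.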