Paper Title

A Tailored Pre-Training Model for Task-Oriented Dialog Generation

Paper Authors

Jing Gu, Qingyang Wu, Chongruo Wu, Weiyan Shi, Zhou Yu

Paper Abstract

The recent success of large pre-trained language models such as BERT and GPT-2 has demonstrated the effectiveness of incorporating language priors into downstream dialog generation tasks. However, the performance of pre-trained models on dialog tasks is still not as good as expected. In this paper, we propose a Pre-trained Role Alternating Language model (PRAL), designed specifically for task-oriented conversational systems. We adopt the architecture of Wu et al. (2019), which models the two speakers separately. We also design several techniques, such as start position randomization, knowledge distillation, and history discount, to improve pre-training performance. We introduce a task-oriented dialog pre-training dataset by cleaning 13 existing datasets. We test PRAL on three different downstream tasks. The results show that PRAL performs better than, or on par with, state-of-the-art methods.
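For readers who want a concrete picture of the training ideas named in the abstract, the sketch below shows one way role-alternating speaker models, history discount, and start position randomization could fit together. This is a minimal illustration, not the authors' released code: the toy `TinyLM` model, the exponential `discount` weighting, and all hyperparameters are assumptions made for exposition.

```python
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyLM(nn.Module):
    """Toy causal LM standing in for a GPT-2-sized speaker model."""
    def __init__(self, vocab_size: int, d_model: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.rnn = nn.GRU(d_model, d_model, batch_first=True)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)  # (batch, seq, vocab) next-token logits

def turn_loss(model, turn):
    """Next-token cross-entropy over a single utterance."""
    logits = model(turn[:, :-1])
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           turn[:, 1:].reshape(-1))

def dialog_loss(user_lm, system_lm, turns, discount=0.9):
    """Role alternation: even-indexed turns are scored by the user model,
    odd-indexed turns by the system model. Earlier turns are down-weighted
    by `discount` (history discount), and training begins at a random turn
    (start position randomization)."""
    start = random.randrange(len(turns))          # start position randomization
    total = torch.zeros(())
    for i in range(start, len(turns)):
        model = user_lm if i % 2 == 0 else system_lm
        weight = discount ** (len(turns) - 1 - i)  # most recent turn gets weight 1
        total = total + weight * turn_loss(model, turns[i])
    return total

# Usage with fake data: a dialog of 4 utterances, each 12 token ids long.
vocab = 100
user_lm, system_lm = TinyLM(vocab), TinyLM(vocab)
turns = [torch.randint(0, vocab, (1, 12)) for _ in range(4)]
dialog_loss(user_lm, system_lm, turns).backward()
```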
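The abstract also lists knowledge distillation among the pre-training techniques. The snippet below shows the standard soft-target distillation loss (a KL divergence between temperature-softened teacher and student distributions); whether PRAL uses exactly this formulation is an assumption, and `temperature` is an illustrative parameter.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-target knowledge distillation: KL divergence between the
    teacher's and the student's temperature-softened token distributions,
    scaled by T^2 so gradient magnitudes stay comparable."""
    t = temperature
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * (t * t)

# Usage: distill a frozen teacher's next-token distribution into the student.
student_logits = torch.randn(8, 100, requires_grad=True)  # (tokens, vocab)
teacher_logits = torch.randn(8, 100)
distillation_loss(student_logits, teacher_logits).backward()
```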
