Paper Title
Continual Knowledge Distillation for Neural Machine Translation
Paper Authors
Paper Abstract
While many parallel corpora are not publicly accessible due to data copyright, data privacy, and competitive differentiation reasons, trained translation models are increasingly available on open platforms. In this work, we propose a method called continual knowledge distillation to take advantage of existing translation models to improve a model of interest. The basic idea is to sequentially transfer knowledge from each trained model to the distilled model. Extensive experiments on Chinese-English and German-English datasets show that our method achieves significant and consistent improvements over strong baselines under both homogeneous and heterogeneous trained model settings and is robust to malicious models.
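The abstract only describes the idea at a high level. The snippet below is a minimal PyTorch sketch of one plausible instantiation, assuming a standard token-level KL-divergence distillation loss mixed with cross-entropy; the `student`, `teachers`, `data_loader`, and `ce_weight` names are illustrative placeholders, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, temperature=1.0):
    """Token-level KL divergence between teacher and student output distributions."""
    t_probs = F.softmax(teacher_logits / temperature, dim=-1)
    s_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # KL(teacher || student), averaged over the batch, rescaled by T^2 as usual
    return F.kl_div(s_log_probs, t_probs, reduction="batchmean") * temperature ** 2

def continual_distillation(student, teachers, data_loader, optimizer, ce_weight=0.5):
    """Sequentially transfer knowledge from each trained teacher to the student.

    Hypothetical interfaces: each model maps (src, tgt_in) to vocabulary logits,
    and the data loader yields (src, tgt_in, tgt_out) tensor batches.
    """
    for teacher in teachers:              # one distillation phase per existing model
        teacher.eval()
        for src, tgt_in, tgt_out in data_loader:
            with torch.no_grad():
                teacher_logits = teacher(src, tgt_in)
            student_logits = student(src, tgt_in)
            # Mix the usual translation cross-entropy with the distillation term.
            ce = F.cross_entropy(
                student_logits.view(-1, student_logits.size(-1)), tgt_out.view(-1)
            )
            loss = ce_weight * ce + (1.0 - ce_weight) * kd_loss(student_logits, teacher_logits)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return student
```

Iterating over the teachers one at a time reflects the "sequential transfer" wording of the abstract; how the method weights or filters each teacher's knowledge (e.g., to stay robust to malicious models) is specified in the paper body rather than here.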