Revisiting Modularized Multilingual NMT to Meet Industrial Demands

Paper title

Revisiting Modularized Multilingual NMT to Meet Industrial Demands

Authors

Sungwon Lyu, Bokyung Son, Kichang Yang, Jaekyoung Bae

Abstract

The complete sharing of parameters for multilingual translation (1-1) has been the mainstream approach in current research. However, degraded performance due to the capacity bottleneck and low maintainability hinders its extensive adoption in industry. In this study, we revisit the multilingual neural machine translation model that shares modules only among the same languages (M2) as a practical alternative to 1-1 to satisfy industrial requirements. Through comprehensive experiments, we identify the benefits of multi-way training and demonstrate that the M2 can enjoy these benefits without suffering from the capacity bottleneck. Furthermore, the interlingual space of the M2 allows convenient modification of the model. By leveraging trained modules, we find that incrementally added modules exhibit better performance than singly trained models. The zero-shot performance of the added modules is even comparable to that of supervised models. Our findings suggest that the M2 can be a competent candidate for multilingual translation in industry.
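The difference between the three setups named in the abstract can be illustrated with a small bookkeeping sketch. It is not the authors' implementation: it assumes that M2 keeps one encoder and one decoder per language, that 1-1 uses a single shared encoder and decoder, and that singly trained models use a separate encoder-decoder pair per translation direction. The language codes and all function names are hypothetical.

```python
def m2_modules(languages):
    """Name one encoder and one decoder per language (assumed M2 layout)."""
    encoders = {lang: f"enc_{lang}" for lang in languages}
    decoders = {lang: f"dec_{lang}" for lang in languages}
    return encoders, decoders


def directions(languages):
    """List every ordered (source, target) pair with distinct languages."""
    return [(s, t) for s in languages for t in languages if s != t]


def route(src, tgt, encoders, decoders):
    """Pick the source-language encoder and the target-language decoder."""
    return encoders[src], decoders[tgt]


def module_counts(n_languages):
    """Count encoder and decoder modules needed to cover all directions."""
    n_dirs = n_languages * (n_languages - 1)
    return {
        "single": 2 * n_dirs,      # one encoder and one decoder per direction
        "one_to_one": 2,           # one fully shared encoder and decoder
        "m2": 2 * n_languages,     # one encoder and one decoder per language
    }


if __name__ == "__main__":
    langs = ["en", "ko", "ja"]  # hypothetical language set
    encoders, decoders = m2_modules(langs)
    for src, tgt in directions(langs):
        print(src, "->", tgt, route(src, tgt, encoders, decoders))
    print(module_counts(len(langs)))
```

Under these assumptions, adding a language to an M2-style model means adding one encoder and one decoder, while the number of separately trained single-direction models grows quadratically with the number of languages.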
