Paper Title

The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding

Authors

Liu, Xiaodong, Wang, Yu, Ji, Jianshu, Cheng, Hao, Zhu, Xueyun, Awa, Emmanuel, He, Pengcheng, Chen, Weizhu, Poon, Hoifung, Cao, Guihong, Gao, Jianfeng

Abstract

We present MT-DNN, an open-source natural language understanding (NLU) toolkit that makes it easy for researchers and developers to train customized deep learning models. Built upon PyTorch and Transformers, MT-DNN is designed to facilitate rapid customization for a broad spectrum of NLU tasks, using a variety of objectives (classification, regression, structured prediction) and text encoders (e.g., RNNs, BERT, RoBERTa, UniLM). A unique feature of MT-DNN is its built-in support for robust and transferable learning using the adversarial multi-task learning paradigm. To enable efficient production deployment, MT-DNN supports multi-task knowledge distillation, which can substantially compress a deep neural model without significant performance drop. We demonstrate the effectiveness of MT-DNN on a wide range of NLU applications across general and biomedical domains. The software and pre-trained models will be publicly available at https://github.com/namisan/mt-dnn.
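The multi-task knowledge distillation the abstract mentions rests on the standard soft-target idea: a compact student model is trained to match the softened output distribution of a larger teacher (here, an ensemble of multi-task models). Below is a minimal, framework-free sketch of that loss, assuming single-example classification logits; this is an illustration of the general technique, not MT-DNN's actual implementation, and the function names are ours.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over a list of logits.
    A higher temperature T yields a softer, more informative distribution."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Cross-entropy between the teacher's softened distribution and the
    student's: the student is trained to reproduce the teacher's soft
    targets rather than only the one-hot gold labels."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    return -sum(p * math.log(q) for p, q in zip(p_teacher, p_student))
```

The loss is minimized when the student's distribution matches the teacher's; in practice it is typically combined with the usual hard-label loss, and the compressed student can then be deployed without the teacher.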
