Paper Title
SPT: Semi-Parametric Prompt Tuning for Multitask Prompted Learning
Paper Authors
Paper Abstract
Pre-trained large language models can efficiently interpolate human-written prompts in a natural way. Multitask prompted learning can help generalization through a diverse set of tasks at once, thus enhancing the potential for more effective downstream fine-tuning. To perform efficient multitask inference in the same batch, parameter-efficient fine-tuning methods such as prompt tuning have been proposed. However, existing prompt tuning methods may lack generalization. We propose SPT, a semi-parametric prompt tuning method for multitask prompted learning. The novel component of SPT is a memory bank from which memory prompts are retrieved based on discrete prompts. Extensive experiments, such as (i) fine-tuning a full language model with SPT on 31 different tasks from 8 different domains and evaluating zero-shot generalization on 9 heldout datasets under 5 NLP task categories, and (ii) pretraining SPT on the GLUE datasets and evaluating fine-tuning on the SuperGLUE datasets, demonstrate the effectiveness of SPT.
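To make the memory-bank idea from the abstract concrete, below is a minimal sketch of how soft "memory prompts" could be retrieved from a learnable key-value bank, keyed by an embedding of the task's discrete prompt. The class name `MemoryPromptRetriever`, the tensor shapes, and the soft attention-weighted retrieval are illustrative assumptions for exposition only, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class MemoryPromptRetriever(nn.Module):
    """Illustrative sketch: retrieve soft memory prompts from a memory bank,
    keyed by the representation of a discrete (textual) prompt.
    Names and shapes are assumptions, not taken from the SPT codebase."""

    def __init__(self, num_memories: int, prompt_len: int, hidden_dim: int):
        super().__init__()
        # Each memory slot has a key vector and a soft prompt of shape (prompt_len, hidden_dim).
        self.keys = nn.Parameter(torch.randn(num_memories, hidden_dim))
        self.values = nn.Parameter(torch.randn(num_memories, prompt_len, hidden_dim))

    def forward(self, discrete_prompt_emb: torch.Tensor) -> torch.Tensor:
        # discrete_prompt_emb: (batch, hidden_dim), e.g. the mean embedding of the
        # discrete prompt tokens produced by the frozen language model.
        scores = discrete_prompt_emb @ self.keys.T      # (batch, num_memories)
        weights = torch.softmax(scores, dim=-1)         # soft retrieval weights
        # Weighted mixture of memory prompts: (batch, prompt_len, hidden_dim).
        return torch.einsum("bm,mld->bld", weights, self.values)

# Usage sketch: the retrieved prompt would be prepended to the input embeddings
# before they are fed to the (frozen or fine-tuned) language model.
retriever = MemoryPromptRetriever(num_memories=16, prompt_len=10, hidden_dim=768)
query = torch.randn(4, 768)            # embeddings of 4 discrete prompts
memory_prompts = retriever(query)      # (4, 10, 768)
```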