Paper Title
Self-Guided Noise-Free Data Generation for Efficient Zero-Shot Learning
Paper Authors
Paper Abstract
There is a rising interest in further exploring the zero-shot learning potential of large pre-trained language models (PLMs). A new paradigm called data-generation-based zero-shot learning has achieved impressive success. In this paradigm, the data synthesized by the PLM acts as the carrier of knowledge and is used to train a task-specific model with orders of magnitude fewer parameters than the PLM, achieving both higher performance and efficiency than prompt-based zero-shot learning methods on PLMs. The main hurdle of this approach is that the data synthesized by the PLM usually contains a significant portion of low-quality samples. Fitting on such data greatly hampers the performance of the task-specific model, making it unreliable for deployment. Previous methods remedy this issue mainly by filtering synthetic data using heuristic metrics (e.g., output confidence) or by refining the data with the help of human experts, which requires excessive manual tuning or incurs expensive costs. In this paper, we propose SunGen, a novel noise-robust re-weighting framework that automatically constructs high-quality data for zero-shot classification problems. Our framework learns sample weights indicating data quality without requiring any human annotation. We verify both theoretically and empirically that our method helps construct good-quality synthetic datasets. Notably, SunGen-LSTM yields a 9.8% relative improvement over the baseline in average accuracy across eight established text classification tasks.
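The core idea of learning per-sample weights to downweight noisy synthetic examples, without any human labels, can be illustrated with a toy sketch. This is not the authors' SunGen algorithm (which uses a bilevel formulation with a noise-robust loss); it is a minimal, hypothetical alternating scheme on a 2-D logistic-regression problem with label noise, where samples that the current model finds hard to fit (high loss) are progressively assigned lower weight:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "synthetic dataset": 2-D points whose true label is sign(x0 + x1),
# with 30% of the labels flipped to simulate low-quality generated samples.
n = 200
X = rng.normal(size=(n, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
noisy = rng.random(n) < 0.3
y[noisy] = 1.0 - y[noisy]

w = np.zeros(2)        # task-model (logistic regression) parameters
b = 0.0
s = np.ones(n) / n     # learned per-sample weights, start uniform

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(200):
    # 1) Fit the task model on the currently re-weighted samples.
    p = sigmoid(X @ w + b)
    w -= 0.5 * (X.T @ (s * (p - y)))
    b -= 0.5 * np.sum(s * (p - y))
    # 2) Re-estimate sample weights: low-loss (easy-to-fit) samples
    #    gain mass, high-loss (likely mislabeled) samples lose it.
    loss = -(y * np.log(p + 1e-9) + (1.0 - y) * np.log(1.0 - p + 1e-9))
    s = np.exp(-5.0 * loss)
    s /= s.sum()

# After training, the flipped samples should carry much less weight.
clean_avg, noisy_avg = s[~noisy].mean(), s[noisy].mean()
```

In this sketch the temperature `5.0` and the alternating schedule are arbitrary choices; the paper's framework instead optimizes the weights against a noise-robust objective, which avoids the self-confirmation risk of naively trusting the current model's losses.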