Paper Title


Pretraining Federated Text Models for Next Word Prediction

Paper Authors

Joel Stremmel, Arjun Singh

Paper Abstract


Federated learning is a decentralized approach for training models on distributed devices, by summarizing local changes and sending aggregate parameters from local models to the cloud rather than the data itself. In this research we employ the idea of transfer learning to federated training for next word prediction (NWP) and conduct a number of experiments demonstrating enhancements to current baselines for which federated NWP models have been successful. Specifically, we compare federated training baselines from randomly initialized models to various combinations of pretraining approaches including pretrained word embeddings and whole model pretraining followed by federated fine tuning for NWP on a dataset of Stack Overflow posts. We realize lift in performance using pretrained embeddings without exacerbating the number of required training rounds or memory footprint. We also observe notable differences using centrally pretrained networks, especially depending on the datasets used. Our research offers effective, yet inexpensive, improvements to federated NWP and paves the way for more rigorous experimentation of transfer learning techniques for federated learning.
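To make the training setup concrete, below is a minimal federated averaging sketch in plain NumPy. It is not the authors' implementation: it builds a toy next-word-prediction model whose embedding layer can be initialized either randomly (the baseline) or from pretrained word vectors before federated fine-tuning. All sizes, the `init_model`/`local_update`/`federated_averaging` helpers, and the simulated client data are assumptions made purely for illustration; the paper's actual experiments use the Stack Overflow dataset and full-scale models.

```python
import numpy as np

# Toy sizes; every value here is an assumption made for the illustration.
VOCAB_SIZE, EMBED_DIM = 1000, 32
NUM_CLIENTS, ROUNDS, LOCAL_STEPS, LR = 8, 5, 10, 0.1

rng = np.random.default_rng(0)

def init_model(pretrained_embeddings=None):
    """Tiny NWP model: embedding lookup followed by a softmax over the vocabulary."""
    emb = (pretrained_embeddings.copy() if pretrained_embeddings is not None
           else rng.normal(0.0, 0.1, (VOCAB_SIZE, EMBED_DIM)))
    w_out = rng.normal(0.0, 0.1, (EMBED_DIM, VOCAB_SIZE))
    return {"emb": emb, "w_out": w_out}

def local_update(model, data, steps=LOCAL_STEPS, lr=LR):
    """A few SGD steps on one client's (context word, next word) pairs."""
    m = {k: v.copy() for k, v in model.items()}
    ctx, nxt = data
    for _ in range(steps):
        h = m["emb"][ctx]                              # (batch, EMBED_DIM)
        logits = h @ m["w_out"]                        # (batch, VOCAB_SIZE)
        logits -= logits.max(axis=1, keepdims=True)
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        probs[np.arange(len(nxt)), nxt] -= 1.0         # softmax cross-entropy gradient
        grad_w = (h.T @ probs) / len(nxt)
        grad_h = (probs @ m["w_out"].T) / len(nxt)
        m["w_out"] -= lr * grad_w
        np.add.at(m["emb"], ctx, -lr * grad_h)         # scatter into embedding rows
    return m

def federated_averaging(global_model, client_datasets):
    """One FedAvg round: clients train locally, the server averages the weights."""
    local_models = [local_update(global_model, d) for d in client_datasets]
    return {k: np.mean([lm[k] for lm in local_models], axis=0) for k in global_model}

# Simulated client data; `pretrained` stands in for real vectors such as GloVe.
clients = [(rng.integers(0, VOCAB_SIZE, 64), rng.integers(0, VOCAB_SIZE, 64))
           for _ in range(NUM_CLIENTS)]
pretrained = rng.normal(0.0, 0.1, (VOCAB_SIZE, EMBED_DIM))

model = init_model(pretrained_embeddings=pretrained)   # use init_model() for the random baseline
for _ in range(ROUNDS):
    model = federated_averaging(model, clients)
```

In this sketch the only difference between the compared conditions is the embedding initialization passed to `init_model`, mirroring the comparison the abstract describes between randomly initialized models and models initialized with pretrained embeddings before federated fine-tuning.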
