Paper Title
High Recall Data-to-text Generation with Progressive Edit
Paper Authors
Abstract
Data-to-text (D2T) generation is the task of generating text from structured inputs. We observed that when the same target sentence is repeated twice, a Transformer-based (T5) model generates an output made up of asymmetric sentences from the structured inputs; that is, the two sentences differ in length and quality. We call this phenomenon "Asymmetric Generation" and exploit it in D2T generation. Once asymmetric sentences are generated, we concatenate the first part of the output with a non-repeated target. As this passes through progressive edit (ProEdit), recall increases; hence, this method covers the structured inputs better than before editing. ProEdit is a simple but effective way to improve performance in D2T generation, and it achieves a new state-of-the-art result on the ToTTo dataset.
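The editing loop described above can be sketched as follows. This is a hedged illustration, not the authors' implementation: `generate` stands in for a fine-tuned seq2seq model (e.g. T5) that, due to Asymmetric Generation, emits two parts, and `toy_generate` is a toy stub whose output covers one more table field per pass so the recall-raising behavior can be demonstrated. All names and the `" || "` part separator are hypothetical.

```python
def proedit(generate, table, steps=2):
    """Progressive edit sketch: keep the first (stronger) part of the
    asymmetric output and feed it back with the table on each pass."""
    draft = None
    for _ in range(steps):
        # The model emits two asymmetric parts; keep only the first part.
        output = generate(table, draft)
        draft = output.split(" || ")[0]
    return draft


# Toy stand-in for the model (hypothetical): each pass "recalls" one more
# field from the structured input than the previous draft covered.
FIELDS = ["Seoul", "2015", "silver"]

def toy_generate(table, draft):
    covered = 0 if draft is None else sum(w in draft for w in FIELDS)
    first_part = " ".join(FIELDS[: covered + 1])
    return first_part + " || short tail"  # weaker, asymmetric second part
```

With this stub, `proedit(toy_generate, "table", steps=1)` covers one field, while `steps=3` covers all three, mirroring the claim that recall grows with each editing pass.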