Paper title
OpineSum: Entailment-based self-training for abstractive opinion summarization
Paper authors
Paper abstract
A typical product or place often has hundreds of reviews, and summarization of these texts is an important and challenging problem. Recent progress on abstractive summarization in domains such as news has been driven by supervised systems trained on hundreds of thousands of news articles paired with human-written summaries. However, for opinion texts, such large-scale datasets are rarely available. Unsupervised methods, self-training, and few-shot learning approaches bridge that gap. In this work, we present a novel self-training approach, OpineSum, for abstractive opinion summarization. The summaries in this approach are built using a novel application of textual entailment and capture the consensus of opinions across the various reviews for an item. This method can be used to obtain silver-standard summaries on a large scale and train both unsupervised and few-shot abstractive summarization systems. OpineSum achieves state-of-the-art performance in both settings.
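The core idea, ranking candidate review sentences by how many other reviews entail them, can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the `entails` function below is a toy token-overlap stand-in for a real NLI model, and the `silver_summary` helper and its `threshold`/`top_k` parameters are assumptions made for the example.

```python
def entails(premise: str, hypothesis: str, threshold: float = 0.6) -> bool:
    """Toy stand-in for an NLI entailment check (an assumption, not the
    paper's model): a premise 'entails' the hypothesis if a large enough
    fraction of the hypothesis tokens appear in the premise."""
    p = set(premise.lower().split())
    h = set(hypothesis.lower().split())
    return bool(h) and len(h & p) / len(h) >= threshold


def silver_summary(reviews: list[str], top_k: int = 2) -> list[str]:
    """Score each review sentence by how many *other* reviews entail it
    (a proxy for opinion consensus) and keep the top_k sentences as a
    silver-standard summary."""
    # Naive sentence splitting on periods; a real system would use a
    # proper sentence segmenter.
    candidates = [s.strip() for r in reviews for s in r.split(".") if s.strip()]
    scored = []
    for cand in candidates:
        support = sum(1 for r in reviews if cand not in r and entails(r, cand))
        scored.append((support, cand))
    # Stable sort: higher-consensus sentences first.
    scored.sort(key=lambda t: -t[0])
    return [c for _, c in scored[:top_k]]
```

For example, given several hotel reviews that repeatedly praise the location, sentences about the location accumulate entailment support from the other reviews and are selected, while one-off complaints are not. In the actual system, such silver summaries would then supervise an abstractive model.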