Paper title
SumREN: Summarizing Reported Speech about Events in News
Paper authors
Paper abstract
A primary objective of news articles is to establish the factual record for an event, frequently achieved by conveying both the details of the specified event (i.e., the 5 Ws: Who, What, Where, When and Why regarding the event) and how people reacted to it (i.e., reported statements). However, existing work on news summarization almost exclusively focuses on the event details. In this work, we propose the novel task of summarizing the reactions of different speakers, as expressed by their reported statements, to a given event. To this end, we create a new multi-document summarization benchmark, SumREN, comprising 745 summaries of reported statements from various public figures obtained from 633 news articles discussing 132 events. We propose an automatic silver training data generation approach for our task, which helps smaller models like BART achieve GPT-3 level performance on this task. Finally, we introduce a pipeline-based framework for summarizing reported speech, which we empirically show to generate summaries that are more abstractive and factual than baseline query-focused summarization approaches.