Paper Title

Multi-document Summarization with Maximal Marginal Relevance-guided Reinforcement Learning

Authors

Yuning Mao, Yanru Qu, Yiqing Xie, Xiang Ren, Jiawei Han

Abstract

While neural sequence learning methods have made significant progress in single-document summarization (SDS), they produce unsatisfactory results on multi-document summarization (MDS). We observe two major challenges when adapting SDS advances to MDS: (1) MDS involves a larger search space and yet more limited training data, setting obstacles for neural methods to learn adequate representations; (2) MDS needs to resolve higher information redundancy among the source documents, which SDS methods are less effective at handling. To close the gap, we present RL-MMR, Maximal Marginal Relevance-guided Reinforcement Learning for MDS, which unifies advanced neural SDS methods and statistical measures used in classical MDS. RL-MMR casts MMR guidance on a smaller set of promising candidates, which restrains the search space and thus leads to better representation learning. Additionally, the explicit redundancy measure in MMR helps the neural representation of the summary to better capture redundancy. Extensive experiments demonstrate that RL-MMR achieves state-of-the-art performance on benchmark MDS datasets. In particular, we show the benefits of incorporating MMR into end-to-end learning when adapting SDS to MDS in terms of both learning effectiveness and efficiency.
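
For readers unfamiliar with the classical MMR measure that the abstract refers to, below is a minimal sketch of Maximal Marginal Relevance scoring and greedy sentence selection. The similarity function `sim`, the greedy loop, and the λ = 0.7 default are illustrative assumptions for exposition only; the paper instead feeds MMR-style guidance into reinforcement-learning-based extractive summarization rather than selecting sentences greedily.

```python
# Minimal sketch of classical MMR (Maximal Marginal Relevance) scoring.
# `sim` is a hypothetical sentence-similarity function (e.g., cosine similarity
# over TF-IDF or embedding vectors); this is not the paper's implementation.
from typing import Callable, List, Sequence


def mmr_score(
    candidate: str,
    query: str,
    selected: Sequence[str],
    sim: Callable[[str, str], float],
    lam: float = 0.7,  # assumed trade-off weight between relevance and redundancy
) -> float:
    """Trade off relevance to the query against redundancy with sentences already selected."""
    relevance = sim(candidate, query)
    redundancy = max((sim(candidate, s) for s in selected), default=0.0)
    return lam * relevance - (1.0 - lam) * redundancy


def mmr_select(
    sentences: List[str],
    query: str,
    sim: Callable[[str, str], float],
    k: int = 5,
    lam: float = 0.7,
) -> List[str]:
    """Greedily pick k sentences by MMR score; RL-MMR uses such scores as guidance
    to prune unpromising candidates rather than as the final selection rule."""
    selected: List[str] = []
    remaining = list(sentences)
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda s: mmr_score(s, query, selected, sim, lam))
        selected.append(best)
        remaining.remove(best)
    return selected
```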
