SupMMD: A Sentence Importance Model for Extractive Summarization using Maximum Mean Discrepancy

Paper title

SupMMD: A Sentence Importance Model for Extractive Summarization using Maximum Mean Discrepancy

Authors

Umanga Bista, Alexander Patrick Mathews, Aditya Krishna Menon, Lexing Xie

Abstract

Most work on multi-document summarization has focused on generic summarization of information present in each individual document set. However, the under-explored setting of update summarization, where the goal is to identify the new information present in each set, is of equal practical interest (e.g., presenting readers with updates on an evolving news topic). In this work, we present SupMMD, a novel technique for generic and update summarization based on the maximum mean discrepancy from kernel two-sample testing. SupMMD combines both supervised learning for salience and unsupervised learning for coverage and diversity. Further, we adapt multiple kernel learning to make use of similarity across multiple information sources (e.g., text features and knowledge based concepts). We show the efficacy of SupMMD in both generic and update summarization tasks by meeting or exceeding the current state-of-the-art on the DUC-2004 and TAC-2009 datasets.
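To make the central quantity concrete, the following is a minimal sketch of the plain (biased) squared maximum mean discrepancy between two sets of sentence vectors, using an RBF kernel. It is not the authors' implementation: SupMMD additionally learns sentence importance and combines multiple kernels, neither of which is shown here. The function names, the `gamma` parameter, and the toy 2-D vectors are all assumptions made for illustration.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """RBF (Gaussian) kernel between two equal-length vectors (illustrative choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs, ys, gamma=1.0):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between the samples xs and ys."""
    return (
        mean_kernel(xs, xs, gamma)
        - 2.0 * mean_kernel(xs, ys, gamma)
        + mean_kernel(ys, ys, gamma)
    )


# Toy data (hypothetical): sentence embeddings of a document set and two candidate summaries.
document = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
close_summary = [(0.0, 0.0), (1.0, 0.0)]
far_summary = [(5.0, 5.0), (6.0, 5.0)]

print(mmd_squared(document, close_summary))
print(mmd_squared(document, far_summary))
```

Under this sketch, a candidate summary whose sentences lie near the document's sentences in feature space yields a smaller value than one whose sentences lie far away, which is the intuition behind using MMD as a coverage criterion.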
