CrisisLTLSum: A Benchmark for Local Crisis Event Timeline Extraction and Summarization

Paper Title

CrisisLTLSum: A Benchmark for Local Crisis Event Timeline Extraction and Summarization

Authors

Hossein Rajaby Faghihi, Bashar Alhafni, Ke Zhang, Shihao Ran, Joel Tetreault, Alejandro Jaimes

Abstract

Social media has increasingly played a key role in emergency response: first responders can use public posts to better react to ongoing crisis events and deploy the necessary resources where they are most needed. Timeline extraction and abstractive summarization are critical technical tasks for leveraging large numbers of social media posts about events. Unfortunately, there are few datasets for benchmarking technical approaches to these tasks. This paper presents CrisisLTLSum, the largest dataset of local crisis event timelines available to date. CrisisLTLSum contains 1,000 crisis event timelines across four domains: wildfires, local fires, traffic, and storms. We built CrisisLTLSum using a semi-automated cluster-then-refine approach to collect data from the public Twitter stream. Our initial experiments indicate a significant gap between the performance of strong baselines and human performance on both tasks. Our dataset, code, and models are publicly available.
