MM-Locate-News: Multimodal Focus Location Estimation in News

Paper Title

MM-Locate-News: Multimodal Focus Location Estimation in News

Authors

Golsa Tahmasebzadeh, Eric Müller-Budack, Sherzod Hakimov, Ralph Ewerth

Abstract

The consumption of news has changed significantly as the Web has become the most influential medium for information. To analyze and contextualize the large amount of news published every day, the geographic focus of an article is an important aspect in order to enable content-based news retrieval. There are methods and datasets for geolocation estimation from text or photos, but they are typically considered as separate tasks. However, a photo might lack geographical cues and text can include multiple locations, making it challenging to recognize the focus location using a single modality. In this paper, a novel dataset called Multimodal Focus Location of News (MM-Locate-News) is introduced. We evaluate state-of-the-art methods on the new benchmark dataset and suggest novel models to predict the focus location of news using both textual and image content. The experimental results show that the multimodal model outperforms unimodal models.
