Paper Title
MapQA: A Dataset for Question Answering on Choropleth Maps
Paper Authors
Paper Abstract
Choropleth maps are a common visual representation for region-specific tabular data and are used in a number of different venues (newspapers, articles, etc.). These maps are human-readable but are often challenging to deal with when trying to extract data for screen readers, analyses, or other related tasks. Recent research into Visual Question Answering (VQA) has studied question answering on human-generated charts (ChartQA), such as bar, line, and pie charts. However, little work has paid attention to understanding maps, and both general VQA models and ChartQA models struggle when asked to perform this task. To facilitate and encourage research in this area, we present MapQA, a large-scale dataset of ~800K question-answer pairs over ~60K map images. Our task tests various levels of map understanding, from surface questions about map styles to complex questions that require reasoning on the underlying data. We present the unique challenges of MapQA that frustrate most strong baseline algorithms designed for ChartQA and general VQA tasks. We also present a novel algorithm, Visual Multi-Output Data Extraction based QA (V-MODEQA), for MapQA. V-MODEQA extracts the underlying structured data from a map image with a multi-output model and then performs reasoning on the extracted data. Our experimental results show that, by capturing the unique properties of map question answering, V-MODEQA achieves better overall performance and robustness on MapQA than state-of-the-art ChartQA and VQA algorithms.
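The two-stage idea in the abstract (extract structured data first, then reason over it) can be illustrated with a minimal sketch of the second stage only. This is not the V-MODEQA implementation: the table contents, the function names, and the question types below are all assumptions made for illustration, and the extraction model itself is not shown.

```python
from typing import Dict, List

# Hypothetical output of a data-extraction stage: region name -> value.
# These regions and numbers are invented for illustration; they are not
# taken from MapQA.
extracted: Dict[str, float] = {
    "Ohio": 12.5,
    "Texas": 30.1,
    "Maine": 4.2,
}


def region_with_max(table: Dict[str, float]) -> str:
    """Answer a question like 'Which region has the highest value?'."""
    return max(table, key=table.get)


def regions_above(table: Dict[str, float], threshold: float) -> List[str]:
    """Answer 'Which regions have a value above the threshold?', sorted by name."""
    return sorted(name for name, value in table.items() if value > threshold)


def count_in_range(table: Dict[str, float], low: float, high: float) -> int:
    """Answer 'How many regions have a value between low and high (inclusive)?'."""
    return sum(1 for value in table.values() if low <= value <= high)
```

Once the data is in tabular form, each question reduces to an ordinary lookup or aggregation, which is why errors in a pipeline of this shape would mostly come from the extraction stage rather than from the reasoning stage.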