Paper Title
Generating Descriptions for Sequential Images with Local-Object Attention and Global Semantic Context Modelling
Paper Authors
Paper Abstract
In this paper, we propose an end-to-end CNN-LSTM model for generating descriptions for sequential images with a local-object attention mechanism. To generate coherent descriptions, we capture global semantic context using a multi-layer perceptron, which learns the dependencies between sequential images. A parallel LSTM network is used to decode the sequence of descriptions. Experimental results show that our model outperforms the baseline across three different evaluation metrics on the dataset published by Microsoft.
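The abstract names two components: a local-object attention mechanism over regional CNN features, and a multi-layer perceptron that produces a global semantic context from the whole image sequence. The sketch below is an illustrative NumPy rendering of these two ideas only, not the authors' implementation; all weight shapes, the additive (Bahdanau-style) attention form, and the flatten-then-MLP global context are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def local_object_attention(regions, hidden, W_r, W_h, v):
    """Attend over regional CNN features conditioned on the decoder state.

    regions: (k, d) features for k image regions (local objects)
    hidden:  (h,)   current LSTM decoder hidden state
    Returns the attended context vector (d,) and the weights (k,).
    """
    scores = np.tanh(regions @ W_r + hidden @ W_h) @ v  # (k,) additive scores
    alpha = softmax(scores)                             # attention distribution
    context = alpha @ regions                           # weighted sum of regions
    return context, alpha

def global_semantic_context(image_feats, W1, b1, W2, b2):
    """One-hidden-layer MLP over the concatenated per-image features,
    producing a single global context shared across the sequence."""
    x = image_feats.reshape(-1)          # flatten the sequence of features
    h = np.maximum(0.0, x @ W1 + b1)     # ReLU hidden layer
    return h @ W2 + b2                   # (g,) global semantic context

# Toy dimensions (all hypothetical): 5 regions, 8-d features,
# 6-d decoder state, 4-d attention space, 5 images, 16-d MLP hidden, 8-d context.
k, d, h_dim, a = 5, 8, 6, 4
regions = rng.standard_normal((k, d))
hidden = rng.standard_normal(h_dim)
W_r, W_h, v = rng.standard_normal((d, a)), rng.standard_normal((h_dim, a)), rng.standard_normal(a)
context, alpha = local_object_attention(regions, hidden, W_r, W_h, v)

n_img, g = 5, 8
image_feats = rng.standard_normal((n_img, d))
W1, b1 = rng.standard_normal((n_img * d, 16)), np.zeros(16)
W2, b2 = rng.standard_normal((16, g)), np.zeros(g)
g_ctx = global_semantic_context(image_feats, W1, b1, W2, b2)
```

In a full model, each of the parallel LSTM decoders would consume its image's attended local context together with the shared global context at every decoding step.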