Paper Title
Generating Descriptions for Sequential Images with Local-Object Attention and Global Semantic Context Modelling
Paper Authors
Paper Abstract
In this paper, we propose an end-to-end CNN-LSTM model for generating descriptions for sequential images with a local-object attention mechanism. To generate coherent descriptions, we capture global semantic context using a multi-layer perceptron, which learns the dependencies between sequential images. A parallel LSTM network is used to decode the sequence of descriptions. Experimental results show that our model outperforms the baseline across three different evaluation metrics on the dataset published by Microsoft.
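The abstract names two components: a local-object attention mechanism over regional CNN features, and a multi-layer perceptron that produces a global semantic context from the whole image sequence. The sketch below is an illustrative NumPy rendering of these two ideas only, not the authors' implementation; all weight shapes, the additive (Bahdanau-style) attention form, and the flatten-then-MLP global context are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def local_object_attention(regions, hidden, W_r, W_h, v):
    """Attend over regional CNN features conditioned on the decoder state.

    regions: (k, d) features for k image regions (local objects)
    hidden:  (h,)   current LSTM decoder hidden state
    Returns the attended context vector (d,) and the weights (k,).
    """
    scores = np.tanh(regions @ W_r + hidden @ W_h) @ v  # (k,) additive scores
    alpha = softmax(scores)                             # attention distribution
    context = alpha @ regions                           # weighted sum of regions
    return context, alpha

def global_semantic_context(image_feats, W1, b1, W2, b2):
    """One-hidden-layer MLP over the concatenated per-image features,
    producing a single global context shared across the sequence."""
    x = image_feats.reshape(-1)          # flatten the sequence of features
    h = np.maximum(0.0, x @ W1 + b1)     # ReLU hidden layer
    return h @ W2 + b2                   # (g,) global semantic context

# Toy dimensions (all hypothetical): 5 regions, 8-d features,
# 6-d decoder state, 4-d attention space, 5 images, 16-d MLP hidden, 8-d context.
k, d, h_dim, a = 5, 8, 6, 4
regions = rng.standard_normal((k, d))
hidden = rng.standard_normal(h_dim)
W_r, W_h, v = rng.standard_normal((d, a)), rng.standard_normal((h_dim, a)), rng.standard_normal(a)
context, alpha = local_object_attention(regions, hidden, W_r, W_h, v)

n_img, g = 5, 8
image_feats = rng.standard_normal((n_img, d))
W1, b1 = rng.standard_normal((n_img * d, 16)), np.zeros(16)
W2, b2 = rng.standard_normal((16, g)), np.zeros(g)
g_ctx = global_semantic_context(image_feats, W1, b1, W2, b2)
```

In a full model, each of the parallel LSTM decoders would consume its image's attended local context together with the shared global context at every decoding step.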