Learning Object Relation Graph and Tentative Policy for Visual Navigation

Paper title

Learning Object Relation Graph and Tentative Policy for Visual Navigation

Authors

Heming Du, Xin Yu, Liang Zheng

Abstract

Target-driven visual navigation aims to navigate an agent towards a given target based on the agent's observations. In this task, it is critical to learn an informative visual representation and a robust navigation policy. To improve these two components, this paper proposes three complementary techniques: an object relation graph (ORG), trial-driven imitation learning (IL), and a memory-augmented tentative policy network (TPN). ORG improves visual representation learning by integrating object relationships, including category closeness and spatial correlations; for example, a TV usually co-occurs spatially with a remote. Both trial-driven IL and TPN underlie a robust navigation policy, instructing the agent to escape from deadlock states such as looping or being stuck. Specifically, trial-driven IL is a type of supervision used in policy network training, while TPN, which mimics the IL supervision in unseen environments, is applied at test time. Experiments in the artificial environment AI2-Thor validate that each of the techniques is effective. When combined, the techniques bring significant improvements over baseline methods in navigation effectiveness and efficiency in unseen environments. We report increases of 22.8% in success rate and 23.5% in Success weighted by Path Length (SPL). The code is available at https://github.com/xiaobaishu0097/ECCV-VN.git.
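The two metrics reported in the abstract can be illustrated with a small sketch. It assumes the commonly used definition of SPL, in which each successful episode contributes the ratio of the shortest path length to the larger of the shortest and actual path lengths, averaged over all episodes. The function names and the toy episode data are hypothetical and are not taken from the authors' code.

```python
from typing import Sequence


def success_rate(successes: Sequence[bool]) -> float:
    """Fraction of episodes in which the agent reached the target."""
    if not successes:
        return 0.0
    return sum(1 for s in successes if s) / len(successes)


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    path_lengths: Sequence[float],
) -> float:
    """Success weighted by Path Length, averaged over all episodes.

    Assumed definition: mean over episodes of S_i * L_i / max(P_i, L_i),
    where S_i is the success flag, L_i the shortest path length and
    P_i the length of the path the agent actually took.
    """
    if not (len(successes) == len(shortest_lengths) == len(path_lengths)):
        raise ValueError("all inputs must have the same number of episodes")
    if not successes:
        return 0.0
    total = 0.0
    for s, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        denom = max(taken, shortest)
        # Failed episodes contribute zero; degenerate zero-length
        # episodes are skipped to avoid dividing by zero.
        if s and denom > 0:
            total += shortest / denom
    return total / len(successes)


# Hypothetical toy data: three episodes, two of them successful.
toy_success = [True, True, False]
toy_shortest = [10.0, 10.0, 10.0]
toy_taken = [10.0, 20.0, 15.0]

print(success_rate(toy_success))
print(spl(toy_success, toy_shortest, toy_taken))
```

In this toy data the second episode succeeds but takes a path twice as long as necessary, so it counts fully towards the success rate and only half towards SPL. This is why the abstract reports the two metrics separately: one measures effectiveness, the other efficiency.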
