Attention-based Proposals Refinement for 3D Object Detection

Paper title

Attention-based Proposals Refinement for 3D Object Detection

Authors

Minh-Quan Dao, Elwan Héry, Vincent Frémont

Abstract

Recent advances in 3D object detection have come from developing a refinement stage for voxel-based Region Proposal Networks (RPN) that strikes a better balance between accuracy and efficiency. A popular approach among state-of-the-art frameworks is to divide proposals, or Regions of Interest (ROI), into grids and extract features for each grid location before synthesizing them to form ROI features. While achieving impressive performance, such an approach involves several hand-crafted components (e.g. grid sampling, set abstraction) which require expert knowledge to be tuned correctly. This paper proposes a data-driven approach to ROI feature computation named APRO3D-Net, which consists of a voxel-based RPN and a refinement stage made of Vector Attention. Unlike the original multi-head attention, Vector Attention assigns different weights to different channels within a point feature, and is thus able to capture a more sophisticated relation between pooled points and ROI. Our method achieves a competitive performance of 84.85 AP for class Car at moderate difficulty on the validation set of KITTI and 47.03 mAP (averaged over 10 classes) on NuScenes, while having the fewest parameters among closely related methods and attaining an inference speed of 15 FPS on an NVIDIA V100 GPU. The code is released at https://github.com/quan-dao/APRO3D-Net.
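The contrast the abstract draws between ordinary attention and Vector Attention can be illustrated with a minimal sketch. It is not the APRO3D-Net implementation: the function names are hypothetical, learned projections and multiple heads are omitted, and the per-channel relation (an elementwise product of query and key) is an assumption chosen for simplicity. Here the query stands for an ROI feature and the keys and values stand for features of the points pooled into that ROI.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scalar_attention(query, keys, values):
    """Dot-product attention: one weight per point, shared by all channels."""
    scale = math.sqrt(len(query))
    # Summing over channels collapses each point's relation to a single score.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)  # weights[i] for point i
    channels = len(values[0])
    out = [sum(w * v[c] for w, v in zip(weights, values)) for c in range(channels)]
    return out, weights


def vector_attention(query, keys, values):
    """Vector attention sketch: a separate weight per point AND per channel.

    Assumption: the relation between query and key is their elementwise
    product, i.e. the dot product without the sum over channels.
    """
    out, weights = [], []
    for c in range(len(query)):
        # Scores for channel c only; the softmax runs over the points.
        scores = [query[c] * key[c] for key in keys]
        w = softmax(scores)  # w[i] for point i, channel c
        weights.append(w)
        out.append(sum(wi * v[c] for wi, v in zip(w, values)))
    return out, weights  # weights[c][i]


if __name__ == "__main__":
    roi_query = [1.0, 1.0]                     # hypothetical 2-channel ROI feature
    point_keys = [[2.0, 0.0], [0.0, 2.0]]      # two pooled points
    point_values = [[1.0, 10.0], [3.0, 30.0]]
    print(scalar_attention(roi_query, point_keys, point_values))
    print(vector_attention(roi_query, point_keys, point_values))
```

In this toy input both points receive the same dot-product score, so scalar attention weights them equally in every channel. The per-channel variant instead favours the first point in channel 0 and the second point in channel 1, which is the extra expressiveness the abstract attributes to Vector Attention.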
