Rethinking Pseudo-LiDAR Representation

Paper Title

Rethinking Pseudo-LiDAR Representation

Authors

Xinzhu Ma, Shinan Liu, Zhiyi Xia, Hongwen Zhang, Xingyu Zeng, Wanli Ouyang

Abstract

Recently proposed pseudo-LiDAR-based 3D detectors greatly improve benchmark results on monocular/stereo 3D detection tasks. However, the underlying mechanism remains obscure to the research community. In this paper, we perform an in-depth investigation and observe that the efficacy of the pseudo-LiDAR representation comes from the coordinate transformation rather than from the data representation itself. Based on this observation, we design an image-based CNN detector named PatchNet, which is more general and can be instantiated as a pseudo-LiDAR-based 3D detector. Moreover, the pseudo-LiDAR data in PatchNet is organized as an image representation, which means existing 2D CNN designs can be easily used to extract deep features from the input data and boost 3D detection performance. We conduct extensive experiments on the challenging KITTI dataset, where the proposed PatchNet outperforms all existing pseudo-LiDAR-based counterparts. Code has been made available at: https://github.com/xinzhuma/patchnet.
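
The coordinate transformation mentioned in the abstract can be illustrated with a minimal sketch. It assumes a standard pinhole camera model and back-projects each pixel of a depth map into a camera-frame (x, y, z) point, keeping the result on the image grid so that it could be fed to a 2D CNN. The function name `depth_to_coordinate_map` and the intrinsics `fx`, `fy`, `cx`, `cy` used here are illustrative assumptions, not taken from the PatchNet repository.

```python
def depth_to_coordinate_map(depth, fx, fy, cx, cy):
    """Back-project an H x W depth map into an H x W grid of (x, y, z) points.

    Assumes a pinhole camera: fx, fy are focal lengths in pixels and
    (cx, cy) is the principal point. These names are hypothetical.
    """
    coords = []
    for v, row in enumerate(depth):          # v: pixel row index
        coords_row = []
        for u, z in enumerate(row):          # u: pixel column index, z: depth
            x = (u - cx) * z / fx            # horizontal offset scaled by depth
            y = (v - cy) * z / fy            # vertical offset scaled by depth
            coords_row.append((x, y, z))
        coords.append(coords_row)
    return coords


# Toy 2 x 3 depth map with made-up intrinsics, for illustration only.
depth = [[10.0, 10.0, 10.0],
         [20.0, 20.0, 20.0]]
coords = depth_to_coordinate_map(depth, fx=2.0, fy=2.0, cx=1.0, cy=0.5)
```

The output keeps the same height and width as the input depth map, with three channels per pixel. This is one plausible reading of "organized as the image representation"; the actual PatchNet pipeline may differ in detail.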
