Title
Revisiting Optical Flow Estimation in 360 Videos
Authors
Abstract
360 video analysis has become a significant research topic since the advent of high-quality, low-cost 360 wearable devices. In this paper, we propose a novel LiteFlowNet360 architecture for optical flow estimation in 360 videos. We design LiteFlowNet360 as a domain adaptation framework from the perspective video domain to the 360 video domain. It adapts simple kernel transformation techniques, inspired by the Kernel Transformer Network (KTN), to cope with the inherent distortion in 360 videos caused by sphere-to-plane projection. First, we apply an incremental transformation to the convolution layers of the feature pyramid network and show that further transformation of the inference and regularization layers is not essential, thereby limiting the growth of the network in size and computational cost. Second, we refine the network by training on augmented data in a supervised manner; the augmentation projects images onto a sphere and re-projects them back to a plane. Third, we train LiteFlowNet360 in a self-supervised manner on target-domain 360 videos. Experimental results show that the proposed architecture achieves promising optical flow estimation in 360 videos.
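The three steps above can be made concrete with short sketches. The abstract does not include code, so the sketches below are illustrative assumptions rather than the authors' implementation; all names (`LatitudeAdaptiveConv`, `rotate_equirectangular`, `photometric_loss`) are hypothetical.

For the first step, a minimal PyTorch sketch of a latitude-conditioned convolution, loosely inspired by KTN: a shared perspective-domain kernel is transformed per output row, so the effective filter varies with latitude in the equirectangular frame. The per-row affine transform is a deliberate simplification of KTN's learned kernel transformation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatitudeAdaptiveConv(nn.Module):
    """Row-dependent transform of a shared conv kernel (hypothetical
    simplification of a KTN-style transformation). Assumes stride 1,
    groups 1, and an odd, square kernel."""

    def __init__(self, conv: nn.Conv2d, height: int):
        super().__init__()
        self.conv = conv  # pretrained perspective-domain convolution
        out_ch = conv.out_channels
        # One learned (scale, shift) per row and output channel (assumption).
        self.gamma = nn.Parameter(torch.ones(height, out_ch, 1, 1, 1))
        self.beta = nn.Parameter(torch.zeros(height, out_ch, 1, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = x.size(2)
        k = self.conv.kernel_size[0]
        pad = k // 2
        xp = F.pad(x, (pad, pad, pad, pad))
        rows = []
        for r in range(h):  # slow reference loop; real code would batch it
            w_r = self.gamma[r] * self.conv.weight + self.beta[r]
            band = xp[:, :, r:r + k, :]           # input rows feeding output row r
            rows.append(F.conv2d(band, w_r, self.conv.bias))
        return torch.cat(rows, dim=2)             # (n, out_ch, h, w)
```

For the second step, the sphere-based augmentation can be realized as an inverse warp: back-project every output pixel of the equirectangular frame to the unit sphere, rotate, and sample the source frame at the rotated direction. A NumPy sketch, with nearest-neighbour sampling to keep it short:

```python
import numpy as np

def rotate_equirectangular(img: np.ndarray, yaw: float, pitch: float) -> np.ndarray:
    """Rotate an equirectangular image on the sphere (hypothetical
    augmentation routine; the yaw/pitch parameterization is an assumption)."""
    h, w = img.shape[:2]
    # Output pixel grid -> longitude/latitude.
    lon = (np.arange(w) + 0.5) / w * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (np.arange(h) + 0.5) / h * np.pi
    lon, lat = np.meshgrid(lon, lat)
    # Longitude/latitude -> Cartesian directions on the unit sphere.
    d = np.stack([np.cos(lat) * np.sin(lon),
                  np.sin(lat),
                  np.cos(lat) * np.cos(lon)], axis=-1)
    # Rotate about the vertical axis (yaw), then a horizontal axis (pitch).
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])
    d = d @ (rx @ ry).T
    # Rotated directions -> source pixels (longitude wraps, latitude clamps).
    src_lon = np.arctan2(d[..., 0], d[..., 2])
    src_lat = np.arcsin(np.clip(d[..., 1], -1.0, 1.0))
    u = np.round((src_lon + np.pi) / (2.0 * np.pi) * w - 0.5).astype(int) % w
    v = np.round(np.clip((np.pi / 2.0 - src_lat) / np.pi * h - 0.5, 0, h - 1)).astype(int)
    return img[v, u]
```

For the third step, self-supervised training on unlabeled 360 videos is typically driven by photometric consistency; the abstract does not name the exact loss, so the one below is an assumption. It warps the second frame backward by the predicted flow and penalizes the difference to the first frame. Note that `grid_sample` does not wrap horizontally, which a fully 360-aware loss would need to handle.

```python
import torch
import torch.nn.functional as F

def photometric_loss(frame1: torch.Tensor, frame2: torch.Tensor,
                     flow: torch.Tensor) -> torch.Tensor:
    """L1 photometric loss between frame1 and frame2 warped by flow.
    frame1, frame2: (n, c, h, w); flow: (n, 2, h, w) in pixels."""
    _, _, h, w = frame1.shape
    # Base sampling grid in pixel coordinates, displaced by the flow.
    ys, xs = torch.meshgrid(torch.arange(h, device=flow.device),
                            torch.arange(w, device=flow.device),
                            indexing="ij")
    grid = torch.stack([xs, ys], dim=0).float() + flow     # (n, 2, h, w)
    # Normalize pixel coordinates to [-1, 1] for grid_sample.
    gx = 2.0 * grid[:, 0] / (w - 1) - 1.0
    gy = 2.0 * grid[:, 1] / (h - 1) - 1.0
    warped = F.grid_sample(frame2, torch.stack([gx, gy], dim=-1),
                           align_corners=True)
    return (warped - frame1).abs().mean()
```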