Paper Title
SelfPose: 3D Egocentric Pose Estimation from a Headset Mounted Camera
Paper Authors
Paper Abstract
We present a solution to egocentric 3D body pose estimation from monocular images captured by downward-looking fish-eye cameras installed on the rim of a head-mounted VR device. This unusual viewpoint leads to images with a unique visual appearance, with severe self-occlusions and perspective distortions that result in drastic differences in resolution between the lower and upper body. We propose an encoder-decoder architecture with a novel multi-branch decoder designed to account for the varying uncertainty in 2D predictions. Quantitative evaluation on synthetic and real-world datasets shows that our strategy leads to substantial improvements in accuracy over state-of-the-art egocentric approaches. To tackle the lack of labelled data we also introduce a large photo-realistic synthetic dataset. xR-EgoPose offers high-quality renderings of people with diverse skin tones, body shapes, and clothing, performing a range of actions. Our experiments show that the high variability in our new synthetic training corpus leads to good generalization to real-world footage and to state-of-the-art results on real-world datasets with ground truth. Moreover, an evaluation on the Human3.6M benchmark shows that the performance of our method is on par with top-performing approaches on the more classic problem of 3D human pose estimation from a third-person viewpoint.
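The abstract gives no implementation details, so the following is only a minimal sketch, in PyTorch, of the general idea it describes: 2D joint heatmaps are compressed by a shared encoder, and a multi-branch decoder regresses the 3D pose from the latent code while a second branch reconstructs the heatmaps, encouraging the latent code to retain the uncertainty present in the 2D predictions. All module names, layer sizes, and the joint count below are illustrative assumptions, not the authors' code.

# Hypothetical sketch of an encoder with a multi-branch decoder over 2D
# heatmaps. One branch outputs 3D joint positions; the other reconstructs
# the input heatmaps so 2D uncertainty is not collapsed to point estimates.
import torch
import torch.nn as nn

NUM_JOINTS = 16     # assumed joint count
HEATMAP_SIZE = 47   # assumed heatmap resolution
LATENT_DIM = 20     # assumed latent size

class HeatmapEncoder(nn.Module):
    """Compress per-joint 2D heatmaps into a low-dimensional latent code."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(NUM_JOINTS, 64, 4, stride=2, padding=1), nn.ReLU(),  # 47 -> 23
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),         # 23 -> 11
            nn.Flatten(),
            nn.Linear(128 * 11 * 11, LATENT_DIM),
        )

    def forward(self, heatmaps):
        return self.net(heatmaps)

class MultiBranchDecoder(nn.Module):
    """Two heads on the shared latent: 3D pose regression and heatmap
    reconstruction. Supervising both keeps uncertainty in the latent code."""
    def __init__(self):
        super().__init__()
        self.pose_branch = nn.Sequential(
            nn.Linear(LATENT_DIM, 512), nn.ReLU(),
            nn.Linear(512, 3 * NUM_JOINTS),
        )
        self.heatmap_branch = nn.Sequential(
            nn.Linear(LATENT_DIM, 512), nn.ReLU(),
            nn.Linear(512, NUM_JOINTS * HEATMAP_SIZE * HEATMAP_SIZE),
        )

    def forward(self, z):
        pose3d = self.pose_branch(z).view(-1, NUM_JOINTS, 3)
        heatmaps = self.heatmap_branch(z).view(
            -1, NUM_JOINTS, HEATMAP_SIZE, HEATMAP_SIZE)
        return pose3d, heatmaps

if __name__ == "__main__":
    encoder, decoder = HeatmapEncoder(), MultiBranchDecoder()
    hm = torch.rand(2, NUM_JOINTS, HEATMAP_SIZE, HEATMAP_SIZE)  # dummy heatmaps
    pose3d, hm_rec = decoder(encoder(hm))
    print(pose3d.shape, hm_rec.shape)  # (2, 16, 3) and (2, 16, 47, 47)

In a sketch like this, both heads would be trained jointly (e.g. an L2 pose loss plus a heatmap reconstruction loss), so that the encoder cannot discard the spread of the 2D detections, which is the role the abstract attributes to the multi-branch decoder.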