AMIL: Adversarial Multi Instance Learning for Human Pose Estimation

Paper title

AMIL: Adversarial Multi Instance Learning for Human Pose Estimation

Authors

Pourya Shamsolmoali, Masoumeh Zareapoor, Huiyu Zhou, Jie Yang

Abstract

Human pose estimation has an important impact on a wide range of applications, from human-computer interfaces to surveillance and content-based video retrieval. In human pose estimation, joint occlusions and overlapping bodies lead to inaccurate pose estimates. To address these problems, we present a novel structure-aware network that integrates priors on the structure of the human body and carefully takes them into account during training. Learning such constraints directly is typically a challenging task. Instead, we adopt a generative adversarial network as our learning model, in which we design two residual multiple instance learning (MIL) models with identical architectures: one is used as the generator and the other as the discriminator. The discriminator's task is to distinguish real poses from fake ones. If the pose generator produces results that the discriminator cannot distinguish from real ones, the model has successfully learnt the priors. In the proposed model, the discriminator differentiates the ground-truth heatmaps from the generated ones, and the adversarial loss is then back-propagated to the generator. This procedure helps the generator learn plausible body configurations and is shown to improve pose estimation accuracy. We also propose a novel function for MIL: an adjustable structure for both instance selection and modeling that appropriately passes information between instances in a single bag. In the proposed residual MIL neural network, the pooling operation adequately updates each instance's contribution to its bag. The proposed pooling-based adversarial residual multi-instance neural network has been validated on two datasets for the human pose estimation task and outperforms other state-of-the-art models.
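The two mechanisms named in the abstract, an adjustable pooling over the instances of a bag and an adversarial loss passed from discriminator to generator, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the pooling function, so log-sum-exp pooling is swapped in as a well-known adjustable choice, the discriminator is assumed to output a single probability per heatmap, and all function names are hypothetical.

```python
import math


def lse_pool(scores, r=5.0):
    """Log-sum-exp pooling of instance scores in one bag (stand-in, not the paper's function).

    The parameter r adjusts the behaviour: as r approaches 0 the result
    approaches the mean of the scores, and for large r it approaches the max.
    """
    n = len(scores)
    m = max(scores)  # subtract the max for numerical stability
    total = sum(math.exp(r * (s - m)) for s in scores)
    return m + math.log(total / n) / r


def bce(p, target, eps=1e-12):
    """Binary cross-entropy for one probability p and a 0/1 target."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(d_real, d_fake):
    """D should score ground-truth heatmaps near 1 and generated ones near 0."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)


def generator_adv_loss(d_fake):
    """G is rewarded when D scores its generated heatmaps as real."""
    return bce(d_fake, 1.0)
```

In this sketch, `d_real` and `d_fake` stand for the discriminator's outputs on a ground-truth and a generated heatmap. A full training loop would add `generator_adv_loss` to a heatmap regression loss before back-propagating through the generator, which is again an assumption rather than a detail stated in the abstract.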
