Paper Title

HaGRID - HAnd Gesture Recognition Image Dataset

Authors

Alexander Kapitanov, Karina Kvanchiani, Alexander Nagaev, Roman Kraynov, Andrei Makhliarchuk

Abstract

This paper introduces HaGRID (HAnd Gesture Recognition Image Dataset), a large-scale dataset for building hand gesture recognition (HGR) systems that focus on interacting with and controlling devices. For this reason, all 18 chosen gestures carry a semiotic function and can be interpreted as a specific action. Although the gestures are static, they were selected specifically so that several dynamic gestures can be designed from them. This allows a trained model to recognize not only static gestures such as "like" and "stop" but also dynamic gestures such as "swipes" and "drag and drop". HaGRID contains 554,800 images with bounding box annotations and gesture labels for solving hand detection and gesture classification tasks. The low variability in context and subjects in other datasets motivated the creation of a dataset without such limitations. Using crowdsourcing platforms allowed us to collect samples recorded by 37,583 subjects in at least as many scenes, with subject-to-camera distances from 0.5 to 4 meters and under various natural lighting conditions. The influence of these diversity characteristics was assessed in ablation experiments. We also demonstrate that HaGRID can be used for pretraining models in HGR tasks. HaGRID and the pretrained models are publicly available.
