Paper Title

Facial Action Units Detection Aided by Global-Local Expression Embedding

Authors

Zhipeng Hu, Wei Zhang, Lincheng Li, Yu Ding, Wei Chen, Zhigang Deng, Xin Yu

Abstract

Since Facial Action Unit (AU) annotations require domain expertise, common AU datasets contain only a limited number of subjects. As a result, a crucial challenge for AU detection is addressing identity overfitting. We find that AUs and facial expressions are highly associated, and existing facial expression datasets often contain a large number of identities. In this paper, we aim to utilize expression datasets without AU labels to facilitate AU detection. Specifically, we develop a novel AU detection framework aided by a Global-Local facial Expression Embedding, dubbed GLEE-Net. Our GLEE-Net consists of three branches that extract identity-independent expression features for AU detection. We introduce a global branch that models the overall facial expression while eliminating the influence of identity. We also design a local branch focusing on specific local face regions. The combined output of the global and local branches is first pre-trained on an expression dataset as an identity-independent expression embedding, and then fine-tuned on AU datasets. Therefore, we significantly alleviate the issue of limited identities. Furthermore, we introduce a 3D global branch that extracts expression coefficients through 3D face reconstruction to complement the 2D AU descriptions. Finally, a Transformer-based multi-label classifier is employed to fuse all the representations for AU detection. Extensive experiments demonstrate that our method significantly outperforms the state of the art on the widely used DISFA, BP4D, and BP4D+ datasets.
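The abstract describes a three-branch architecture whose features are fused for multi-label AU prediction. The following is a minimal structural sketch of that data flow, not the paper's implementation: the branch encoders are stubs producing random features, the feature dimensions and AU count (12, as in the BP4D benchmark) are assumptions, and the Transformer-based classifier is replaced by a plain linear head with sigmoid outputs for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Hypothetical feature dimensions (not specified in the abstract).
D_GLOBAL, D_LOCAL, D_3D, NUM_AUS = 128, 64, 64, 12

def global_branch(image):
    # Stand-in for the identity-independent global expression encoder.
    return rng.standard_normal(D_GLOBAL)

def local_branch(image):
    # Stand-in for the encoder focused on specific local face regions.
    return rng.standard_normal(D_LOCAL)

def branch_3d(image):
    # Stand-in for expression coefficients from 3D face reconstruction.
    return rng.standard_normal(D_3D)

class MultiLabelHead:
    """Linear stand-in for the Transformer-based multi-label classifier."""
    def __init__(self, in_dim, num_labels):
        self.W = rng.standard_normal((num_labels, in_dim)) * 0.01
        self.b = np.zeros(num_labels)

    def __call__(self, features):
        # One independent sigmoid probability per AU (multi-label setup).
        return sigmoid(self.W @ features + self.b)

def detect_aus(image, head, threshold=0.5):
    # Fuse all three branch representations, then classify.
    fused = np.concatenate([global_branch(image),
                            local_branch(image),
                            branch_3d(image)])
    probs = head(fused)
    return probs, probs > threshold

head = MultiLabelHead(D_GLOBAL + D_LOCAL + D_3D, NUM_AUS)
probs, active = detect_aus(None, head)
```

In the actual method, the global and local branches would first be pre-trained jointly on an expression dataset to form the identity-independent embedding, then fine-tuned on the AU dataset; the sketch only shows the inference-time fusion.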
