Paper Title

Cross-Domain Facial Expression Recognition: A Unified Evaluation Benchmark and Adversarial Graph Learning

Authors

Tianshui Chen, Tao Pu, Hefeng Wu, Yuan Xie, Lingbo Liu, Liang Lin

Abstract

To address the problem of data inconsistencies among different facial expression recognition (FER) datasets, many cross-domain FER methods (CD-FERs) have been extensively devised in recent years. Although each declares to achieve superior performance, fair comparisons are lacking due to the inconsistent choices of the source/target datasets and feature extractors. In this work, we first analyze the performance effect caused by these inconsistent choices, and then re-implement some well-performing CD-FER and recently published domain adaptation algorithms. We ensure that all these algorithms adopt the same source datasets and feature extractors for fair CD-FER evaluations. We find that most of the current leading algorithms use adversarial learning to learn holistic domain-invariant features to mitigate domain shifts. However, these algorithms ignore local features, which are more transferable across different datasets and carry more detailed content for fine-grained adaptation. To address these issues, we integrate graph representation propagation with adversarial learning for cross-domain holistic-local feature co-adaptation by developing a novel adversarial graph representation adaptation (AGRA) framework. Specifically, it first builds two graphs to correlate holistic and local regions within each domain and across different domains, respectively. Then, it extracts holistic-local features from the input image and uses learnable per-class statistical distributions to initialize the corresponding graph nodes. Finally, two stacked graph convolution networks (GCNs) are adopted to propagate holistic-local features within each domain to explore their interaction and across different domains for holistic-local feature co-adaptation. We conduct extensive and fair evaluations on several popular benchmarks and show that the proposed AGRA framework outperforms previous state-of-the-art methods.
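The two-stage propagation described above (intra-domain message passing over holistic and local nodes, followed by cross-domain propagation for co-adaptation) can be sketched with a minimal graph-convolution step. This is an illustrative simplification, not the paper's implementation: the node counts, fully connected adjacencies, random stand-in features, and weight shapes are all assumptions, and AGRA's learnable per-class statistical distributions are replaced here by random initial node features.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One graph-convolution step: row-normalize the adjacency,
    propagate node features, apply ReLU."""
    deg = adj.sum(axis=1, keepdims=True)
    adj_norm = adj / np.maximum(deg, 1e-8)
    return np.maximum(adj_norm @ feats @ weight, 0.0)

rng = np.random.default_rng(0)

# Hypothetical setup: 1 holistic node + 4 local-region nodes per domain.
n_nodes, dim = 5, 8
intra_adj = np.ones((n_nodes, n_nodes))  # fully connected within a domain

# In AGRA these would be initialized from learnable per-class statistical
# distributions; random features serve as stand-ins here.
source_feats = rng.standard_normal((n_nodes, dim))
target_feats = rng.standard_normal((n_nodes, dim))

w1 = rng.standard_normal((dim, dim)) * 0.1
w2 = rng.standard_normal((dim, dim)) * 0.1

# Stage 1: propagate holistic-local features within each domain.
source_feats = gcn_layer(intra_adj, source_feats, w1)
target_feats = gcn_layer(intra_adj, target_feats, w1)

# Stage 2: propagate across domains over a joint graph for co-adaptation.
joint_feats = np.concatenate([source_feats, target_feats], axis=0)
inter_adj = np.ones((2 * n_nodes, 2 * n_nodes))
adapted = gcn_layer(inter_adj, joint_feats, w2)
print(adapted.shape)  # (10, 8)
```

In the full framework, the adapted features would then feed an adversarial domain discriminator so that the propagated holistic-local representations become domain-invariant; that loss is omitted from this sketch.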
