TransPPG: Two-stream Transformer for Remote Heart Rate Estimate

Paper title

TransPPG: Two-stream Transformer for Remote Heart Rate Estimate

Authors

Jiaqi Kang, Su Yang, Weishan Zhang

Abstract

Non-contact heart rate estimation from facial video using remote photoplethysmography (rPPG) has shown great potential in many applications (e.g., remote health care) and has achieved creditable results in constrained scenarios. However, practical applications require accurate results even in complex environments with head movement and unstable illumination. Therefore, improving the performance of rPPG in complex environments has become a key challenge. In this paper, we propose a novel video embedding method that embeds each facial video sequence into a feature map referred to as the Multi-scale Adaptive Spatial and Temporal Map with Overlap (MAST_Mop). The map contains not only vital information but also surrounding information as a reference, which acts as a mirror for identifying homogeneous perturbations, such as illumination instability, imposed on foreground and background simultaneously. Correspondingly, we propose a two-stream Transformer model to map the MAST_Mop to heart rate (HR): one stream follows the pulse signal in the facial area, while the other identifies the perturbation signal from the surrounding region, so that the difference between the two channels leads to adaptive noise cancellation. Our approach significantly outperforms all current state-of-the-art methods on two public datasets, MAHNOB-HCI and VIPL-HR. As far as we know, it is the first work to use a Transformer as the backbone to capture temporal dependencies in rPPG and to apply a two-stream scheme that identifies interference from the background as a mirror of the corresponding perturbation on foreground signals, for noise tolerance.
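The noise-cancellation idea in the abstract (a surrounding-region signal used as a reference for perturbations shared with the face region) can be illustrated at the signal level. The sketch below is not the authors' two-stream Transformer or the MAST_Mop embedding. It is a toy NumPy example on synthetic data in which a least-squares subtraction stands in for the learned streams. The function names `estimate_hr_bpm` and `cancel_shared_perturbation`, the 0.7 to 4.0 Hz search band, and every signal parameter are assumptions made for illustration only.

```python
import numpy as np


def estimate_hr_bpm(signal, fps, lo_hz=0.7, hi_hz=4.0):
    """Return the dominant frequency inside [lo_hz, hi_hz] in beats per minute."""
    signal = signal - signal.mean()
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    band = (freqs >= lo_hz) & (freqs <= hi_hz)
    return 60.0 * freqs[band][np.argmax(spectrum[band])]


def cancel_shared_perturbation(face, background):
    """Subtract the least-squares-scaled background trace from the face trace.

    Hypothetical stand-in for the learned two-stream difference: it assumes
    the perturbation appears in both traces up to a constant gain.
    """
    f = face - face.mean()
    b = background - background.mean()
    gain = np.dot(f, b) / np.dot(b, b)
    return f - gain * b


# Synthetic traces (all values are illustrative assumptions).
fps, seconds = 30, 20
t = np.arange(fps * seconds) / fps
pulse = 0.05 * np.sin(2 * np.pi * 1.2 * t)    # weak pulse at 1.2 Hz (72 bpm)
flicker = 0.5 * np.sin(2 * np.pi * 2.0 * t)   # strong illumination change at 2.0 Hz

face = pulse + flicker          # face region: pulse plus shared perturbation
background = 0.8 * flicker      # surrounding region: perturbation only

raw_bpm = estimate_hr_bpm(face, fps)
clean_bpm = estimate_hr_bpm(cancel_shared_perturbation(face, background), fps)
print(f"raw estimate: {raw_bpm:.1f} bpm, after cancellation: {clean_bpm:.1f} bpm")
```

In this synthetic setup the raw face trace is dominated by the 2.0 Hz flicker, so the spectral peak lands at 120 bpm. Once the scaled background trace is subtracted, the peak moves to the 72 bpm pulse. Real video would need face tracking, region averaging, and a model that copes with perturbations that are not a simple scaled copy, which is where a learned approach such as the one described in the abstract comes in.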
