Paper Title

Masked Supervised Learning for Semantic Segmentation

Paper Authors

Hasib Zunair, A. Ben Hamza

Paper Abstract

Self-attention is of vital importance in semantic segmentation as it enables modeling of long-range context, which translates into improved performance. We argue that it is equally important to model short-range context, especially to tackle cases where not only the regions of interest are small and ambiguous, but also when there exists an imbalance between the semantic classes. To this end, we propose Masked Supervised Learning (MaskSup), an effective single-stage learning paradigm that models both short- and long-range context, capturing the contextual relationships between pixels via random masking. Experimental results demonstrate the competitive performance of MaskSup against strong baselines in both binary and multi-class segmentation tasks on three standard benchmark datasets, particularly at handling ambiguous regions and retaining better segmentation of minority classes with no added inference cost. In addition to segmenting target regions even when large portions of the input are masked, MaskSup is also generic and can be easily integrated into a variety of semantic segmentation methods. We also show that the proposed method is computationally efficient, yielding an improved performance by 10\% on the mean intersection-over-union (mIoU) while requiring $3\times$ fewer learnable parameters.
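To make the core idea concrete, below is a minimal PyTorch sketch of training with random masking as the abstract describes: the model is supervised on both the original input and a randomly masked copy against the same ground truth. The function names (`random_patch_mask`, `masksup_step`), the patch granularity, the masking ratio, and the loss weighting `lam` are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def random_patch_mask(x, patch=16, ratio=0.5):
    """Zero out a random subset of patches in a batch of images.

    x: (B, C, H, W) input tensor; patch: mask granularity in pixels;
    ratio: fraction of patches to mask (assumed hyperparameter).
    """
    B, _, H, W = x.shape
    gh, gw = H // patch, W // patch
    # Per-patch keep/drop decisions, upsampled to full resolution.
    keep = (torch.rand(B, 1, gh, gw, device=x.device) > ratio).float()
    mask = F.interpolate(keep, size=(H, W), mode="nearest")
    return x * mask

def masksup_step(model, x, y, criterion, lam=0.5):
    """One training step: supervise predictions on both the original
    and the randomly masked input with the same segmentation labels.
    `lam` weighting the masked-branch loss is a hypothetical choice."""
    loss_full = criterion(model(x), y)
    loss_masked = criterion(model(random_patch_mask(x)), y)
    return loss_full + lam * loss_masked
```

Since the masking branch is used only during training, inference runs the unmodified model on the unmasked input, which is consistent with the abstract's claim of no added inference cost.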
