Paper Title
Towards Discriminative and Transferable One-Stage Few-Shot Object Detectors
Paper Authors
Paper Abstract
Recent object detection models require large amounts of annotated data to train on new classes of objects. Few-shot object detection (FSOD) aims to address this problem by learning novel classes from only a few samples. While two-stage FSOD detectors have achieved competitive results, one-stage FSODs typically underperform them. We observe that the large performance gap between two-stage and one-stage FSODs is mainly due to the weak discriminability of one-stage detectors, which is explained by a small post-fusion receptive field and a small number of foreground samples in the loss function. To address these limitations, we propose the Few-shot RetinaNet (FSRN), which consists of: a multi-way support training strategy to augment the number of foreground samples for dense meta-detectors; an early multi-level feature fusion providing a wide receptive field that covers the whole anchor area; and two augmentation techniques on query and source images to enhance transferability. Extensive experiments show that the proposed approach addresses these limitations and boosts both discriminability and transferability. FSRN is almost two times faster than two-stage FSODs while remaining competitive in accuracy, and it outperforms state-of-the-art one-stage meta-detectors, as well as some two-stage FSODs, on the MS-COCO and PASCAL VOC benchmarks.
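The claim about foreground samples can be illustrated with a toy count. The sketch below is not the authors' implementation; it assumes a hypothetical query image whose annotated objects are reduced to class labels, and a hypothetical helper `foreground_count` that counts how many of those objects belong to the support classes supplied during one training step. With a single support class, only objects of that class can produce foreground samples; with several support classes, more of the annotated objects contribute.

```python
def foreground_count(gt_labels, support_classes):
    """Count annotated objects whose class is among the support classes.

    Illustrative assumption: an object can only yield foreground samples
    in the loss if its class is presented as a support class.
    """
    wanted = set(support_classes)
    return sum(1 for label in gt_labels if label in wanted)


# Hypothetical query image annotated with five objects from three classes.
gt_labels = ["dog", "dog", "cat", "person", "person"]

# One-way support: a single class is presented per training step.
one_way = foreground_count(gt_labels, ["dog"])

# Multi-way support: several classes are presented in the same step.
multi_way = foreground_count(gt_labels, ["dog", "cat", "person"])

print(one_way, multi_way)
```

In this toy setting the multi-way step sees every annotated object as potential foreground, whereas the one-way step ignores the objects of the other classes. How FSRN actually assigns anchors to support classes is not specified in the abstract.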