FocalClick: Towards Practical Interactive Image Segmentation

Paper Title

FocalClick: Towards Practical Interactive Image Segmentation

Authors

Chen, Xi; Zhao, Zhiyan; Zhang, Yilei; Duan, Manni; Qi, Donglian; Zhao, Hengshuang

Abstract

Interactive segmentation allows users to extract target masks by making positive/negative clicks. Although explored by many previous works, there is still a gap between academic approaches and industrial needs: first, existing models are not efficient enough to work on low-power devices; second, they perform poorly when used to refine preexisting masks, as they cannot avoid destroying the correct parts. FocalClick solves both issues at once by predicting and updating the mask in localized areas. For higher efficiency, we decompose the slow prediction on the entire image into two fast inferences on small crops: a coarse segmentation on the Target Crop, and a local refinement on the Focus Crop. To make the model work with preexisting masks, we formulate a sub-task termed Interactive Mask Correction, and propose Progressive Merge as the solution. Progressive Merge exploits morphological information to decide where to preserve and where to update, enabling users to refine any preexisting mask effectively. FocalClick achieves competitive results against SOTA methods with significantly fewer FLOPs. It also shows significant superiority when making corrections on preexisting masks. Code and data will be released at github.com/XavierCHEN34/ClickSEG.
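
The idea of deciding "where to preserve and where to update" can be illustrated with a small sketch. The abstract does not spell out the merge rule, so the version below is an assumption made for illustration, not the authors' implementation: it treats the pixels where the new prediction disagrees with the preexisting mask as candidates, and updates only the connected group of candidates that contains the latest click. The function name `progressive_merge`, the list-of-lists mask format, and the choice of 4-connectivity are all hypothetical.

```python
from collections import deque


def progressive_merge(prev_mask, new_mask, click):
    """Illustrative merge rule (an assumption, not the paper's code).

    prev_mask, new_mask: 2D lists of 0/1 with the same shape.
    click: (row, col) of the latest user click.
    Returns a copy of prev_mask in which only the changed region
    connected to the click takes its values from new_mask.
    """
    height, width = len(prev_mask), len(prev_mask[0])
    row, col = click
    merged = [r[:] for r in prev_mask]

    # If both masks already agree at the click, keep the old mask untouched.
    if prev_mask[row][col] == new_mask[row][col]:
        return merged

    # Flood fill (4-connectivity) over pixels where the two masks disagree,
    # starting from the click position.
    queue = deque([(row, col)])
    seen = {(row, col)}
    while queue:
        r, c = queue.popleft()
        merged[r][c] = new_mask[r][c]
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < height and 0 <= nc < width
            if inside and (nr, nc) not in seen and prev_mask[nr][nc] != new_mask[nr][nc]:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return merged


# Toy example: the new prediction adds a column next to the click at (0, 2)
# and also flips an unrelated pixel at (0, 4).
prev = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
new = [
    [1, 1, 1, 0, 1],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
merged = progressive_merge(prev, new, click=(0, 2))
for line in merged:
    print(line)
```

In this toy case the column next to the click is accepted, while the isolated change at (0, 4) is discarded, so the parts of the preexisting mask far from the click are preserved. A real system would operate on image-sized arrays and on the network's binarized output.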
