Refining Semantic Segmentation with Superpixel by Transparent Initialization and Sparse Encoder

Paper Title

Refining Semantic Segmentation with Superpixel by Transparent Initialization and Sparse Encoder

Authors

Zhiwei Xu, Thalaiyasingam Ajanthan, Richard Hartley

Abstract

Although deep learning greatly improves the performance of semantic segmentation, its success mainly lies in object central areas without accurate edges. As superpixels are a popular and effective auxiliary to preserve object edges, in this paper, we jointly learn semantic segmentation with trainable superpixels. We achieve it with fully-connected layers with Transparent Initialization (TI) and efficient logit consistency using a sparse encoder. The proposed TI preserves the effects of learned parameters of pretrained networks. This avoids a significant increase of the loss of pretrained networks, which otherwise may be caused by inappropriate parameter initialization of the additional layers. Meanwhile, consistent pixel labels in each superpixel are guaranteed by logit consistency. The sparse encoder with sparse matrix operations substantially reduces both the memory requirement and the computational complexity. We demonstrated the superiority of TI over other parameter initialization methods and tested its numerical stability. The effectiveness of our proposal was validated on PASCAL VOC 2012, ADE20K, and PASCAL Context showing enhanced semantic segmentation edges. With quantitative evaluations on segmentation edges using performance ratio and F-measure, our method outperforms the state-of-the-art.
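
The logit consistency step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes that "consistency" means replacing each pixel's logits with the mean logits of its superpixel, and that the sparse encoding amounts to storing only the one nonzero entry per pixel of the pixel-to-superpixel assignment matrix. The function names `superpixel_logit_consistency` and `argmax` are hypothetical, and plain Python lists stand in for tensors.

```python
from collections import defaultdict


def superpixel_logit_consistency(logits, superpixel_ids):
    """Average per-pixel logits inside each superpixel and broadcast them back.

    logits: list of N rows, each a list of C class scores (pixels flattened).
    superpixel_ids: list of N ints; entry i is the superpixel of pixel i.
    Assumption: consistency is enforced by mean pooling within a superpixel.
    """
    if len(logits) != len(superpixel_ids):
        raise ValueError("logits and superpixel_ids must have the same length")
    num_classes = len(logits[0]) if logits else 0

    # Sparse encoding: a dense N x K one-hot assignment matrix has exactly one
    # nonzero per row, so only the N (pixel, superpixel) pairs are visited.
    sums = defaultdict(lambda: [0.0] * num_classes)
    counts = defaultdict(int)
    for pixel, sp in enumerate(superpixel_ids):
        counts[sp] += 1
        row = sums[sp]
        for c in range(num_classes):
            row[c] += logits[pixel][c]

    # Mean logits per superpixel, then one copy per member pixel.
    means = {sp: [v / counts[sp] for v in total] for sp, total in sums.items()}
    return [list(means[sp]) for sp in superpixel_ids]


def argmax(row):
    """Index of the largest score in a row of logits."""
    return max(range(len(row)), key=row.__getitem__)


if __name__ == "__main__":
    # Four pixels, two classes, two superpixels (toy values).
    logits = [[2.0, 0.0], [0.0, 1.0], [0.0, 3.0], [1.0, 2.0]]
    superpixel_ids = [0, 0, 1, 1]
    smoothed = superpixel_logit_consistency(logits, superpixel_ids)
    print([argmax(r) for r in logits])    # labels before pooling
    print([argmax(r) for r in smoothed])  # labels after pooling
```

In this toy input the two pixels of superpixel 0 disagree before pooling and share one label afterwards, which is the property the abstract attributes to logit consistency. The memory saving comes from never materializing the dense assignment matrix.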
