Paper Title
SD-Conv: Towards the Parameter-Efficiency of Dynamic Convolution
Paper Authors
Paper Abstract
Dynamic convolution achieves better performance for efficient CNNs at a negligible increase in FLOPs. However, the performance gains do not match the significantly expanded number of parameters, which is the main bottleneck in real-world applications. In contrast, mask-based unstructured pruning obtains lightweight networks by removing redundancy from heavy networks. In this paper, we propose a new framework, \textbf{Sparse Dynamic Convolution} (\textsc{SD-Conv}), which naturally integrates these two paths so that it inherits the advantages of both the dynamic mechanism and sparsity. We first design a binary mask derived from a learnable threshold to prune static kernels, significantly reducing the parameters and computational cost while achieving higher performance on ImageNet-1K. We further transfer the pretrained models to a variety of downstream tasks, showing consistently better results than the baselines. We hope our SD-Conv can serve as an efficient alternative to conventional dynamic convolution.
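To make the two ingredients concrete, here is a minimal NumPy sketch of a forward pass combining them: K static kernels are first sparsified by a binary mask from a magnitude threshold, then aggregated with input-dependent attention weights, as in standard dynamic convolution. The function names, the pool-then-linear attention branch, and the exact pruning rule are illustrative assumptions, not the paper's precise method.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def sd_conv_forward(x, kernels, attn_w, threshold):
    """Hypothetical sparse dynamic convolution forward pass.

    x         : input feature map, shape (C_in, H, W)
    kernels   : K static kernels, shape (K, C_out, C_in, k, k)
    attn_w    : attention-branch weights, shape (K, C_in)  (assumed design)
    threshold : scalar; weights with |w| <= threshold are pruned
    """
    K, C_out, C_in, k, _ = kernels.shape

    # 1. Binary mask from the (learnable) threshold: magnitude pruning.
    mask = (np.abs(kernels) > threshold).astype(kernels.dtype)
    sparse_kernels = kernels * mask

    # 2. Input-dependent attention: global average pool -> linear -> softmax.
    pooled = x.mean(axis=(1, 2))          # (C_in,)
    pi = softmax(attn_w @ pooled)         # (K,) mixing weights

    # 3. Aggregate the K sparse kernels into a single kernel.
    W = np.tensordot(pi, sparse_kernels, axes=1)  # (C_out, C_in, k, k)

    # 4. Plain "valid" convolution (stride 1, no padding), for clarity only.
    H, Wd = x.shape[1], x.shape[2]
    out = np.zeros((C_out, H - k + 1, Wd - k + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = x[:, i:i + k, j:j + k]
            out[:, i, j] = (W * patch).sum(axis=(1, 2, 3))

    # Return the output and the achieved kernel density (1 - sparsity).
    return out, mask.mean()
```

Because the mask is applied before aggregation, the stored parameters are sparse, while the attention branch keeps the per-input adaptivity of dynamic convolution; in training, the hard threshold would need a straight-through-style gradient, which this inference-only sketch omits.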