Paper Title

NPAS: A Compiler-aware Framework of Unified Network Pruning and Architecture Search for Beyond Real-Time Mobile Acceleration

Authors

Zhengang Li, Geng Yuan, Wei Niu, Pu Zhao, Yanyu Li, Yuxuan Cai, Xuan Shen, Zheng Zhan, Zhenglun Kong, Qing Jin, Zhiyu Chen, Sijia Liu, Kaiyuan Yang, Bin Ren, Yanzhi Wang, Xue Lin

Abstract

With the increasing demand to efficiently deploy DNNs on mobile edge devices, it becomes increasingly important to reduce unnecessary computation and increase execution speed. Prior methods towards this goal, including model compression and network architecture search (NAS), are largely performed independently and do not fully consider compiler-level optimizations, which are essential for mobile acceleration. In this work, we first propose (i) a general category of fine-grained structured pruning applicable to various DNN layers, and (ii) a comprehensive compiler framework for automatic code generation supporting different DNNs and different pruning schemes, which together bridge the gap between model compression and NAS. We further propose NPAS, a compiler-aware framework of unified network pruning and architecture search. To deal with the large search space, we propose a meta-modeling procedure based on reinforcement learning with fast evaluation and Bayesian optimization, ensuring that the total number of training epochs is comparable with that of representative NAS frameworks. Our framework achieves 6.7ms, 5.9ms, and 3.9ms ImageNet inference times with 78.2%, 75% (MobileNet-V3 level), and 71% (MobileNet-V2 level) Top-1 accuracy, respectively, on an off-the-shelf mobile phone, consistently outperforming prior work.
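Since this page carries only the abstract, the following is a minimal NumPy sketch of what one scheme in the general category of fine-grained structured pruning might look like: block-based pruning, where whole small blocks of a weight matrix are zeroed so the result keeps a regular structure that compiler code generation can exploit. The 4x2 block size and the magnitude-based criterion are illustrative assumptions, not the paper's exact configuration.

```python
# A hedged sketch of block-based structured pruning; block size and
# magnitude criterion are illustrative assumptions.
import numpy as np

def block_prune(weight: np.ndarray, block: tuple = (4, 2),
                sparsity: float = 0.5) -> np.ndarray:
    """Zero out the lowest-magnitude weight blocks, keeping a regular
    block structure rather than irregular element-wise sparsity."""
    rows, cols = weight.shape
    br, bc = block
    assert rows % br == 0 and cols % bc == 0, "shape must tile evenly"
    # View the matrix as a grid of (br x bc) blocks and score each
    # block by its Frobenius norm.
    blocks = weight.reshape(rows // br, br, cols // bc, bc)
    scores = np.linalg.norm(blocks, axis=(1, 3))      # one score per block
    # Keep the top-(1 - sparsity) fraction of blocks, zero the rest.
    k = int(scores.size * sparsity)
    threshold = np.partition(scores.ravel(), k)[k] if k > 0 else -np.inf
    mask = (scores >= threshold)[:, None, :, None]    # broadcast over blocks
    return (blocks * mask).reshape(rows, cols)

pruned = block_prune(np.random.randn(16, 8), block=(4, 2), sparsity=0.5)
print(f"zeroed fraction: {np.mean(pruned == 0):.2f}")
```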

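The abstract also describes a meta-modeling procedure combining reinforcement learning, fast evaluation, and Bayesian optimization. As a hedged illustration only, the sketch below shows how a Bayesian-optimization surrogate could guide a search over per-layer pruning rates: the `fast_evaluate` stub, the search space, and the upper-confidence-bound acquisition are hypothetical stand-ins, and the paper's reinforcement-learning component is omitted here.

```python
# A hedged sketch of surrogate-guided search over per-layer pruning
# rates; fast_evaluate, the bounds, and the acquisition rule are
# illustrative stand-ins, not the paper's actual procedure.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)
N_LAYERS = 4

def fast_evaluate(rates: np.ndarray) -> float:
    """Stand-in for fast candidate evaluation (e.g., accuracy after a
    few training epochs). Here: a toy analytic score."""
    return float(-np.sum((rates - 0.6) ** 2))  # toy peak near 60% pruning

# Warm-start the surrogate with a few random configurations.
X = rng.uniform(0.2, 0.9, size=(8, N_LAYERS))
y = np.array([fast_evaluate(x) for x in X])

gp = GaussianProcessRegressor(normalize_y=True)
for _ in range(10):
    gp.fit(X, y)
    # Propose random candidates and pick the one maximizing a simple
    # upper-confidence-bound acquisition (mean + std).
    cand = rng.uniform(0.2, 0.9, size=(256, N_LAYERS))
    mu, sigma = gp.predict(cand, return_std=True)
    best = cand[np.argmax(mu + sigma)]
    X = np.vstack([X, best])
    y = np.append(y, fast_evaluate(best))

print("best per-layer pruning rates:", X[np.argmax(y)].round(2))
```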