Paper Title

DNA: Differentiable Network-Accelerator Co-Search

Authors

Yongan Zhang, Yonggan Fu, Weiwen Jiang, Chaojian Li, Haoran You, Meng Li, Vikas Chandra, Yingyan Celine Lin

Abstract

Powerful yet complex deep neural networks (DNNs) have fueled a booming demand for efficient DNN solutions that bring DNN-powered intelligence into numerous applications. Jointly optimizing networks and their accelerators is a promising way to provide optimal performance. However, the great potential of such solutions has yet to be unleashed because of the challenge of simultaneously exploring the vast and entangled, yet different, design spaces of networks and their accelerators. To this end, we propose DNA, a Differentiable Network-Accelerator co-search framework that automatically searches for matched networks and accelerators to maximize both task accuracy and acceleration efficiency. Specifically, DNA integrates two enablers: (1) a generic design space for DNN accelerators that is applicable to both FPGA- and ASIC-based DNN accelerators and compatible with DNN frameworks such as PyTorch, enabling algorithmic exploration for more efficient DNNs and their accelerators; and (2) a joint network and accelerator co-search algorithm that simultaneously searches for optimal DNN structures and for their accelerators' micro-architectures and mapping methods to maximize both task accuracy and acceleration efficiency. Experiments and ablation studies based on FPGA measurements and ASIC synthesis show that the matched networks and accelerators generated by DNA consistently outperform state-of-the-art (SOTA) DNNs and DNN accelerators (e.g., 3.04x better FPS with 5.46% higher accuracy on ImageNet), while requiring notably less search time (up to 1234.3x) than SOTA co-exploration methods, when evaluated against ten SOTA baselines on three datasets. All code will be released upon acceptance.
