Paper Title
Seizure Detection and Prediction by Parallel Memristive Convolutional Neural Networks
Paper Authors
Paper Abstract
During the past two decades, epileptic seizure detection and prediction algorithms have evolved rapidly. However, despite significant performance improvements, their hardware implementation using conventional technologies, such as Complementary Metal-Oxide-Semiconductor (CMOS), remains challenging in power- and area-constrained settings, especially when many recording channels are used. In this paper, we propose a novel low-latency parallel Convolutional Neural Network (CNN) architecture that has 2x to 2,800x fewer network parameters than state-of-the-art (SOTA) CNN architectures and achieves a 5-fold cross-validation accuracy of 99.84% for epileptic seizure detection, and 99.01% and 97.54% for epileptic seizure prediction, when evaluated on the University of Bonn Electroencephalogram (EEG), CHB-MIT, and SWEC-ETHZ seizure datasets, respectively. We subsequently implement our network on analog crossbar arrays comprising Resistive Random-Access Memory (RRAM) devices, and provide a comprehensive benchmark by simulating, laying out, and determining the hardware requirements of the CNN component of our system. To the best of our knowledge, we are the first to parallelize the execution of convolution-layer kernels on separate analog crossbars, enabling a two-order-of-magnitude reduction in latency compared to SOTA hybrid memristive-CMOS Deep Learning (DL) accelerators. Furthermore, we investigate the effects of device non-idealities on our system and apply Quantization-Aware Training (QAT) to mitigate the performance degradation caused by low ADC/DAC resolution. Finally, we propose a stuck-weight offsetting methodology that mitigates the performance degradation caused by memristor weights stuck at R$_{ON}$ or R$_{OFF}$, recovering up to 32% accuracy without retraining. The CNN component of our platform is estimated to consume approximately 2.791 W of power while occupying an area of 31.255 mm$^2$ in a 22 nm FDSOI CMOS process.
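The abstract mentions Quantization-Aware Training to cope with low ADC/DAC resolution. Below is a minimal, illustrative PyTorch sketch of generic QAT using fake quantization with a straight-through estimator; the toy 1-D convolutional network, the 4-bit width, and all helper names are assumptions for illustration, not the paper's actual model or training setup.

```python
# Minimal QAT sketch (assumptions, not the paper's implementation):
# weights are fake-quantized in the forward pass to emulate low-resolution
# conversion, while gradients flow through via the straight-through estimator.
import torch
import torch.nn as nn
import torch.nn.functional as F


def fake_quantize(x: torch.Tensor, n_bits: int) -> torch.Tensor:
    """Uniformly quantize x to 2**n_bits levels; identity gradient (STE)."""
    qmax = 2 ** n_bits - 1
    lo = x.min()
    scale = (x.max() - lo).clamp(min=1e-8) / qmax
    q = torch.round((x - lo) / scale) * scale + lo
    return x + (q - x).detach()  # quantized forward value, unquantized grad


class QatConv1d(nn.Conv1d):
    """Conv1d whose weights are fake-quantized on every forward pass."""

    def __init__(self, *args, n_bits: int = 4, **kwargs):
        super().__init__(*args, **kwargs)
        self.n_bits = n_bits

    def forward(self, x):
        w_q = fake_quantize(self.weight, self.n_bits)
        return F.conv1d(x, w_q, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)


# Toy single-channel EEG-like input: 8 windows of 512 samples each.
model = nn.Sequential(
    QatConv1d(1, 4, kernel_size=16, n_bits=4),
    nn.ReLU(),
    nn.AdaptiveAvgPool1d(1),
    nn.Flatten(),
    nn.Linear(4, 2),  # seizure / non-seizure
)
x = torch.randn(8, 1, 512)
labels = torch.randint(0, 2, (8,))
loss = nn.CrossEntropyLoss()(model(x), labels)
loss.backward()  # gradients pass straight through the quantization step
```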
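The abstract also proposes offsetting stuck R$_{ON}$/R$_{OFF}$ weights without retraining, but does not spell out the procedure. The NumPy sketch below shows one plausible interpretation under an assumed differential mapping $w = g^{+} - g^{-}$: when one device of a pair is stuck, its healthy partner is re-programmed so the pair still approximates the target weight within the conductance range. The conductance bounds, function name, and mapping are hypothetical.

```python
# Hypothetical stuck-weight offsetting sketch under an assumed differential
# weight mapping w = g_pos - g_neg; not the paper's exact procedure.
import numpy as np

G_ON, G_OFF = 1.0, 0.01  # assumed normalized conductance bounds


def offset_stuck_pairs(w_target, g_pos, g_neg, stuck_pos, stuck_neg):
    """Re-program the healthy device of each differential pair so that
    g_pos - g_neg approximates w_target despite stuck devices.
    stuck_pos / stuck_neg are boolean masks; stuck cells keep their
    current conductance, healthy partners are adjusted and clipped."""
    g_pos, g_neg = g_pos.copy(), g_neg.copy()
    # Stuck positive device: solve g_neg = g_pos - w_target for the partner.
    fix = stuck_pos & ~stuck_neg
    g_neg[fix] = np.clip(g_pos[fix] - w_target[fix], G_OFF, G_ON)
    # Stuck negative device: solve g_pos = g_neg + w_target for the partner.
    fix = stuck_neg & ~stuck_pos
    g_pos[fix] = np.clip(g_neg[fix] + w_target[fix], G_OFF, G_ON)
    return g_pos, g_neg


# Toy example: a 3x3 weight tile with one device stuck at R_OFF.
rng = np.random.default_rng(0)
w = rng.uniform(-0.5, 0.5, (3, 3))
gp = np.clip(np.maximum(w, 0) + G_OFF, G_OFF, G_ON)
gn = np.clip(np.maximum(-w, 0) + G_OFF, G_OFF, G_ON)
stuck_p = np.zeros_like(w, bool)
stuck_p[0, 0] = True
gp[0, 0] = G_OFF  # the stuck conductance value
gp, gn = offset_stuck_pairs(w, gp, gn, stuck_p, np.zeros_like(w, bool))
print(np.abs((gp - gn) - w).max())  # residual weight error after offsetting
```

Recovery is partial by construction: if the required partner conductance falls outside [G_OFF, G_ON], the clipped value leaves a residual error, which is consistent with the abstract's "up to 32%" accuracy recovery rather than full restoration.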