Autoencoder-based Unsupervised Intrusion Detection using Multi-Scale Convolutional Recurrent Networks

Paper Title

Autoencoder-based Unsupervised Intrusion Detection using Multi-Scale Convolutional Recurrent Networks

Authors

Amardeep Singh, Julian Jang-Jaccard

Abstract

The massive growth of network traffic has produced very large datasets. Labeling these datasets to identify intrusion attacks is laborious and error-prone. Furthermore, network traffic data have complex, time-varying, non-linear relationships. Existing state-of-the-art intrusion detection solutions combine various supervised approaches with fused feature subsets based on correlations in the traffic data. These solutions often incur high computational cost, need manual support for fine-tuning the detection models, and depend on labeled data, all of which limit real-time processing of network traffic. Unsupervised solutions reduce the computational complexity and the manual effort of labeling, but current unsupervised solutions do not consider spatio-temporal correlations in traffic data. To address this, we propose a unified autoencoder that combines a multi-scale convolutional neural network with long short-term memory (MSCNN-LSTM-AE) for anomaly detection in network traffic. The model first employs a Multi-Scale Convolutional Neural Network Autoencoder (MSCNN-AE) to analyze the spatial features of the dataset; the latent-space features learned by the MSCNN-AE are then passed to a Long Short-Term Memory (LSTM) based autoencoder network to process the temporal features. Our model further employs two Isolation Forest algorithms as error-correction mechanisms that detect false positives and false negatives to improve detection accuracy. Additionally, covariance matrices form a Riemannian manifold that is naturally embedded with distance metrics, which facilitates discriminative patterns for detecting malicious network traffic. We evaluated our model on the NSL-KDD, UNSW-NB15, and CICDDoS2019 datasets and showed that the proposed method significantly outperforms conventional unsupervised methods and other existing studies on these datasets.
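To make the "multi-scale" idea in the abstract concrete, the following is a minimal, dependency-free sketch of one multi-scale 1-D convolution step: the same input is filtered in parallel with several kernel widths and the outputs are stacked per position. Everything specific here is an assumption for illustration, not the authors' implementation: the function names `conv1d_same` and `multi_scale_features`, the kernel sizes 1, 3 and 5, and the fixed averaging filters (the model in the paper would learn its filter weights, and the abstract does not state its kernel sizes).

```python
from typing import List, Sequence


def conv1d_same(x: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """1-D cross-correlation with zero padding so the output has len(x) values."""
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel length must be odd for symmetric 'same' padding")
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(padded[i + j] * kernel[j] for j in range(k))
        for i in range(len(x))
    ]


def multi_scale_features(
    x: Sequence[float], kernel_sizes: Sequence[int] = (1, 3, 5)
) -> List[List[float]]:
    """Apply one averaging filter per kernel size (hypothetical fixed weights)
    and stack the branch outputs so each position holds one value per scale."""
    branches = [conv1d_same(x, [1.0 / k] * k) for k in kernel_sizes]
    # Transpose from [scale][position] to [position][scale].
    return [list(values) for values in zip(*branches)]


# Hypothetical feature vector of one traffic record.
record = [0.0, 3.0, 6.0, 3.0, 0.0]
features = multi_scale_features(record)
for position, per_scale in enumerate(features):
    print(position, per_scale)
```

In a trained autoencoder the stacked multi-scale features would be compressed into a latent vector and decoded again, and records that reconstruct poorly would be treated as candidate anomalies; that part is not shown here.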
