Network Traffic Classification Based on LSTM and Feature Generation

WANG Shuai, DONG Yuning, LI Tao

doi:10.3969/j.issn.0255-8297.2022.05.005

Journal of Applied Sciences (应用科学学报), 2022, Vol. 40, Issue 5: 758-769


Communication Engineering


Funding: Supported by the National Natural Science Foundation of China (No. 61271233)


College of Communication and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, Jiangsu, China

Received date: 2020-11-24

Online published: 2022-09-30



Cite this article

WANG Shuai, DONG Yuning, LI Tao. Network Traffic Classification Based on LSTM and Feature Generation[J]. Journal of Applied Sciences, 2022, 40(5): 758-769. DOI: 10.3969/j.issn.0255-8297.2022.05.005

Abstract

This paper proposes a network traffic classification method that combines feature generation with a long short-term memory (LSTM) model. The method generates features by matrix multiplication, and the classification performance of different feature generation methods is analyzed and compared. Experiments compare the classification accuracy obtained from the original data with that obtained from the generated feature data, and compare a convolutional neural network (CNN) with the proposed method on network flow classification. A kernel function is applied when computing the statistical features so that they fit the LSTM input dimension and yield better classification results. Experimental results on real network flow data show that the proposed method achieves an accuracy of up to 93.9% in fine-grained classification and up to 99.2% in coarse-grained classification, clearly outperforming existing classification methods.

Key words: traffic classification; feature generation; long short-term memory (LSTM); fine-grained classification
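To make the two ideas in the abstract more concrete, the following is a minimal sketch of how a matrix product can generate new features from raw per-packet values, and how a kernel function can map a statistical summary to a fixed width that matches an LSTM input dimension. The abstract does not state which matrices, statistics, or kernel the authors use, so the raw features, the generation matrix `weights`, the column-mean statistics, the RBF kernel, and the `centers` are all assumptions chosen only for illustration.

```python
import math


def matmul(a, b):
    """Multiply an (n x m) matrix by an (m x k) matrix, both given as lists of rows."""
    if any(len(row) != len(b) for row in a):
        raise ValueError("inner dimensions do not match")
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns] for row in a]


def generate_features(flow, weights):
    """Turn a flow of shape (T x d) into generated features of shape (T x k)."""
    return matmul(flow, weights)


def column_means(flow):
    """Statistical summary of a flow: the mean of each raw feature column."""
    return [sum(col) / len(col) for col in zip(*flow)]


def rbf_kernel_features(stats, centers, gamma=1.0):
    """Map a statistics vector to len(centers) values using an RBF kernel."""
    return [
        math.exp(-gamma * sum((s - c) ** 2 for s, c in zip(stats, center)))
        for center in centers
    ]


# Hypothetical flow: 3 packets, 2 raw features each (values are made up).
flow = [[1.0, 0.0], [0.5, 1.0], [0.0, 2.0]]
# Hypothetical 2 x 3 generation matrix; k = 3 is the assumed LSTM input size.
weights = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
# Hypothetical kernel centers, one per output dimension.
centers = [[0.5, 1.0], [0.0, 0.0], [1.0, 2.0]]

generated = generate_features(flow, weights)
kernel_row = rbf_kernel_features(column_means(flow), centers)

print(len(generated), len(generated[0]))  # sequence length and feature width
print(len(kernel_row))                    # same width as the generated features
```

In this sketch both outputs have width 3, so either could be fed to an LSTM whose input size is 3. How the paper actually combines generated and statistical features is not described in the abstract.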

