Most encrypted traffic classification (ETC) models lose accuracy when labeled data is scarce. To address this problem, this paper proposes SSETC-CL, a semi-supervised encrypted traffic classification model based on contrastive learning. By comparing the similarities and differences between samples, SSETC-CL learns useful representations from large amounts of unlabeled data, which yields a general-purpose feature encoding network and reduces the dependence of downstream tasks on labeled data. SSETC-CL is evaluated on the public ISCXVPN2016 dataset and on two self-collected datasets. It outperforms the other baseline models on the evaluated tasks, improving accuracy by up to 8.92%. The experimental results show that SSETC-CL not only achieves high accuracy on traffic known to the pretrained model but can also transfer the knowledge gained during pretraining to unknown traffic.
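The idea of learning from unlabeled data by comparing samples can be illustrated with a generic contrastive objective. The sketch below is not the SSETC-CL loss, which the abstract does not specify; it assumes a SimCLR-style NT-Xent loss over paired embeddings, and the function name `nt_xent_loss`, the `temperature` value, and the array shapes are all illustrative choices.

```python
import numpy as np


def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss for paired embeddings (assumed SimCLR-style objective).

    z1[i] and z2[i] are embeddings of two augmented views of the same
    unlabeled sample; every other row in the batch acts as a negative.
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0).astype(np.float64)
    # L2-normalize so that dot products are cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = (z @ z.T) / temperature
    # A sample must not be compared with itself.
    np.fill_diagonal(sim, -np.inf)
    # Row i (first view) is positive with row i + n (second view), and vice versa.
    positives = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Numerically stable row-wise log-softmax.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(2 * n), positives].mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    views_a = rng.normal(size=(8, 16))
    # Second view: a slightly perturbed copy of the first (stand-in for augmentation).
    views_b = views_a + 0.01 * rng.normal(size=(8, 16))
    print("aligned pairs:   ", nt_xent_loss(views_a, views_b))
    print("mismatched pairs:", nt_xent_loss(views_a, views_b[::-1]))
```

Minimizing a loss of this kind pulls the two views of the same sample together and pushes different samples apart, so an encoder can be pretrained without labels. In this sketch, correctly paired views should give a lower loss than mismatched ones.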