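Video Anomaly Detection Method Based on Secondary Prediction of Multi-layer Memory Enhancement Generative Adversarial Network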

doi:10.3969/j.issn.0255-8297.2023.01.007

Abstract

To improve the accuracy of video anomaly detection, we propose a method based on secondary prediction with a multi-layer memory-enhanced generative adversarial network. First, spatiotemporal cubes are extracted by object detection and fed into an autoencoder to obtain a predicted frame. Second, the appearance features of the predicted frame are fused with the optical flow features of the corresponding real frames to form fused features. Finally, a multi-layer memory-enhanced generative adversarial network predicts the future frame a second time, so that the normal patterns of features at different levels are learned and contextual semantic information is captured. Experimental results on the UCSD Ped2 and CUHK Avenue datasets show that the proposed method improves detection performance over the compared video anomaly detection methods, reaching frame-level AUCs of 99.57% and 91.59%, respectively.

Key words: video anomaly detection, multi-layer memory enhancement, generative adversarial network, future frame prediction, deep learning
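
The frame-level AUC reported above is the area under the ROC curve computed from one anomaly score per frame and a binary ground-truth label per frame. The following is a minimal sketch of that metric only, using the rank (Mann-Whitney U) formulation. It assumes that higher scores mean "more anomalous" and that label 1 marks an anomalous frame. The function name `frame_level_auc` and the example scores are hypothetical, and the abstract does not state how the method derives its per-frame scores.

```python
def frame_level_auc(scores, labels):
    """ROC AUC over frames via the rank (Mann-Whitney U) formulation.

    scores: per-frame anomaly scores (assumed: higher = more anomalous).
    labels: per-frame ground truth, 1 = anomalous, 0 = normal.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one normal and one anomalous frame")

    # Sort frame indices by score, then assign 1-based ranks,
    # giving tied scores the average of the ranks they span.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U statistic of the anomalous frames, normalized to [0, 1].
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


# Hypothetical scores for six frames; the last three are anomalous.
example_scores = [0.10, 0.25, 0.40, 0.35, 0.80, 0.90]
example_labels = [0, 0, 0, 1, 1, 1]
auc = frame_level_auc(example_scores, example_labels)
```

An AUC of 1.0 means every anomalous frame scores above every normal frame, while 0.5 corresponds to chance-level ranking.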

Chinese Library Classification (CLC) number: TP391

ZENG Jing, LI Ying, QI Xiaosha, JI Genlin. Video Anomaly Detection Method Based on Secondary Prediction of Multi-layer Memory Enhancement Generative Adversarial Network[J]. Journal of Applied Sciences, 2023, 41(1): 80-94.
