Video Anomaly Detection Method Based on Multi-scale Feature Fusion and Attention Mechanism

WU Xiang, XIAO Jian, JI Genlin

doi:10.3969/j.issn.0255-8297.2025.02.004

Journal of Applied Sciences, 2025, Vol. 43, Issue 2: 234-244

Communication Engineering

School of Computer and Electronic Information/School of Artificial Intelligence, Nanjing Normal University, Nanjing 210023, Jiangsu, China

Received date: 2023-07-19

Online published: 2025-04-03

Abstract

Moving objects in video frames often appear at different scales over time, which poses a challenge for video anomaly detection. Although traditional generative adversarial networks (GANs) have achieved some success in video anomaly detection, their performance is limited by single-scale feature extraction, which fails to capture the features of objects at different scales. To address this issue, this paper proposes a GAN-based video anomaly detection method that incorporates multi-scale feature fusion and an attention mechanism. Specifically, convolutional kernels of different sizes are used to capture features with different receptive fields, and these features are then fused into a multi-scale feature representation. In addition, a coordinate attention mechanism is introduced after the transposed convolutional layers of the generator, so that feature map weights are allocated adaptively and the model's perception of crucial features is enhanced. Experimental results on the public datasets UCSD Ped2 and Avenue demonstrate that the proposed method outperforms existing approaches.
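
The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' network: the fixed mean filters stand in for learned convolutional kernels, the kernel sizes (1, 3, 5) are assumed for illustration, and the gating step is a simplified, parameter-free stand-in for coordinate attention with its learned 1x1 convolutions omitted. All function names are hypothetical.

```python
import math

def conv2d_same(image, k):
    """Mean filter of odd size k with zero padding (output keeps input size).
    Assumption: a fixed mean kernel replaces a learned convolution."""
    h, w = len(image), len(image[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        total += image[y][x]
            out[i][j] = total / (k * k)
    return out

def multi_scale_features(image, kernel_sizes=(1, 3, 5)):
    """One feature map per kernel size (one receptive field each),
    fused by stacking the maps as channels."""
    return [conv2d_same(image, k) for k in kernel_sizes]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def coordinate_gate(features):
    """Simplified coordinate-style gating: pool each channel along the
    width and along the height, squash with a sigmoid, and reweight
    every position by its row gate times its column gate."""
    gated = []
    for fmap in features:
        h, w = len(fmap), len(fmap[0])
        row_gate = [sigmoid(sum(row) / w) for row in fmap]
        col_gate = [sigmoid(sum(fmap[i][j] for i in range(h)) / h)
                    for j in range(w)]
        gated.append([[fmap[i][j] * row_gate[i] * col_gate[j]
                       for j in range(w)] for i in range(h)])
    return gated

# Toy 5x5 "frame" with a single bright pixel in the centre.
frame = [[0.0] * 5 for _ in range(5)]
frame[2][2] = 1.0

fused = multi_scale_features(frame)   # 3 channels, each 5x5
attended = coordinate_gate(fused)     # same shape, reweighted
```

In this toy example the larger kernels spread the single bright pixel over a wider neighbourhood, which is the sense in which each channel sees a different receptive field. The gating then scales each position by factors derived from its row and its column, so positional information along both axes is retained.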

Keywords: video anomaly detection; deep learning; generative adversarial networks; multi-scale feature fusion; attention mechanism

Cite this article

WU Xiang, XIAO Jian, JI Genlin. Video Anomaly Detection Method Based on Multi-scale Feature Fusion and Attention Mechanism[J]. Journal of Applied Sciences, 2025, 43(2): 234-244. DOI: 10.3969/j.issn.0255-8297.2025.02.004

