Digital Media Forensics and Security

Segmented Backdoor Defense Based on Local Gradient and Global Gradient Ascent

XIAO Xiaotong, DING Jianwei, ZHANG Qi
College of Information and Cyber Security, People's Public Security University of China, Beijing 100038, China

Received date: 2022-10-28

Online published: 2023-03-29

Abstract

Backdoor triggers tend to be hidden and are difficult to detect. To address this problem, a segmented backdoor defense (SBD) method based on local and global gradient ascent is proposed. In the early stage of training, local gradient ascent is introduced to widen the gap between the average training loss of backdoor samples and that of clean samples, so that a small set of backdoor samples can be isolated with high precision for the later stage. In the subsequent backdoor-forgetting stage, global gradient ascent is introduced to weaken the correlation between backdoor samples and their target classes, thereby achieving the defense. Extensive experiments with the WideResNet-16-1 model on three benchmark datasets, GTSRB, CIFAR-10, and MNIST, against six state-of-the-art backdoor attacks show that the proposed method reduces the success rate of most attacks to below 5%. Moreover, on both backdoored and clean datasets, the proposed method trains models whose performance is equivalent to that of training on clean data.
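
The abstract describes a two-stage procedure: a loss-gap isolation stage driven by local gradient ascent, followed by a forgetting stage driven by global gradient ascent. The PyTorch-style sketch below is one plausible realization of that idea; the loss-flipping form of the local-ascent objective, the sign-negated unlearning loss, and all hyperparameters (the threshold gamma, the 1% isolation ratio) are illustrative assumptions inferred from the abstract, not the paper's exact formulation.

import torch
import torch.nn.functional as F

def lga_loss(logits, targets, gamma=0.5):
    # Local gradient ascent: flip the gradient sign for samples whose
    # cross-entropy loss has already fallen below gamma. Backdoor samples
    # are fitted fastest, so their loss is pinned near gamma while clean
    # losses keep shrinking, widening the gap used for isolation.
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    return (torch.sign(per_sample - gamma) * per_sample).mean()

def isolate_suspects(model, loader, ratio=0.01, device="cpu"):
    # Rank all training samples by loss and flag the lowest `ratio`
    # fraction as suspected backdoor samples. The loader must iterate
    # the dataset in a fixed order (shuffle=False) so the returned
    # positions map back to dataset indices.
    model.eval()
    losses = []
    with torch.no_grad():
        for x, y in loader:
            out = model(x.to(device))
            losses.append(F.cross_entropy(out, y.to(device), reduction="none").cpu())
    losses = torch.cat(losses)
    k = max(1, int(ratio * len(losses)))
    return losses.argsort()[:k]

def unlearning_step(model, optimizer, clean_batch, suspect_batch, device="cpu"):
    # Global gradient ascent: descend on clean samples while ascending on
    # the isolated suspects, breaking the trigger-target association.
    (xc, yc), (xb, yb) = clean_batch, suspect_batch
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(xc.to(device)), yc.to(device)) \
         - F.cross_entropy(model(xb.to(device)), yb.to(device))
    loss.backward()
    optimizer.step()

Under this split, the suspect set produced after the local-ascent stage can stay small as long as it is precise; the negated loss term in the forgetting stage then suppresses the trigger-target mapping with little effect on clean accuracy.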

Cite this article

XIAO Xiaotong, DING Jianwei, ZHANG Qi. Segmented Backdoor Defense Based on Local Gradient and Global Gradient Ascent[J]. Journal of Applied Sciences, 2023, 41(2): 218-227. DOI: 10.3969/j.issn.0255-8297.2023.02.003
