Improved Method to Craft Universal Perturbations Based on Fast Feature Fool

韦健杰, 吕东辉, 陆小锋, 孙广玲

doi:10.3969/j.issn.0255-8297.2020.06.015

Journal of Applied Sciences (应用科学学报), 2020, Vol. 38, Issue 6: 986-994

Signal and Information Processing

School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China

Received: 2020-03-10

Published online: 2020-12-08

Funding

Supported by the National Natural Science Foundation of China (No. U1636206)

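Abstract (Chinese version)

In recent years, applications based on deep neural networks have become increasingly widespread. However, deep neural networks are vulnerable to adversarial attacks that add small, carefully designed perturbations to the input data, causing the network to produce wrong outputs and creating security risks for the deployment of intelligent systems. To make intelligent systems more resistant to such risks, it is necessary to study the perturbation generation methods that pose them. Fast feature fool (FFF) is an effective method for generating universal perturbations for visual tasks. The improved method takes into account the actual activation state of the input image in the network and uses the maximization of the feature difference between the original image and the adversarial example as the objective function for generating the perturbation. It also accounts for the different influence that different convolutional layers have on the generated perturbation by assigning different weights to the corresponding terms of the objective function. Experimental results show that the improved FFF method achieves a higher attack success rate and also has stronger cross-model attack capability.

Keywords: deep neural networks; universal perturbations; fast feature fool; feature difference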

Cite this article

韦健杰, 吕东辉, 陆小锋, 孙广玲. Improved method to craft universal perturbations based on fast feature fool[J]. Journal of Applied Sciences, 2020, 38(6): 986-994. DOI: 10.3969/j.issn.0255-8297.2020.06.015

Abstract

Although deep neural networks have been widely applied in recent years, they are readily fooled by adversarial input perturbations that are imperceptible to humans. This vulnerability threatens system deployment in security-critical settings, so it is necessary to study how such perturbations are generated in order to make these systems more robust. Fast feature fool (FFF) is an effective method for crafting universal perturbations for visual tasks. Rather than operating solely on the convolutional layers' outputs irrespective of the input activation status, this paper improves FFF by maximizing the feature difference between the input image and the corresponding adversarial image, with the contributions of multiple convolutional layers weighted differently. Experimental results demonstrate that the improved FFF achieves a higher attack success rate and stronger cross-model transferability than the original method.

Keywords: deep neural networks; universal perturbations; fast feature fool (FFF); feature difference
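The objective described in the abstract, a layer-weighted feature difference between clean and perturbed images maximized over a single shared perturbation, can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: two small dense ReLU layers stand in for convolutional layers, the difference is measured with a squared L2 norm, the layer weights `(0.3, 0.7)` are arbitrary, and the perturbation is updated by sign-gradient ascent with clipping to an L-infinity budget `eps`. The abstract does not specify any of these choices.

```python
import numpy as np


def relu(a):
    return np.maximum(a, 0.0)


def features(x, W1, W2):
    """Pre-activations and activations of two toy layers (stand-ins for conv layers)."""
    a1 = x @ W1
    h1 = relu(a1)
    a2 = h1 @ W2
    h2 = relu(a2)
    return a1, h1, a2, h2


def loss_and_grad(delta, x, W1, W2, layer_weights):
    """Weighted feature difference between x + delta and x, plus its gradient w.r.t. delta."""
    n = x.shape[0]
    w1, w2 = layer_weights
    _, f1, _, f2 = features(x, W1, W2)            # clean features, treated as constants
    a1, h1, a2, h2 = features(x + delta, W1, W2)  # features of the perturbed images
    d1, d2 = h1 - f1, h2 - f2
    loss = (w1 * np.sum(d1 ** 2) + w2 * np.sum(d2 ** 2)) / n
    # Manual backpropagation through the two ReLU layers.
    g_a2 = (2.0 * w2 / n) * d2 * (a2 > 0)
    g_h1 = (2.0 * w1 / n) * d1 + g_a2 @ W2.T
    g_a1 = g_h1 * (a1 > 0)
    grad = (g_a1 @ W1.T).sum(axis=0)  # one shared perturbation, so sum over images
    return loss, grad


def craft_universal_perturbation(x, W1, W2, layer_weights,
                                 eps=0.1, step=0.01, iters=50, seed=0):
    """Sign-gradient ascent on the weighted feature difference (assumed optimizer)."""
    rng = np.random.default_rng(seed)
    # Start from small noise: at delta = 0 the difference and its gradient are both zero.
    delta = rng.uniform(-0.1 * eps, 0.1 * eps, size=x.shape[1])
    history = []
    for _ in range(iters):
        loss, grad = loss_and_grad(delta, x, W1, W2, layer_weights)
        history.append(loss)
        delta = np.clip(delta + step * np.sign(grad), -eps, eps)
    history.append(loss_and_grad(delta, x, W1, W2, layer_weights)[0])
    return delta, history


# Toy data: 8 flattened "images" with 32 values each, and random layer weights.
rng = np.random.default_rng(42)
x = rng.uniform(0.0, 1.0, size=(8, 32))
W1 = rng.normal(0.0, 1.0 / np.sqrt(32), size=(32, 16))
W2 = rng.normal(0.0, 1.0 / np.sqrt(16), size=(16, 8))
eps = 0.1

delta, history = craft_universal_perturbation(x, W1, W2, layer_weights=(0.3, 0.7), eps=eps)
print("initial loss:", history[0])
print("final loss:  ", history[-1])
print("max |delta|: ", np.max(np.abs(delta)))
```

The sketch only shows the shape of the optimization: the same `delta` is added to every image, and the per-layer terms enter the loss with different weights. It does not measure attack success rate or cross-model transfer, which would require trained classifiers.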

