Evaluation and Comparison of Five Popular Fake Face Detection Networks

Gao Yifei, Hu Yongjian, Yu Zeqiong, Lin Yuyi, Liu Beibei

doi:10.3969/j.issn.0255-8297.2019.05.002

Journal of Applied Sciences (应用科学学报), 2019, Vol. 37, Issue 5: 590-608


Multimedia Information Security


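1. School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510641, China;
2. Sino-Singapore International Joint Research Institute, Guangzhou 511356, China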

Received: 2019-07-27

Revised: 2019-07-31

Published online: 2019-10-18

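Funding

Supported by the Guangdong Science and Technology Program International Collaborative Innovation Project (No.2017A050501002); the Guangzhou Development District International Cooperation Project (No.2017GH22); the Sino-Singapore International Joint Research Institute projects (No.206-A017023, No.206-A018001); and the Guangdong Natural Science Foundation Doctoral Research Start-up Project (No.2017A030310320).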



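Abstract

To counter the harm caused by fake face videos, researchers have proposed a variety of fake face video detectors based on convolutional neural networks (CNN). These detectors share a common problem: intra-dataset detection usually achieves high accuracy, but performance drops severely in cross-dataset detection, which indicates a serious lack of generalization ability. This paper evaluates fake face video detectors built on five popular networks (MesoInception-4, MISLnet, ShallowNetV1, Inception-v3, and Xception) with intra-dataset and cross-dataset tests on three existing fake face video datasets. The analysis focuses on how three factors affect the generalization ability of the detectors: dataset partition, data augmentation, and detection threshold selection.

Keywords: fake face video detection; deep network; generalization ability; dataset partition; data augmentation; threshold selection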
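The role of the detection threshold in cross-dataset testing can be illustrated with a minimal sketch. It is not the procedure used in the paper, which this page does not describe. The scores and labels are hypothetical values made up for illustration, and the helper names `accuracy` and `best_threshold` are assumptions introduced here. The sketch assumes a detector that outputs a score in [0, 1], where a score at or above the threshold is classified as fake (label 1).

```python
from typing import List, Tuple


def accuracy(scores: List[float], labels: List[int], threshold: float) -> float:
    """Fraction of samples classified correctly when score >= threshold means fake (label 1)."""
    correct = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
    return correct / len(labels)


def best_threshold(scores: List[float], labels: List[int]) -> Tuple[float, float]:
    """Scan the observed scores as candidate thresholds; return (threshold, accuracy) with the highest accuracy."""
    candidates = sorted(set(scores))
    best = max(candidates, key=lambda t: accuracy(scores, labels, t))
    return best, accuracy(scores, labels, best)


# Hypothetical detector scores (NOT results from the paper).
# Intra-dataset: test data resembles the training data, so scores separate well around 0.5.
intra_scores = [0.90, 0.80, 0.70, 0.95, 0.10, 0.20, 0.30, 0.05]
intra_labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Cross-dataset: scores for fake samples are shifted downward on the unseen dataset.
cross_scores = [0.45, 0.40, 0.35, 0.60, 0.10, 0.20, 0.30, 0.05]
cross_labels = [1, 1, 1, 1, 0, 0, 0, 0]

print(accuracy(intra_scores, intra_labels, 0.5))   # fixed threshold, intra-dataset
print(accuracy(cross_scores, cross_labels, 0.5))   # same fixed threshold, cross-dataset
print(best_threshold(cross_scores, cross_labels))  # threshold re-selected on the target data
```

In this toy data, the fixed threshold of 0.5 misclassifies most of the shifted fake samples, while a re-selected threshold separates the two classes. Selecting the threshold on labeled target data is an optimistic setting, since such labels are usually unavailable in practice, so the sketch only shows why threshold choice matters when comparing detectors across datasets.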

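Cite this article

Gao Yifei, Hu Yongjian, Yu Zeqiong, Lin Yuyi, Liu Beibei. Evaluation and Comparison of Five Popular Fake Face Detection Networks[J]. Journal of Applied Sciences, 2019, 37(5): 590-608. DOI: 10.3969/j.issn.0255-8297.2019.05.002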


