FakeAVCeleb: A Novel Audio-Video Multimodal Deepfake Dataset

Overview

FakeAVCeleb is a novel audio-video multimodal deepfake detection dataset that contains not only deepfake videos but also their corresponding synthesized (cloned) audio tracks.
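
Since every clip pairs a (real or fake) video with a (real or fake) audio track, both modalities can be read from a single file. The snippet below is only a minimal sketch using torchvision from requirements.txt; the file path is a hypothetical placeholder and does not reflect the dataset's actual directory layout.

```python
# Minimal sketch (not part of the official code): load one clip's frames and
# audio waveform together. The path below is a hypothetical placeholder.
import torchvision

video_path = "FakeAVCeleb/example_clip.mp4"  # hypothetical path
frames, audio, info = torchvision.io.read_video(video_path, pts_unit="sec")

print(frames.shape)  # (T, H, W, C) uint8 video frames
print(audio.shape)   # (channels, samples) real or cloned audio waveform
print(info)          # e.g. {'video_fps': ..., 'audio_fps': ...}
```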

Access (Request form)

If you would like to download the FakeAVCeleb dataset, please fill out the Google request form; once your request is accepted, we will send you the link to our download script.sh.

Once you obtain the download link, please see the download section on our dataset site, where you can also find more details about the FakeAVCeleb dataset.

Requirements and Installation

We recommend installing the dependencies from the requirements.txt contained in this GitHub repository.
python==3.8.0
numpy==1.20.3
torch==1.8.0
torchvision==0.9.0
matplotlib==3.3.4
tqdm==4.61.2
scikit-learn
pandas

pip install -r requirements.txt

Deepfake Dataset for Quantitative Comparison

| Dataset | Real Videos | Fake Videos | Total Videos | Rights Cleared | Agreeing Subjects | Total Subjects | Methods | Real Audio | Deepfake Audio | Fine-grained Labeling |
|---|---|---|---|---|---|---|---|---|---|---|
| UADFV | 49 | 49 | 98 | No | 0 | 49 | 1 | No | No | No |
| DeepfakeTIMIT | 640 | 320 | 960 | No | 0 | 32 | 2 | No | Yes | No |
| FF++ | 1,000 | 4,000 | 5,000 | No | 0 | N/A | 4 | No | No | No |
| Celeb-DF | 590 | 5,639 | 6,229 | No | 0 | 59 | 1 | No | No | No |
| Google DFD | 0 | 3,000 | 3,000 | Yes | 28 | 28 | 5 | No | No | No |
| DeeperForensics | 50,000 | 10,000 | 60,000 | Yes | 100 | 100 | 1 | No | No | No |
| DFDC | 23,654 | 104,500 | 128,154 | Yes | 960 | 960 | 8 | Yes | Yes | No |
| KoDF | 62,166 | 175,776 | 237,942 | Yes | 403 | 403 | 6 | Yes | No | No |
| FakeAVCeleb | 500 | 19,500 | 20,000 | No | 0 | 500 | 4 | Yes | Yes | Yes |

Training & Evaluation

- Full Usage

  -m                   model name = [MESO4, MESOINCEPTION4, XCEPTION, EFFICIENTB0, F3NET, LIPS, XRAY, HEADPOSE, EXPLOTING, CAPSULE]
  -v                   path of video data
  -a                   path of audio data
  -vm                  path of video model (for evaluation)
  -am                  path of audio model (for evaluation)
  -sm                  path to save the best model while training
  -l                   learning rate (for training)
  -me                  number of epochs (for training)
  -nb                  batch size
  -ng                  GPU device to use (default=0); can be 0,1,2 for multi-GPU
  -vr                  validation ratio on the training set
  -ne                  patience (number of epochs) for early stopping
  -en                  True or False; whether to use the ensemble (only for evaluation)

- Benchmark

To train and evaluate the model(s) in the paper, invoke the training script with the options listed above.
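
The entry-point script is not named in this section, so the command below is only an illustrative sketch: train.py is a hypothetical placeholder and the option values are examples; only the flags themselves come from the usage listing above.

```
# Hypothetical invocation; "train.py" and all option values are placeholders.
python train.py -m XCEPTION \
                -v /path/to/video/data \
                -a /path/to/audio/data \
                -sm ./checkpoints \
                -l 0.001 -me 50 -nb 32 -ng 0 -vr 0.1 -ne 5
```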

Results

| Method | UADFV | DF-TIMIT (LQ) | DF-TIMIT (HQ) | FF-DF | DFD | DFDC | Celeb-DF | FakeAVCeleb |
|---|---|---|---|---|---|---|---|---|
| Capsule | 61.3 | 78.4 | 74.4 | 96.6 | 64.0 | 53.3 | 57.5 | 70.9 |
| HeadPose | 89.0 | 55.1 | 53.2 | 47.3 | 56.1 | 55.9 | 54.6 | 49.0 |
| VA-MLP | 70.2 | 61.4 | 62.1 | 66.4 | 69.1 | 61.9 | 55.0 | 67.0 |
| VA-LogReg | 54.0 | 77.0 | 77.3 | 78.0 | 77.2 | 66.2 | 55.1 | 67.9 |
| Xception-raw | 80.4 | 56.7 | 54.0 | 99.7 | 53.9 | 49.9 | 48.2 | 71.5 |
| Xception-comp | 91.2 | 95.9 | 94.4 | 99.7 | 85.9 | 72.2 | 65.3 | 77.3 |
| Meso4 | 84.3 | 87.8 | 68.4 | 84.7 | 76.0 | 75.3 | 54.8 | 60.9 |
| MesoInception4 | 82.1 | 80.4 | 62.7 | 83.0 | 75.9 | 73.2 | 53.6 | 61.7 |

<div style="text-align:center"> <img src="./images/Spectrogram_a1.png" width="400" height="280"/> <img src="./images/Spectrogram_a1fake.png" width="400" height="280"/> </div>
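
The images above compare the spectrogram of a real audio track with that of its cloned counterpart. To produce a similar visualization yourself, the sketch below (an assumption, not the script used for these figures) plots a spectrogram with matplotlib from requirements.txt; the input file name is a hypothetical placeholder.

```python
# Sketch only: plot the spectrogram of one clip's audio track.
# "example_clip.mp4" is a hypothetical file name.
import matplotlib.pyplot as plt
import torchvision

_, audio, info = torchvision.io.read_video("example_clip.mp4", pts_unit="sec")
waveform = audio[0].numpy()  # first audio channel as a 1-D array

plt.specgram(waveform, Fs=int(info.get("audio_fps", 16000)), NFFT=512, noverlap=256)
plt.xlabel("Time (s)")
plt.ylabel("Frequency (Hz)")
plt.title("Spectrogram")
plt.savefig("spectrogram.png", dpi=150)
```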

Citation

If you use the FakeAVCeleb data or code, please cite:

@misc{khalid2021fakeavceleb,
      title={FakeAVCeleb: A Novel Audio-Video Multimodal Deepfake Dataset}, 
      author={Hasam Khalid and Shahroz Tariq and Simon S. Woo},
      year={2021},
      eprint={2108.05080},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Contact

If you have any questions, please contact us at hasam.khalid/shahroz/kimminha@g.skku.edu.

References

[1] Huy H Nguyen, Junichi Yamagishi, and Isao Echizen. Use of a capsule network to detect fake images and videos. arXiv preprint arXiv:1910.12467, 2019.
[2] Xin Yang, Yuezun Li, and Siwei Lyu. Exposing deep fakes using inconsistent head poses. In ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 8261–8265. IEEE, 2019.
[3] Falko Matern, Christian Riess, and Marc Stamminger. Exploiting visual artifacts to expose deepfakes and face manipulations. In 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW), pages 83–92. IEEE, 2019.
[4] Andreas Rossler, Davide Cozzolino, Luisa Verdoliva, Christian Riess, Justus Thies, and Matthias Nießner. FaceForensics++: Learning to detect manipulated facial images. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 1–11, 2019.
[5] Darius Afchar, Vincent Nozick, Junichi Yamagishi, and Isao Echizen. Mesonet: a compact facial video forgery detection network. In 2018 IEEE International Workshop on Information Forensics and Security (WIFS), pages 1–7. IEEE, 2018.
[6] Conrad Sanderson and Brian C Lovell. Multi-region probabilistic histograms for robust and scalable identity inference. In International conference on biometrics, pages 199–208. Springer, 2009.
[7] Yuezun Li, Xin Yang, Pu Sun, Honggang Qi, and Siwei Lyu. Celeb-df: A large-scale challenging dataset for deepfake forensics. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3207–3216, 2020.
[8] Liming Jiang, Ren Li, Wayne Wu, Chen Qian, and Chen Change Loy. Deeperforensics-1.0: A large-scale dataset for real-world face forgery detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2889–2898, 2020.
[9] Brian Dolhansky, Joanna Bitton, Ben Pflaum, Jikuo Lu, Russ Howes, Menglin Wang, and Cristian Canton Ferrer. The deepfake detection challenge dataset. arXiv preprint arXiv:2006.07397, 2020.
[10] Patrick Kwon, Jaeseong You, Gyuhyeon Nam, Sungwoo Park, and Gyeongsu Chae. KoDF: A large-scale Korean deepfake detection dataset. arXiv preprint arXiv:2103.10094, 2021.

License

The data is released under the terms of the FakeAVCeleb Request Form, and the code is released under the MIT license.

Copyright (c) 2021