<h2>WildDeepfake: A Challenging Real-World Dataset for Deepfake Detection</h2> <p align="center"> <img src="./fakemask.jpg" width="720" height="435px" alt="Deepfake in the Wild" title="Deepfake in the Wild" align="center"></img> </p>

📌 Dataset Description

Existing deepfake datasets like DeepfakeDetection and FaceForensics++ have advanced detection research but are limited by constrained real videos featuring a few actors and fake videos generated using popular software. As a result, detectors trained on these datasets often struggle with the diversity of real-world deepfakes found online.

To address this, we introduce WildDeepfake, a dataset of 7,314 face sequences from 707 deepfake videos sourced entirely from the internet. Despite its small size, WildDeepfake better represents the challenges of real-world detection, where baseline detectors show significantly reduced performance.

To enhance detection, we also propose Attention-based Deepfake Detection Networks (ADDNets), utilizing 2D and 3D attention mechanisms to improve focus on real/fake facial features.

📂 Dataset Contents

  1. A comparison to previous datasets (before our work)

| Dataset name | Download | Generate method | Deepfake videos | Actors |
| --- | --- | --- | --- | --- |
| Deepfake-TIMIT low | download | Deepfake | 320 | 32 |
| Deepfake-TIMIT high | download | Deepfake | 320 | 32 |
| Faceforensics | - | Deepfake | 1000 | 977 |
| Faceforensics++ | download | Deepfake | 1000 | 977 |
| Deepfake detection | download | Deepfake | over 3000 | 28 |
| Celeb-deepfakeforensics v1 | download | Deepfake | 795 | 13 |
| Celeb-deepfakeforensics v2 | download | Deepfake | 590 | 59 |
| DFDC | download | Deepfake | - | - |
| WildDeepfake | download | Internet | 707 | Unknown |

  2. File Structure:
deepfake_in_the_wild
                    |--real train
                                 |--0.tar.gz
                                 |--1.tar.gz
                                 |--2.tar.gz
                                 ...
                    |--real test
                                |--0.tar.gz
                                |--1.tar.gz
                                |--2.tar.gz
                                ...
                    |--fake train
                                 |--0.tar.gz
                                 |--1.tar.gz
                                 |--2.tar.gz
                                 ...
                    |--fake test
                                |--0.tar.gz
                                |--1.tar.gz
                                |--2.tar.gz
                                ...

Each tar.gz archive contains several folders of face images, and the images in one folder form a single face sequence. Each image's file name is the number of the frame in which that face appears in the original video.
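As a rough illustration of this layout, the sketch below groups the images of one archive into face sequences and orders each sequence by frame number. It is not part of the official tooling: the function name `list_face_sequences`, the accepted image extensions, and the assumption that every file name is a plain integer frame number are ours, so adjust them to what you find in the archives.

```python
import posixpath
import tarfile
from collections import defaultdict


def list_face_sequences(archive_path, image_exts=(".png", ".jpg", ".jpeg")):
    """Group the images of one tar.gz archive into face sequences.

    Returns a dict that maps each folder path inside the archive to the
    list of its image member names, sorted by frame number.
    """
    sequences = defaultdict(list)
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue
            folder, file_name = posixpath.split(member.name)
            stem, ext = posixpath.splitext(file_name)
            # Assumption: a frame is an image file whose name (without the
            # extension) is an integer frame number; other files are skipped.
            if ext.lower() not in image_exts or not stem.isdigit():
                continue
            sequences[folder].append((int(stem), member.name))
    # Sort numerically so that frame 2 comes before frame 10.
    return {
        folder: [name for _, name in sorted(frames)]
        for folder, frames in sequences.items()
    }
```

For example, `list_face_sequences("fake train/0.tar.gz")` would return one entry per face-sequence folder in that archive. Sorting on the integer value rather than on the raw file name avoids the lexicographic ordering in which `10` would precede `2`.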

<!--- ### Our ADDNet-2D Detection Method <p align="center"> <img src="./ADDNet.png" alt="ADDNet" title="ADDNet" align="center"></img> </p> The network structure of our proposed ADDNet-2D is illustrated above. Detailed structures of the three residual blocks used in our ADDNet-2D network are shown below. These three blocks are also the building blocks of XceptionNet. The base network before "resblock3" is our proposed ADD block. Our ADDNet-3D shares the same ADD block architecture as ADDNet-2D, but has one ADD block for each of the face images in the sequence. Therefore, in our setting with face sequence length L, ADDNet-3D has L ADD blocks, and all ADD blocks share the same weights. Also unlike ADDNet-2D, the classifier network (the structure after the ADD block) of ADDNet-3D is a 3D CNN. <p align="center"> <img src="./details.jpg" alt="details" title="detials" align="center"></img> </p> ### Experiments First, we use a pre-trained [Resnet-101](https://github.com/tensorflow/models/tree/master/research/slim) to extract features from the images in previous datasets and our dataset. Then we use [T-SNE](http://projector.tensorflow.org/) to reduce the dimensionality. Red points represent fake faces, and green points represent real faces. Here is the comparison: <p align="center"> <img src="./t-sne.PNG" alt="t-sne" title="t-sne" align="center"></img> </p> --->

โฌ‡๏ธ Request for Download

To use the dataset, you will need to fill out an agreement form. The dataset is now available on Hugging Face: click to download.

📜 Cite Us

If you use this dataset in your research, please cite it as follows:

@inproceedings{zi2020wilddeepfake,
  title={Wilddeepfake: A challenging real-world dataset for deepfake detection},
  author={Zi, Bojia and Chang, Minghao and Chen, Jingjing and Ma, Xingjun and Jiang, Yu-Gang},
  booktitle={Proceedings of the 28th ACM International Conference on Multimedia},
  pages={2382--2390},
  year={2020}
}

๐Ÿ“ Privacy Statement

To ensure the privacy of individuals featured in the dataset, we have implemented the following measures:

We are committed to safeguarding privacy while enabling research advancements.