<h2 style="text-align:center;">WildDeepfake: A Challenging Real-World Dataset for Deepfake Detection</h2> <p align="center"> <img src="./fakemask.jpg" width="720" height="435px" alt="Deepfake in the Wild" title="Deepfake in the Wild" align="center"></img> </p>

Dataset Description

Existing deepfake datasets such as DeepfakeDetection and FaceForensics++ have advanced detection research, but they are limited in two ways: their real videos are recorded under constrained conditions with only a few actors, and their fake videos are generated with a small set of popular software tools. As a result, detectors trained on these datasets often struggle with the diversity of real-world deepfakes found online.

To address this, we introduce WildDeepfake, a dataset of 7,314 face sequences from 707 deepfake videos sourced entirely from the internet. Despite its small size, WildDeepfake better represents the challenges of real-world detection, where baseline detectors show significantly reduced performance.

To improve detection, we also propose Attention-based Deepfake Detection Networks (ADDNets), which use 2D and 3D attention mechanisms to sharpen the networks' focus on real and fake facial features.

Dataset Statistics

Previous datasets

| Dataset name | Download | Generate method | Deepfake videos | Actors |
| --- | --- | --- | --- | --- |
| Deepfake-TIMIT low | download | Deepfake | 320 | 32 |
| Deepfake-TIMIT high | download | Deepfake | 320 | 32 |
| Faceforensics | - | Deepfake | 1000 | 977 |
| Faceforensics++ | download | Deepfake | 1000 | 977 |
| Deepfake detection | download | Deepfake | over 3000 | 28 |
| Celeb-deepfakeforensics v1 | download | Deepfake | 795 | 13 |
| Celeb-deepfakeforensics v2 | download | Deepfake | 590 | 59 |
| DFDC | download | Deepfake | - | - |

Ours

| Dataset name | Download | Generate method | Deepfake videos | Actors |
| --- | --- | --- | --- | --- |
| WildDeepfake | download | Internet | 707 | - |

File Structure:

```
deepfake_in_the_wild
                    |--real train
                                 |--0.tar.gz
                                 |--1.tar.gz
                                 |--2.tar.gz
                                 ...
                    |--real test
                                |--0.tar.gz
                                |--1.tar.gz
                                |--2.tar.gz
                                ...
                    |--fake train
                                 |--0.tar.gz
                                 |--1.tar.gz
                                 |--2.tar.gz
                                 ...
                    |--fake test
                                |--0.tar.gz
                                |--1.tar.gz
                                |--2.tar.gz
                                ...
```

Each tar.gz file contains several folders of face images, and the images in each folder form one face sequence. The name of each image is the number of the frame in which the face appears in the original video.
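The folder-per-sequence layout can be inspected without unpacking an archive. The following is a minimal Python sketch, not an official loader. It assumes that the images are PNG or JPEG files (the page does not state the format) and that the first run of digits in each file name is the frame number. The names `IMAGE_SUFFIXES`, `frame_number`, and `list_face_sequences` are hypothetical and chosen only for illustration.

```python
import re
import tarfile
from collections import defaultdict
from pathlib import PurePosixPath

# Assumption: the image format is not stated on this page.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def frame_number(member_name):
    """Return the frame number encoded in an image file name, or -1 if none."""
    match = re.search(r"\d+", PurePosixPath(member_name).stem)
    return int(match.group()) if match else -1


def list_face_sequences(archive_path):
    """Map each folder inside one archive to its image names in frame order."""
    sequences = defaultdict(list)
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            path = PurePosixPath(member.name)
            # Keep regular image files only; skip directories and other files.
            if member.isfile() and path.suffix.lower() in IMAGE_SUFFIXES:
                sequences[str(path.parent)].append(member.name)
    # Sort numerically so that frame 2 comes before frame 10.
    for names in sequences.values():
        names.sort(key=frame_number)
    return dict(sequences)
```

Calling `list_face_sequences` on one archive, for example `0.tar.gz` under `fake train`, returns a dictionary with one entry per face sequence. Sorting by the parsed integer rather than by the raw file name keeps frames in temporal order when the names are not zero-padded.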

<!---
### Our ADDNet-2D Detection Method

<p align="center"> <img src="./ADDNet.png" alt="ADDNet" title="ADDNet" align="center"></img> </p>

The network structure of our proposed ADDNet-2D is illustrated below. The detailed structures of the three residual blocks used in ADDNet-2D are also shown below. These three blocks are also the building blocks of XceptionNet. The base network before "resblock3" is our proposed ADD block.

ADDNet-3D shares the same ADD block architecture as ADDNet-2D, but has one ADD block for each face image in the sequence. Therefore, with a face sequence of length 𝐿, ADDNet-3D has 𝐿 ADD blocks, all of which share the same weights. Also unlike ADDNet-2D, the classifier network of ADDNet-3D (the structure after the ADD blocks) is a 3D CNN.

<p align="center"> <img src="./details.jpg" alt="details" title="details" align="center"></img> </p>

### Experiments

First, we use a pre-trained [ResNet-101](https://github.com/tensorflow/models/tree/master/research/slim) to extract features from the images in previous datasets and in our dataset. Then we use [t-SNE](http://projector.tensorflow.org/) to reduce the dimensionality. Red points represent fake faces, and green points represent real faces. Here is the comparison:

<p align="center"> <img src="./t-sne.PNG" alt="t-sne" title="t-sne" align="center"></img> </p>
--->

Request for Download

To request a copy of the dataset, fill out the form here. The dataset can be shared through either Baidu Drive or Google Drive.

Privacy Statement

To ensure the privacy of individuals featured in the dataset, we have implemented the following measures:

We are committed to safeguarding privacy while enabling research advancements.