PolyGlotFake Dataset

Overview

PolyGlotFake is a multilingual, multimodal deepfake dataset designed to support the development and evaluation of deepfake detection methods. It consists of videos in seven languages whose audio and visual components have both been manipulated, using Text-to-Speech, voice cloning, and lip-sync technologies.

Download Dataset

Please fill out this form to request access to the PolyGlotFake Dataset. We will review your request and respond as soon as possible.

Quantitative Comparison

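| Dataset | Release Date | Manipulated Modality | Multilingual | Real video | Fake video | Total video | Manipulation Methods | Techniques Labeling | Attribute Labeling |
|---|---|---|---|---|---|---|---|---|---|
| UADFV | 2018 | V | No | 49 | 49 | 98 | 1 | No | No |
| TIMI | 2018 | V | No | 320 | 640 | 960 | 2 | No | No |
| FF++ | 2019 | V | No | 1,000 | 4,000 | 5,000 | 4 | No | No |
| DFD | 2019 | V | No | 360 | 3,068 | 3,431 | 5 | No | No |
| DFDC | 2020 | A/V | No | 23,654 | 104,500 | 128,154 | 8 | No | No |
| DeeperForensics | 2020 | V | No | 50,000 | 10,000 | 60,000 | 1 | No | No |
| Celeb-DF | 2020 | V | No | 590 | 5,639 | 6,229 | 1 | No | No |
| FFIW | 2020 | V | No | 10,000 | 10,000 | 20,000 | 1 | No | No |
| KoDF | 2021 | V | No | 62,166 | 175,776 | 237,942 | 5 | No | No |
| FakeAVCeleb | 2021 | A/V | No | 500 | 19,500 | 20,000 | 4 | No | Yes |
| DF-Platter | 2023 | V | No | 133,260 | 132,496 | 265,756 | 3 | No | Yes |
| PolyGlotFake | 2023 | A/V | Yes | 766 | 14,472 | 15,238 | 10 | Yes | Yes |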
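
The PolyGlotFake row lists far more fake videos than real ones, which usually matters when training a binary detector. The following is a minimal sketch, not part of the dataset tooling, that derives the fake-to-real ratio and inverse-frequency class weights from the counts in the table. The helper name `inverse_frequency_weights` and the weighting scheme `N / (K * n_c)` are illustrative assumptions; the README does not prescribe any particular rebalancing strategy.

```python
# Counts copied from the PolyGlotFake row of the comparison table.
REAL_VIDEOS = 766
FAKE_VIDEOS = 14_472
TOTAL_VIDEOS = 15_238


def inverse_frequency_weights(counts: dict[str, int]) -> dict[str, float]:
    """Return weights N / (K * n_c), so every class carries equal total weight.

    Hypothetical helper for illustration; not part of any PolyGlotFake release.
    """
    total = sum(counts.values())
    num_classes = len(counts)
    return {label: total / (num_classes * n) for label, n in counts.items()}


counts = {"real": REAL_VIDEOS, "fake": FAKE_VIDEOS}

# Sanity check: the real and fake counts add up to the reported total.
assert sum(counts.values()) == TOTAL_VIDEOS

fake_to_real = FAKE_VIDEOS / REAL_VIDEOS
weights = inverse_frequency_weights(counts)

print(f"fake:real ratio = {fake_to_real:.1f}:1")
print({label: round(w, 3) for label, w in weights.items()})
```

Weights like these could be passed to a weighted loss; whether that helps on this dataset is an open question that the table alone does not answer.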

Dataset Details

Composition

Languages and Synthesis Methods Distribution

<p float="left"> <img src="./images/lang_new.jpg" width="45%" /> <img src="./images/tech.jpg" width="45%" /> </p>

Generation Pipeline

Deepfake Detection Benchmark

Evaluation Results and Comparisons

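| Type | Detector | Backbone | FakeAVCeleb | DFDC | PolyGlotFake |
|---|---|---|---|---|---|
| Naive | MesoNet | Designed | 0.7332 | 0.5906 | 0.5672 |
| Naive | MesoInception | Designed | 0.7945 | 0.6344 | 0.5831 |
| Naive | Xception | Xception | 0.9169 | 0.6530 | 0.6052 |
| Naive | EfficientNet-B4 | EfficientNet | 0.9023 | 0.6020 | 0.5769 |
| Spatial | Capsule | Capsule | 0.8663 | 0.6146 | 0.6068 |
| Spatial | FFD | Xception | 0.9285 | 0.6583 | 0.5960 |
| Spatial | CORE | Xception | 0.9345 | 0.6625 | 0.6220 |
| Spatial | RECCE | Designed | 0.9396 | 0.6884 | 0.6596 |
| Spatial | DSP-FWA | Xception | 0.9115 | 0.6929 | 0.6658 |
| Frequency | F3Net | Xception | 0.9416 | 0.6452 | 0.6439 |
| Frequency | SRM | Xception | 0.9043 | 0.6346 | 0.6143 |
| Ensemble | XRes | Designed | 0.9556 | 0.7042 | 0.6835 |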
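
To make the benchmark easier to read at a glance, the sketch below summarizes the table: it averages the PolyGlotFake scores per detector type and measures how much each detector drops relative to FakeAVCeleb. The numbers are copied from the table; the README does not name the metric here, so the code refers to it only as a "score". The names `RESULTS`, `mean_by_type`, and `score_drop` are hypothetical and exist only for this illustration.

```python
from collections import defaultdict
from statistics import mean

# (type, detector, FakeAVCeleb, DFDC, PolyGlotFake), copied from the table.
RESULTS = [
    ("Naive", "MesoNet", 0.7332, 0.5906, 0.5672),
    ("Naive", "MesoInception", 0.7945, 0.6344, 0.5831),
    ("Naive", "Xception", 0.9169, 0.6530, 0.6052),
    ("Naive", "EfficientNet-B4", 0.9023, 0.6020, 0.5769),
    ("Spatial", "Capsule", 0.8663, 0.6146, 0.6068),
    ("Spatial", "FFD", 0.9285, 0.6583, 0.5960),
    ("Spatial", "CORE", 0.9345, 0.6625, 0.6220),
    ("Spatial", "RECCE", 0.9396, 0.6884, 0.6596),
    ("Spatial", "DSP-FWA", 0.9115, 0.6929, 0.6658),
    ("Frequency", "F3Net", 0.9416, 0.6452, 0.6439),
    ("Frequency", "SRM", 0.9043, 0.6346, 0.6143),
    ("Ensemble", "XRes", 0.9556, 0.7042, 0.6835),
]


def mean_by_type(rows):
    """Average the PolyGlotFake score over all detectors of each type."""
    grouped = defaultdict(list)
    for det_type, _name, _fakeav, _dfdc, polyglot in rows:
        grouped[det_type].append(polyglot)
    return {det_type: mean(scores) for det_type, scores in grouped.items()}


def score_drop(rows):
    """Map each detector to its FakeAVCeleb score minus its PolyGlotFake score."""
    return {name: fakeav - polyglot for _t, name, fakeav, _d, polyglot in rows}


type_means = mean_by_type(RESULTS)
drops = score_drop(RESULTS)
best_on_polyglot = max(RESULTS, key=lambda row: row[4])[1]
largest_drop = max(drops, key=drops.get)

for det_type, value in type_means.items():
    print(f"{det_type:<10} mean PolyGlotFake score: {value:.4f}")
print("Best detector on PolyGlotFake:", best_on_polyglot)
print("Largest drop vs. FakeAVCeleb:", largest_drop, f"({drops[largest_drop]:.4f})")
```

Every detector in the table scores lower on PolyGlotFake than on FakeAVCeleb, so the drop is positive for each row.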

Visualization

Overview of Dataset

Ethics Statement

Access to the dataset is restricted to academic institutions and is intended solely for research use. The dataset complies with YouTube's fair use policy: its use is transformative and non-commercial, it includes only brief excerpts (approximately 20 seconds) from each YouTube video, and these excerpts do not adversely affect the copyright owners' ability to earn revenue from their original content. Should any copyright owner feel their rights have been infringed, we are committed to promptly removing the contested material from our dataset.

Citation

@misc{hou2024polyglotfake,
      title={PolyGlotFake: A Novel Multilingual and Multimodal DeepFake Dataset}, 
      author={Yang Hou and Haitao Fu and Chuankai Chen and Zida Li and Haoyu Zhang and Jianjun Zhao},
      year={2024},
      eprint={2405.08838},
      archivePrefix={arXiv},
      primaryClass={cs.SD}
}