Detect-LAIM-generated-Multimedia-Survey

This repository collects resources and papers for the survey "Detecting Multimedia Generated by Large AI Models: A Survey".

<figure> <img src="assets/timeline.png" alt="timeline"> <figcaption>A cat-and-mouse game between generating and detecting multimedia (<span style="color:textcolor;background-color: #97D077; padding: 2px 4px;">text</span>, <span style="color: imagecolor; background-color: #FF9999; padding: 2px 4px;">image</span>, <span style="color: videocolor; background-color: #FF8000; padding: 2px 4px;">video</span>, <span style="color: audiocolor; background-color: #CDA2BE; padding: 2px 4px;">audio</span>, and <span style="color: mmcolor; background-color: #FFCE9F; padding: 2px 4px;">multimodal</span>) using LAIMs; only representative works are shown. Q1 covers Jan-Mar, Q2 Apr-Jun, Q3 Jul-Sep, and Q4 Oct-Dec.</figcaption> </figure>

Please let us know by e-mail if you find a mistake or if we have missed your wonderful work: lin1785@purdue.edu, hu968@purdue.edu, gupt1031@purdue.edu

If you find our survey useful for your research, please cite the following paper:

@article{lin2024detecting,
  title={Detecting Multimedia Generated by Large AI Models: A Survey},
  author={Lin, Li and Gupta, Neeraj and Zhang, Yue and Ren, Hainan and Liu, Chun-Hao and Ding, Feng and Wang, Xin and Li, Xin and Verdoliva, Luisa and Hu, Shu},
  journal={arXiv preprint arXiv:2402.00045},
  year={2024}
}

💻 Contents

📈 Related Work

         - A Survey on Detection of LLMs-Generated Content Paper GitHub

         - A Survey on LLM-generated Text Detection: Necessity, Methods, and Future Directions Paper GitHub

         - Towards possibilities & impossibilities of ai-generated text detection: A survey Paper

         - Machine-generated text: A comprehensive survey of threat models and detection methods Paper

         - The Age of Synthetic Realities: Challenges and Opportunities Paper

         - GenAI against humanity: Nefarious applications of generative artificial intelligence and large language models Paper

Generation

<figure> <img src="assets/generation.png" alt="Generation Processes"> <figcaption>Illustrations of different types of multimedia generation process based on LAIMs.</figcaption> </figure>

Public Datasets for Detection

Please read the I2O (Input-to-Output) column with these abbreviations: T = text, I = image, V = video, A = audio.

| Modality | Dataset | Content | Link | I2O | #Real | #Generated | Source of Real Media | Generation Method | Year |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Text | Student Essays | Essays | Link | T2T | 1,000 | 6,000 | IvyPanda | ChatGPT | 2023 |
| Text | Creative Writing | Essays | Link | T2T | 1,000 | 6,000 | Reddit WritingPrompts | ChatGPT | 2023 |
| Text | News Articles | Essays | Link | T2T | 1,000 | 6,000 | Reuters 50-50 | ChatGPT | 2023 |
| Text | Paraphrase | Essays | Link | T2T | 98,280 | 163,710 | Arxiv, Wikipedia, Theses | GPT-3, T5 | 2022 |
| Text | Authorship Attribution | Essays | Link | T2T | 1,064 | 8,512 | News Media | Various GPT, CTRL, GROVER, etc. | 2020 |
| Text | OUTFOX | Essays | Link | T2T | 15,400 | 15,400 | Feedback Prize | ChatGPT, GPT-3.5, T5 | 2023 |
| Text | MULTITuDE | News | Link | T2T | 7,992 | 66,089 | MassiveSumm | GPT-3, GPT-4, ChatGPT | 2023 |
| Text | TuringBench | News | Link | T2T | 8,854 | 159,758 | News Media | Various GPT, CTRL, GROVER, etc. | 2021 |
| Text | GPT-3.5 unmixed | News | Link | T2T | 5,454 | 5,454 | News Media | GPT-3.5 | 2023 |
| Text | GPT-3.5 mixed | News | Link | T2T | 5,032 | 5,032 | News Media | GPT-3.5 | 2023 |
| Text | GPABenchmark | Writing | Link | T2T | 150,000 | 450,000 | Arxiv | GPT-3.5 | 2023 |
| Text | HPPT | Abstracts | Link | T2T | 6,050 | 6,050 | ACL Anthology | ChatGPT | 2023 |
| Text | TweepFake | Tweets | Link | T2T | 12,786 | 12,786 | GitHub, Twitter | GPT-2, RNN, LSTM | 2021 |
| Text | SynSciPass | Passages | Link | T2T | 99,989 | 10,485 | Scientific papers | GPT-2, BLOOM | 2022 |
| Text | Deepfake-TextDetect | General | Link | T2T | 154,078 | 294,381 | Various sources including Reddit, ELI5, Yelp, etc. | Various including GPT, GLM, LLaMA, T5, OPT | 2022 |
| Text | HC-Var | General | Link | T2T | 90,096 | 45,000 | Various including XSum, IMDb, Yelp, Reddit, etc. | ChatGPT | 2023 |
| Text | HC3 | General | Link | T2T | 26,903 | 58,546 | Various including FiQA, Wiki, ELI5, etc. | ChatGPT | 2023 |
| Text | M4 | General | Link | T2T | 32,798 | 89,683 | Various including Wikipedia, WikiHow, Arxiv, etc. | Various including ChatGPT, GPT-3.5, LLaMA, T5, Dolly-v2, etc. | 2023 |
| Text | MixSet | General | Link | T2T | 300 | 3,600 | Enron Email, Steam Reviews, BBC News, ArXiv-10, TED Talk, Blog | LLaMA2, GPT-4 | 2024 |
| Text | InternVid | Captions | Link | V2T | 7,000,000 | 234,000,000 | YouTube | ViCLIP | 2023 |
| Image | DFF | Face | Link | T2I/I2I | 30,000 | 90,000 | IMDB-WIKI | SDMs, InsightFace | 2023 |
| Image | DiFF | Face | Link | T2I/I2I | 2,500 | 500,000 | CelebA, Prompts | 16 DMs | 2024 |
| Image | GANDiffFace | Face | Link | T/I2I | - | 73293 | FFHQ | StyleGAN3, DreamBooth | 2023 |
| Image | RealFaces | Face | Link | T2I | 258 | 25,800 | Prompts | SDMs | 2023 |
| Image | DCFace | Face | Link | I2I | - | 1,200,000 | FFHQ, CASIA-WebFace | DDPM | 2023 |
| Image | IDiff-Face | Face | Link | I2I | - | 500,000 | FFHQ | DDPM | 2023 |
| Image | OverheadImg | Overhead | Link | T2I/I2I | 6,475 | 6,675 | MapBox, Google Maps | GLIDE, DDPM | 2023 |
| Image | Synthbuster | General | Link | T2I | - | 9,000 | Raise-1k | DALL·E 2&3, Firefly, Midjourney, SDMs | 2023 |
| Image | GenImage | General | Link | T2I/I2I | 1,331,167 | 1,350,000 | ImageNet | Various methods including SDMs, Midjourney, BigGAN, VQDM | 2023 |
| Image | CIFAKE | General | Link | T2I | 60,000 | 60,000 | CIFAR-10 | SD-V1.4 | 2023 |
| Image | AutoSplice | General | Link | T2I | 2,273 | 3,621 | Visual News | DALL·E-2 | 2023 |
| Image | DiffusionDB | General | Link | T2I | 3,300,000 | 16,000,000 | DiscordChatExporter | SD | 2023 |
| Image | ArtiFact | General | Link | T2I/I2I | 964,989 | 1,531,749 | Various sources including AFHQ, CelebAHQ, COCO, etc. | Various methods including SDMs, VQDM, DDPM, LDM, etc. | 2023 |
| Image | HiFi-IFDL | General | Link | T2I/I2I | ~600,000 | 1,300,000 | Various sources including FFHQ, AFHQ, CelebAHQ, etc. | Various methods including DDPM, DDIM, GLIDE, LDM, etc. | 2023 |
| Image | DiffusionForensics | General | Link | T2I/I2I | 232,000 | 232,000 | LSUN, ImageNet | Various methods including LDM, DDPM, iDDPM, VQDM, ADM, PNDM | 2023 |
| Image | CocoGlide | General | Link | T2I | 512 | 512 | COCO | GLIDE | 2023 |
| Image | Western Blot | General | Link | I2I | ~14,000 | ~24,000 | Western Blot | DDPM, Pix2pix, CycleGAN | 2022 |
| Image | M3Dsynth | General | Link | I2I | 1,018 | 8,577 | LIDC-IDRI | DDPM, Pix2pix, CycleGAN | 2023 |
| Image | LSUNDB | General | Link | T2I/I2I | 250,000 | 250,000 | LSUN | Various methods including DDPM, PNDM, LDM, ADM, ProjectedGAN, StyleGAN, DiffusionGAN | 2023 |
| Image | UniversalFake | General | Link | T2I | 8,000 | 8,000 | LAION-400M | LDM, GLIDE | 2023 |
| Image | REGM | General | Link | T2I/I2I | - | 116,000 | CelebA, LSUN | 116 publicly available GMs | 2023 |
| Image | DMimage | General | Link | T2I | 200,000 | 200,000 | COCO, LSUN | LDM | 2022 |
| Image | AIGCD | General | Link | T2I/I2I | 360,000 | 508,500 | Various sources including LSUN, ImageNet, CelebA, COCO, FFHQ | Various methods including SDMs, GANs, Midjourney, VQDM, ADM, DALL·E-2, GLIDE, WFIR, Wukong | 2023 |
| Image | DIF | General | Link | T2I/I2I | 84,300 | 84,300 | LAION-5B | Various methods including SDMs, DALL·E-2, Midjourney, GLIDE, GANs | 2023 |
| Image | Fake2M | General | Link | T2I/I2I | - | 2,300,000 | CC3M | SD-V1.5, IF, StyleGAN3 | 2023 |
| Video | Diffused-head | Face | Link | I.A2V | - | 820 | CREMA | Diffused Heads: built on DDPM | 2023 |
| Audio | LibriSeVoc | Speech | Link | T2A | 13,201 | 79,206 | LibriTTS | Various methods including DiffWave, WaveNet, WaveRNN, Mel-GAN, WaveGrad | 2023 |
| Multi-modal | $DGM^4$ | News | Link | T2T/I2I | 77,426 | 152,574 | Visual News | Various methods including B-GST, StyleCLIP, HFGI, InfoSwap, SimSwap | 2023 |
| Multi-modal | COCOFake | General | Link | T2T/T2I | 113,287 | 566,435 | COCO | SDMs | 2023 |
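
The I2O codes in the table can be expanded programmatically when filtering datasets by input or output modality. The following is a minimal sketch, not part of the survey: the helper name `expand_i2o` is hypothetical, and the letter meanings (T = text, I = image, V = video, A = audio) are an assumption inferred from the Modality column.

```python
# Assumed letter meanings, inferred from the Modality column of the table.
MODALITIES = {"T": "text", "I": "image", "V": "video", "A": "audio"}


def expand_i2o(code: str) -> list[tuple[str, str]]:
    """Expand an I2O code such as "T2I/I2I" into (input, output) modality pairs.

    Hypothetical helper: it only understands simple "<X>2<Y>" parts
    separated by "/", and raises ValueError for anything else.
    """
    pairs = []
    for part in code.split("/"):
        # "T2I" -> ("T", "2", "I"); sep is empty when no "2" is present.
        src, sep, dst = part.partition("2")
        if not sep or src not in MODALITIES or dst not in MODALITIES:
            raise ValueError(f"unrecognized I2O code: {part!r}")
        pairs.append((MODALITIES[src], MODALITIES[dst]))
    return pairs


print(expand_i2o("T2I/I2I"))  # [('text', 'image'), ('image', 'image')]
```

Compound codes that appear in the table, such as `T/I2I` and `I.A2V`, are deliberately rejected by this sketch rather than guessed at; handling them would require a decision about how multiple inputs are written.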

:mag_right: Detection :fire:

<p align="center">:page_facing_up: Text </p>


Pure Detection

<figure> <img src="assets/text_pure.png" alt="text_pure"> <figcaption style="text-align: center;">Illustrations of pure detection methodologies for LAIM-generated text.</figcaption> </figure>

  ♣️ Easy Explainable Methods

        ▶️ Watermarking

         - Distillation-Resistant Watermarking for Model Protection in NLP Paper

         - Three bricks to consolidate watermarks for large language models Paper GitHub

         - Robust multi-bit natural language watermarking through invariant features Paper

         - Undetectable Watermarks for Language Models Paper

         - Robust distortion-free watermarks for language models Paper

         - Provable robust watermarking for ai-generated text Paper GitHub

         - A Private Watermark for Large Language Models Paper

        ▶️ Artifacts

         - Unraveling the mystery of artifacts in machine generated text Paper

        ▶️ Stylometry/Coherence

         - Stylometric detection of ai-generated text in twitter timelines Paper

         - CoCo: Coherence-Enhanced Machine-Generated Text Detection Under Data Limitation With Contrastive Learning Paper

  ♣️ Hard Explainable Methods

        ▶️ Perplexity

         - HowkGPT: Investigating the Detection of ChatGPT-generated University Student Homework through Context-Aware Perplexity Analysis Paper

         - GPTZero Tool

        ▶️ Log Probabilities Curvature

         - Detectgpt: Zero-shot machine-generated text detection using probability curvature Paper GitHub

        ▶️ Efficient Perturbations

         - Efficient Detection of LLM-generated Texts with a Bayesian Surrogate Model Paper

         - Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature Paper

         - DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of Machine-Generated Text Paper GitHub

        ▶️ Positive Unlabeled

         - Multiscale Positive-Unlabeled Detection of AI-Generated Texts Paper GitHub

Beyond Detection

<figure> <img src="assets/text_beyond.png" alt="text_beyond"> <figcaption style="text-align: center;">Illustrations of beyond detection methodologies for LAIM-generated text. </figcaption> </figure>

  ♣️ Attribution

        ▶️ Deep-learning Based

         - TURINGBENCH: A Benchmark Environment for Turing Test in the Age of Neural Text Generation Paper Turingbench

         - Whodunit? Learning to Contrast for Authorship Attribution Paper

         - Through the looking glass: Learning to attribute synthetic text generated by language models Paper

         - TopRoBERTa: Topology-Aware Authorship Attribution of Deepfake Texts Paper

        ▶️ Stylometric/Statistical

         - Authorship attribution for neural text generation Paper GitHub

         - Gpt-who: An information density-based machine-generated text detector Paper

        ▶️ Perplexity

         - LLMDet: A Third Party Large Language Models Generated Text Detection Tool Paper GitHub

        ▶️ Style Representation

         - Few-Shot Detection of Machine-Generated Text using Style Representations Paper

        ▶️ Origin Tracing

         - Origin Tracing and Detecting of LLMs Paper

  ♣️ Generalization

        ▶️ Structured Search

         - Ghostbuster: Detecting Text Ghostwritten by Large Language Models Paper

        ▶️ Contrastive Learning

         - Conda: Contrastive domain adaptation for ai-generated text detection Paper GitHub

  ♣️ Interpretability

        ▶️ N-gram Overlaps

         - DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text Paper GitHub

        ▶️ P-values

         - A Watermark for Large Language Models Paper GitHub

        ▶️ Shapley Additive Explanations

         - Chatgpt or human? detect and explain. explaining decisions of machine learning model for detecting short chatgpt-generated text Paper

         - Check Me If You Can: Detecting ChatGPT-Generated Academic Writing using CheckGPT Paper

        ▶️ Polish Ratio

         - Is chatgpt involved in texts? measure the polish ratio to detect chatgpt-generated text Paper

  ♣️ Robustness

        ▶️ Adversarial Data Augmentation

         - Is chatgpt involved in texts? measure the polish ratio to detect chatgpt-generated text Paper

         - Red Teaming Language Model Detectors with Language Models Paper

         - MGTBench: Benchmarking Machine-Generated Text Detection Paper GitHub

        ▶️ Adversarial Learning

         - Radar: Robust ai-text detection via adversarial learning Paper Project Page

         - Outfox: Llm-generated essay detection through in-context learning with adversarially generated examples Paper

        ▶️ Stylistic/Consistency

         - J-guard: Journalism guided adversarially robust detection of ai-generated news Paper

         - Intrinsic Dimension Estimation for Robust Detection of AI-Generated Texts Paper

  ♣️ Empirical Study

        ▶️ Generalization/Robustness

         - ChatLog: Recording and Analyzing ChatGPT Across Time Paper GitHub

         - On the Zero-Shot Generalization of Machine-Generated Text Detectors Paper

         - On the Generalization of Training-based ChatGPT Detection Methods Paper

         - Supervised Machine-Generated Text Detectors: Family and Scale Matters Paper GitHub

         - Deepfake Text Detection in the Wild Paper GitHub

        ▶️ Human Evaluation

         - How close is chatgpt to human experts? comparison corpus, evaluation, and detection Paper GitHub

         - Can LLM-Generated Misinformation Be Detected? Paper GitHub

        ▶️ Attribution

         - From Text to Source: Results in Detecting Large Language Model-Generated Content Paper

        ▶️ Paraphrase Detection

         - How large language models are transforming machine-paraphrased plagiarism Paper

         - Paraphrase Detection: Human vs. Machine Content Paper

        ▶️ Sample Complexity

         - On the Possibilities of AI-Generated Text Detection Paper

<p align="center"> 📸 Image </p>


Pure Detection

<figure> <img src="assets/image_pure.png" alt="image_pure"> <figcaption style="text-align: center;">Illustrations of pure detection methodologies for LAIM-generated image.</figcaption> </figure>

  ♣️ Physical/Physiological based Methods

         - Qualitative Failures of Image Generation Models and Their Application in Detecting Deepfakes Paper

         - Perspective (in)consistency of paint by text Paper

         - Lighting (in)consistency of paint by text Paper

  ♣️ Diffuser Fingerprints based Methods

         - Deep Image Fingerprint: Accurate And Low Budget Synthetic Image Detector Paper

         - DIRE for Diffusion-Generated Image Detection Paper GitHub

         - Exposing the Fake: Effective Diffusion-Generated Images Detection Paper

  ♣️ Spatial-based Methods

         - Rich and Poor Texture Contrast: A Simple yet Effective Approach for AI-generated Image Detection Paper Project Page

         - Unmasking The Artist: Discriminating Human-Drawn And AI-Generated Human Face Art Through Facial Feature Analysis Paper

         - Detecting images generated by deep diffusion models using their local intrinsic dimensionality Paper

  ♣️ Frequency-based Methods

         - Wavelet-packets for deepfake image analysis and detection Paper GitHub

         - AUSOME: authenticating social media images using frequency analysis Paper

         - AI-Generated Image Detection using a Cross-Attention Enhanced Dual-Stream Network Paper

         - Synthbuster: Towards Detection of Diffusion Model Generated Images Paper

Beyond Detection

<figure> <img src="assets/image_beyond.png" alt="image_beyond"> <figcaption style="text-align: center;">Illustrations of beyond detection methodologies for LAIM-generated image.</figcaption> </figure>

  ♣️ Attribution and Model Parsing

        ▶️ Attribution

         - Level up the deepfake detection: a method to effectively discriminate images generated by gan architectures and diffusion models Paper

        ▶️ Model Parsing

         - Reverse engineering of generative models: Inferring model hyperparameters from generated images Paper

  ♣️ Generalization

         - Online Detection of AI-Generated Images Paper

         - Towards universal fake image detectors that generalize across generative models Paper GitHub

         - Raising the Bar of AI-generated Image Detection with CLIP Paper

         - Transcending Forgery Specificity with Latent Space Augmentation for Generalizable Deepfake Detection Paper

         - Fingerprintnet: Synthesized fingerprints for generated image detection Paper

         - Detecting Deepfakes Without Seeing Any Paper GitHub

         - Improving Synthetically Generated Image Detection in Cross-Concept Settings Paper

         - Diffusion Noise Feature: Accurate and Fast Generated Image Detection Paper

  ♣️ Interpretability

         - Interpretable-through-prototypes deepfake detection for diffusion models Paper GitHub

  ♣️ Localization

        ▶️ Fully-supervised

         - Hierarchical fine-grained image forgery detection and localization Paper GitHub

         - Perceptual Artifacts Localization for Image Synthesis Tasks Paper GitHub

         - TruFor: Leveraging all-round clues for trustworthy image forgery detection and localization Paper GitHub

        ▶️ Weakly-supervised

         - Weakly-supervised deepfake localization in diffusion-generated images Paper

  ♣️ Robustness

        ▶️ Spatial-based

         - GLFF: Global and Local Feature Fusion for AI-synthesized Image Detection Paper

         - Exposing fake images generated by text-to-image diffusion models Paper

         - Local Statistics for Generative Image Detection Paper

        ▶️ Frequency-based

         - D4: Detection of Adversarial Diffusion Deepfakes Using Disjoint Ensembles Paper

  ♣️ Empirical Study

         - On the detection of synthetic images generated by diffusion models Paper GitHub

         - Intriguing properties of synthetic images: from generative adversarial networks to diffusion models Paper

         - Towards the detection of diffusion model deepfakes Paper

         - Unveiling the Impact of Image Transformations on Deepfake Detection: An Experimental Analysis Paper

         - On the use of Stable Diffusion for creating realistic faces: from generation to detection Paper

         - Finding AI-Generated Faces in the Wild Paper

         - Forensic analysis of synthetically generated western blot images Paper

         - Beyond Human Forgeries: An Investigation into Detecting Diffusion-Generated Handwriting Paper

         - Organic or Diffused: Can We Distinguish Human Art from AI-generated Images? Paper

<p align="center">🎞️ Video</p>


<p align="center"> <img src="assets/video_detection.png" alt="Video Detection" width="400">

<span>Illustration of detection methodology in generalization task for LAIM-generated video. </span>

</p>

Beyond Detection

  ♣️ Generalization

         - Revisiting Generalizability in Deepfake Detection: Improving Metrics and Stabilizing Transfer Paper

<p align="center">🎵 Audio</p>


Pure Detection

<p align="center"> <img src="assets/audio.png" alt="Audio Detection">

<span>The artifacts introduced by DM-based neural vocoders (WaveGrad and DiffWave) to a voice signal. The differences in mel-spectrograms between real and generated ones are illustrated in the third and fifth columns.</span>

</p>

  ♣️ Vocoder-based

         - AI-Synthesized Voice Detection Using Neural Vocoder Artifacts Paper GitHub

<p align="center">🍯 Multimodal</p>


Pure Detection

<p align="center"> <img src="assets/multimodal_pure.png" alt="Multimodal Detection" >

<span>Illustrations of pure detection methodologies for LAIM-generated multimodal media.</span>

</p>

  ♣️ Text-assisted

         - Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images Paper

  ♣️ Text-image Inconsistency

         - Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News Paper GitHub

         - Exposing Text-Image Inconsistency Using Diffusion Models Paper

Beyond Detection

<p align="center"> <img src="assets/multimodal_beyond.png" alt="Multimodal Detection">

<span>Illustrations of beyond detection methodologies for LAIM-generated multimodal media.</span>

</p>

  ♣️ Attribution

         - De-fake: Detection and attribution of fake images generated by text-to-image generation models Paper

  ♣️ Generalization

        ▶️ Prompt Tuning

         - AntifakePrompt: Prompt-Tuned Vision-Language Models are Fake Image Detectors Paper GitHub

        ▶️ Contrastive Learning

         - Generalizable Synthetic Image Detection via Language-guided Contrastive Learning Paper GitHub

  ♣️ Interpretability

         - Combating Misinformation in the Era of Generative AI Models Paper

  ♣️ Localization

        ▶️ Spatial-based

         - Detecting and grounding multi-modal media manipulation Paper

         - Exploiting Modality-Specific Features For Multi-Modal Manipulation Detection And Grounding Paper

        ▶️ Frequency-based

         - Unified Frequency-Assisted Transformer Framework for Detecting and Grounding Multi-Modal Manipulation Paper

  ♣️ Empirical Study

         - Detecting Images Generated by Diffusers Paper GitHub

         - CLIPping the Deception: Adapting Vision-Language Models for Universal Deepfake Detection Paper

         - VERITE: a Robust benchmark for multimodal misinformation detection accounting for unimodal bias Paper GitHub

         - Can ChatGPT Detect DeepFakes? A Study of Using Multimodal Large Language Models for Media Forensics Paper

Detection Tools

| Modality | Tool | Company | Link |
| --- | --- | --- | --- |
| Text | AI Content Detector | Copyleaks | Link |
| Text | AI Content Detector, ChatGPT detector | ZeroGPT | Link |
| Text | AI Content Detector | Winston AI | Link |
| Text | AI Content Detector | Crossplag | Link |
| Text | Giant Language model Test Room | GLTR | Link |
| Text | The AI Detector | Content at Scale | Link |
| Text | AI Checker | Originality ai | Link |
| Text | Advanced AI Detector and Humanizer | Undetectable ai | Link |
| Text | AI Content Detector | Writer | Link |
| Text | AI Content Detector | Conch | Link |
| Text | Illuminarty Text | Illuminarty | Link |
| Text | AI-Generated Text Detector | Is it AI | Link |
| Text | AI Detector Efficacy Research Tool | Originality ai | Link |
| Image | AI or Not image | AI or Not | Link |
| Image | AI-Generated Image Detector | Is it AI | Link |
| Image | Illuminarty Image | Illuminarty | Link |
| Image | SynthID | Google | Link |
| Image | Advanced AI Image Detector | Content at Scale | Link |
| Image | AI Image Detector | Huggingface | Link |
| Audio | AI or Not audio | AI or Not | Link |