This repository contains the implementation code for the ECCV 2024 accepted paper:

**Leveraging Representations from Intermediate Encoder-blocks for Synthetic Image Detection** ([arXiv:2402.19091](https://arxiv.org/abs/2402.19091))

<u>Christos Koutlis</u>, <u>Symeon Papadopoulos</u>
Figure 1. The RINE architecture. A batch of $b$ images is processed by CLIP's image encoder. The concatenation of the $n$ $d$-dimensional CLS tokens (one from each Transformer block) is first projected and then multiplied with the blocks' scores, estimated by the Trainable Importance Estimator (TIE) module. Summation across the second dimension results in one feature vector per image. Finally, after the second projection and the subsequent classification head modules, two loss functions are computed. The binary cross-entropy $\mathfrak{L}_{CE}$ directly optimizes SID, while the contrastive loss $\mathfrak{L}_{Cont.}$ assists training by forming a dense feature-vector cluster per class.
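The weighted aggregation described in the caption can be sketched as follows. This is a minimal NumPy illustration, not the trained model: the sizes, the random weights, and the softmax normalization of the TIE scores are all placeholder assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: batch, Transformer blocks, CLS dim, projection dim
b, n, d, proj_dim = 4, 12, 768, 128

# Stand-in for the n per-block CLS tokens of each image: shape (b, n, d)
cls_tokens = rng.standard_normal((b, n, d))

# First projection: a shared linear map d -> proj_dim (random placeholder weights)
W1 = rng.standard_normal((d, proj_dim)) / np.sqrt(d)
projected = cls_tokens @ W1                                # (b, n, proj_dim)

# TIE stand-in: one importance score per block, normalized with a softmax here
tie_logits = rng.standard_normal(n)
scores = np.exp(tie_logits) / np.exp(tie_logits).sum()     # (n,), sums to 1

# Weight each block's projected CLS token and sum over the block axis,
# yielding one feature vector per image
features = (projected * scores[None, :, None]).sum(axis=1)  # (b, proj_dim)

print(features.shape)
```

The key point is the reduction over the block axis: every intermediate block contributes to the final representation, with the TIE scores deciding how much.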
## News

- :tada: 4/7/2024 Paper acceptance at ECCV 2024
- :sparkles: 29/2/2024 Pre-print release --> [arXiv:2402.19091](https://arxiv.org/abs/2402.19091)
- :boom: 29/2/2024 Code and checkpoints release
## Setup

Clone the repository:

```bash
git clone https://github.com/mever-team/rine
```

Create the environment:

```bash
conda create -n rine python=3.9
conda activate rine
conda install pytorch==2.1.1 torchvision==0.16.1 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install -r requirements.txt
```
Store the datasets in `data/`:

- Download the `ProGAN` training & validation sets, and the `GAN-based`, `Deepfake`, `Low-level-vision`, and `Perceptual loss` test sets as described in https://github.com/PeterWang512/CNNDetection
- Download the `Diffusion` test data as described in https://github.com/Yuheng-Li/UniversalFakeDetect
- Download the `Latent Diffusion` training data as described in https://github.com/grip-unina/DMimageDetection
- Download the `Synthbuster` dataset as described in https://zenodo.org/records/10066460
- Download the `MSCOCO` dataset from https://cocodataset.org/#home
The `data/` directory should look like:

```
data
├── coco
├── latent_diffusion_trainingset
├── RAISEpng
├── synthbuster
├── train
│   ├── airplane
│   ├── bicycle
│   └── ...
├── val
│   ├── airplane
│   ├── bicycle
│   └── ...
└── test
    ├── progan
    ├── cyclegan
    ├── biggan
    ├── ...
    └── diffusion_datasets
        ├── guided
        ├── ldm_200
        └── ...
```
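Since the scripts expect this exact layout, a quick sanity check before training can save a failed run. A minimal sketch (the helper name and the choice to check only the top-level folders are assumptions, not part of the repo):

```python
from pathlib import Path

# Top-level folders the layout above expects under data/
EXPECTED = ["coco", "latent_diffusion_trainingset", "RAISEpng",
            "synthbuster", "train", "val", "test"]

def missing_dirs(root="data"):
    """Return the expected top-level folders that are absent under root."""
    root = Path(root)
    return [name for name in EXPECTED if not (root / name).is_dir()]

print(missing_dirs())  # an empty list means the top level matches
```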
## Evaluation

To evaluate the 1-class, 2-class, and 4-class checkpoints, as well as the LDM-trained model provided in `ckpt/`, run `python scripts/validation.py`. The results will be displayed in the terminal.

To get all the reported results (figures, tables) of the paper, run `python scripts/results.py`.
## Re-run experiments

To reproduce the conducted experiments, re-run in the following order:

- the 1-epoch hyperparameter grid experiments with `python scripts/experiments.py`
- the ablation study with `python scripts/ablations.py`
- the training duration experiments with `python scripts/epochs.py`
- the training set size experiments with `python scripts/dataset_size.py`
- the perturbation experiments with `python scripts/perturbations.py`
- the LDM training experiments with `python scripts/diffusion.py`

Finally, to save the best 1-class, 2-class, and 4-class models (already stored in `ckpt/`), run `python scripts/best.py`, which re-trains the best configurations and stores the corresponding trainable model parts.
With this code snippet the whole project can be reproduced:

```python
import subprocess

subprocess.run("python scripts/experiments.py", shell=True)
subprocess.run("python scripts/ablations.py", shell=True)
subprocess.run("python scripts/epochs.py", shell=True)
subprocess.run("python scripts/dataset_size.py", shell=True)
subprocess.run("python scripts/perturbations.py", shell=True)
subprocess.run("python scripts/diffusion.py", shell=True)
subprocess.run("python scripts/best.py", shell=True)
subprocess.run("python scripts/validation.py", shell=True)
subprocess.run("python scripts/results.py", shell=True)
```
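Because each stage builds on the outputs of the previous ones, it can be safer to abort on the first failure rather than let later stages run against a broken intermediate state. A sketch of that variant (the `run_pipeline` helper is an assumption, not part of the repo):

```python
import subprocess

# Ordered stages of the reproduction pipeline (script names from this repo)
SCRIPTS = ["experiments", "ablations", "epochs", "dataset_size",
           "perturbations", "diffusion", "best", "validation", "results"]

def run_pipeline(scripts=SCRIPTS):
    """Run each stage in order, stopping at the first non-zero exit code."""
    for name in scripts:
        # check=True raises CalledProcessError if a stage fails,
        # so later stages never run on a broken intermediate state
        subprocess.run(["python", f"scripts/{name}.py"], check=True)

# run_pipeline()  # uncomment to execute the full reproduction
```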
## Demo

In `demo/`, we also provide code for inference on one real and one fake image from the DALL-E generative model. To run the demo, use `python demo/demo.py`.
Citation
@InProceedings{10.1007/978-3-031-73220-1_23,
author="Koutlis, Christos
and Papadopoulos, Symeon",
editor="Leonardis, Ale{\v{s}}
and Ricci, Elisa
and Roth, Stefan
and Russakovsky, Olga
and Sattler, Torsten
and Varol, G{\"u}l",
title="Leveraging Representations from Intermediate Encoder-Blocks for Synthetic Image Detection",
booktitle="Computer Vision -- ECCV 2024",
year="2025",
publisher="Springer Nature Switzerland",
address="Cham",
pages="394--411",
isbn="978-3-031-73220-1"
}
## Contact

Christos Koutlis (ckoutlis@iti.gr)