
HiFi_IFDL

This is the source code for our CVPR 2023 paper: "Hierarchical Fine-Grained Image Forgery Detection and Localization." [Arxiv]

Authors: Xiao Guo, Xiaohong Liu, Zhiyuan Ren, Steven Grosz, Iacopo Masi, Xiaoming Liu

<a name="update"></a> Updates.

[Sep 2024] 👏 The International Journal of Computer Vision (IJCV) has accepted the extended version of HiFi-Net. Stay tuned~
[Aug 2024] HiFi-Net is integrated into the DeepFake-o-meter v2.0 platform, a user-friendly public detection tool designed by the University at Buffalo. [DeepFake-o-meter v2.0] [ArXiv]
[Jul 2024] 👏 The ECCV 2024 "Deepfake Explainer" paper [ArXiv] reports HiFi-Net's deepfake detection performance, and the source code is released [link].
[Sep 2023] The first version dataset can be acquired via this link: Dataset Link
[Sep 2023] The DARPA-sponsored image forensic demo can be viewed at this link: Demo
[June 2023] The extended version of our work has been submitted to one of the ~~Machine Learning Journals~~ IJCV.
This repository will be kept up to date. Please stay tuned~

Short 5 Min Video

Usage on Manipulation Localization (e.g., Columbia, Coverage, CASIA, NIST16 and IMD2020)

Create the environment with
```
conda env create -f environment.yml
```
or manually install PyTorch 1.11.0 and torchvision 0.12.0 under Python 3.7.16.
Go to localization_weights_link to download the weights, and then put them in weights.
To apply the pre-trained model to the images in ./data_dir and write the results to ./viz_eval, please run
```
bash HiFi_Net_loc.sh
```
More quantitative and qualitative results can be found at: csv and qualitative results.
To reproduce the results above, download the 5 datasets via link and unzip the archive with tar -xvf data.tar.gz. Then uncomment this line and run HiFi_Net_loc.sh.
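
As a quick sanity check before running the script, the sketch below compares installed package versions against the ones listed above. The helper names (`base_version`, `version_matches`, `check_environment`) are illustrative and not part of this repository. The only assumption is that PyTorch builds may report a local build suffix such as `+cu113` in `torch.__version__`, which the comparison ignores.

```python
# Versions named in this README for a manual install.
REQUIRED = {"torch": "1.11.0", "torchvision": "0.12.0"}


def base_version(version):
    """Drop a local build suffix such as '+cu113' from a version string."""
    return version.split("+", 1)[0]


def version_matches(installed, required):
    """Return True when both versions are equal once build suffixes are dropped."""
    return base_version(installed) == base_version(required)


def check_environment(installed):
    """Return the names of required packages that are missing or mismatched.

    `installed` maps a package name to its reported version string.
    """
    return [
        name
        for name, required in REQUIRED.items()
        if name not in installed or not version_matches(installed[name], required)
    ]
```

In a real environment the mapping could be built from `torch.__version__` and `torchvision.__version__`; an empty list means both packages match the listed versions.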

Usage on Detection and Localization for general forged content, including GAN- and diffusion-generated images:

This reproduces the detection and localization results on the HiFi-IFDL dataset (Tab. 2 and Supplementary Fig. 1).
Go to HiFi_IFDL_weights_link to download the weights, and then put them in weights.
Quick usage of HiFi_Net:

  from HiFi_Net import HiFi_Net 
  from PIL import Image
  import numpy as np

  HiFi = HiFi_Net()   # initialize
  img_path = 'asset/sample_1.jpg'

  ## detection
  res3, prob3 = HiFi.detect(img_path)
  # print(res3, prob3)  -> example output: 1 1.0
  HiFi.detect(img_path, verbose=True)   # same call, with verbose output

  ## localization
  binary_mask = HiFi.localize(img_path)
  # scale the mask to 0-255 and save it as an 8-bit image
  binary_mask = Image.fromarray((binary_mask*255.).astype(np.uint8))
  binary_mask.save('pred_mask.png')

Quick Start of Source Code

A quick view of the code structure:

./HiFi_IFDL
    ├── HiFi_Net_loc.py (localization files)
    ├── HiFi_Net_loc.sh (localization evaluation)
    ├── HiFi_Net.py (API for the user input image.)
    ├── IMD_dataloader.py (call dataloaders in the utils folder)
    ├── model (model module folder)
    │      ├── NLCDetection_pconv.py (partial convolution, localization, and classification modules)
    │      ├── seg_hrnet.py (feature extractor based on HRNet)
    │      ├── LaPlacianMs.py (laplacian filter on the feature map)
    │      ├── GaussianSmoothing.py (self-made smoothing functions)
    │      └── ...   
    ├── utils (utils, dataloader, and localization loss class.)
    │      ├── custom_loss.py (localization loss class and the real pixel center initialization)
    │      ├── utils.py
    │      ├── load_data.py (loading training and val dataset.)
    │      └── load_edata.py (loading inference dataset.)
    ├── asset (folder contains sample images with their ground truth and predictions.)
    ├── weights (put the pre-trained weights in.)
    ├── center (The pre-computed `.pth` file for the HiFi-IFDL dataset.)
    └── center_loc (The pre-computed `.pth` file for the localization task (Tab.3 in the paper).)
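
A small pre-flight check can confirm that the entries holding weights and pre-computed files are present before running the scripts. The sketch below assumes the layout shown above; `missing_entries` and `EXPECTED_ENTRIES` are hypothetical names, not part of the repository.

```python
from pathlib import Path

# Entries from the layout above that hold downloaded or pre-computed files.
EXPECTED_ENTRIES = ("weights", "center", "center_loc")


def missing_entries(root, expected=EXPECTED_ENTRIES):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    # Run from the repository root; prints an empty list when all are present.
    print(missing_entries("."))
```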

Questions and Answers

Q1. Why are the train and val datasets in the same path?

A1. For each forgery method, we save both train and val images in the SAME folder and use a text file to select the training and val images. The text file contains a list of image names: the first val_num are used for training and the last val_num for validation. Please refer to code for details. In addition, we build our code on top of PSCC-Net, which adopts the same style of data loading; please compare code1 with code2.
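
The head/tail convention in A1 can be sketched as follows. This is an illustration only, not the repository's loader: `read_names` and `split_names` are hypothetical helpers, and the two counts are passed as separate hypothetical parameters (`train_num`, `val_num`) so the sketch does not assume the values used in the actual code.

```python
def read_names(txt_path):
    """Read one image name per line, skipping blank lines."""
    with open(txt_path) as handle:
        return [line.strip() for line in handle if line.strip()]


def split_names(names, train_num, val_num):
    """Take the first `train_num` names for training and the last `val_num`
    names for validation from one ordered list."""
    if train_num < 0 or val_num < 0:
        raise ValueError("counts must be non-negative")
    if max(train_num, val_num) > len(names):
        raise ValueError("a count exceeds the number of names")
    train = names[:train_num]
    # Slice from an explicit start index so that val_num == 0 yields an empty list.
    val = names[len(names) - val_num:]
    return train, val
```

Because both subsets come from the same list in the same folder, the two slices overlap whenever `train_num + val_num` exceeds the number of names, so the counts should be chosen with the list length in mind.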

Q2. What is the dataset naming for the STGAN and FaceShifter sections?

A2. Please check the STGAN.txt in this link, which contains all manipulated/modified images we used for training and validation. This txt file is loaded by this line of code, which also indicates the corresponding masks. Lastly, I am not sure whether I have released the authentic images; if not, you can find them in the public CelebA-HQ dataset. I will try to provide stricter naming for the dataset in the near future.

Reference

If you would like to use our work, please cite:

@inproceedings{hifi_net_xiaoguo,
  author = { Xiao Guo and Xiaohong Liu and Zhiyuan Ren and Steven Grosz and Iacopo Masi and Xiaoming Liu },
  title = { Hierarchical Fine-Grained Image Forgery Detection and Localization },
  booktitle = { CVPR },
  year = { 2023 },
}