Masked Images Are Counterfactual Samples for Robust Fine-tuning

This repository is the official PyTorch implementation of "Masked Images Are Counterfactual Samples for Robust Fine-tuning" [paper], accepted at CVPR 2023.

Updates

Setups

0. System environment

Our experiments are conducted on:

1. Python environment

2. Prepare datasets

The data directory (DATA_DIR) should contain the following sub-directories:

3. Setup directories in run.sh

Please modify lines 3-6 of the main script run.sh to set the proper directories:

Code usage

The bash script run.sh provides a uniform, simplified interface to the Python scripts for training and evaluation. It accepts the following arguments:

The following commands show an example of fine-tuning a CLIP ViT-B/32 model with our proposed method, using object-mask (threshold 0.3) and single-fill. Please refer to example.sh for more examples.

# Build the zero-shot model
CUDA_VISIBLE_DEVICES=0 bash run.sh train 'clip_ViT-B/32' 'zeroshot' '' 0
# Fine-tune using our approach
CUDA_VISIBLE_DEVICES=0,1,2,3 bash run.sh train 'clip_ViT-B/32' 'FT_FD_image_mask' 'ObjMaskSingleFill(0.3)' 0
# Evaluate the fine-tuned model (replace `train` with `eval`)
CUDA_VISIBLE_DEVICES=0,1,2,3 bash run.sh eval 'clip_ViT-B/32' 'FT_FD_image_mask' 'ObjMaskSingleFill(0.3)' 0
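
The three commands above can also be chained so that a later step only starts if the earlier one succeeded. The following is a minimal sketch of such a wrapper, not part of this repository: the function names `run_step` and `pipeline` and the `DRY_RUN` variable are hypothetical helpers introduced for illustration, and the last positional argument (`0`) is simply copied unchanged from the example commands. It assumes it is run from the directory that contains run.sh.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (an illustration, not shipped with the repository):
# runs the three documented steps in order and stops at the first failure.

MODEL='clip_ViT-B/32'
METHOD='FT_FD_image_mask'
MASK='ObjMaskSingleFill(0.3)'
LAST_ARG=0   # final positional argument, copied as-is from the example commands

# run_step GPUS ARGS...
# With DRY_RUN=1, print the command that would be run; otherwise execute it.
run_step() {
    gpus="$1"
    shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf 'CUDA_VISIBLE_DEVICES=%s bash run.sh' "$gpus"
        for arg in "$@"; do
            printf " '%s'" "$arg"
        done
        printf '\n'
    else
        CUDA_VISIBLE_DEVICES="$gpus" bash run.sh "$@"
    fi
}

# Build the zero-shot model, fine-tune, then evaluate.
pipeline() {
    run_step 0       train "$MODEL" 'zeroshot' ''      "$LAST_ARG" &&
    run_step 0,1,2,3 train "$MODEL" "$METHOD"  "$MASK" "$LAST_ARG" &&
    run_step 0,1,2,3 eval  "$MODEL" "$METHOD"  "$MASK" "$LAST_ARG"
}
```

After sourcing the snippet, `DRY_RUN=1 pipeline` prints the three commands without launching anything, which is a quick way to check the quoting of the empty third argument of the zero-shot step; calling `pipeline` on its own runs them for real.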

Results

(WIP)

Acknowledgement

Some of the code in this repository is based on the following repositories: