A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others (CVPR 2023)

Zhiheng Li, Ivan Evtimov, Albert Gordo, Caner Hazirbas, Tal Hassner, Cristian Canton Ferrer, Chenliang Xu, Mark Ibrahim

[paper]

<div align="center"> <img src="assets/teaser.png" width="100%" height="100%"/> </div><br/>

TL;DR: Our benchmark results on UrbanCars and ImageNet reveal the overlooked Whac-A-Mole dilemma in shortcut mitigation, i.e., mitigating one shortcut amplifies reliance on other shortcuts.

ImageNet-W

We discover a new watermark shortcut in ImageNet and create the ImageNet-W test set to study (1) state-of-the-art vision models' reliance on the watermark shortcut and (2) reliance on multiple shortcuts on ImageNet when ImageNet-W is used along with other out-of-distribution variants of ImageNet (e.g., ImageNet-R).

Use ImageNet-W

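pip install imagenet-w

import torchvision.transforms as transforms
from imagenet_w import AddWatermark
from torchvision.datasets import ImageNet

root = "/path/to/imagenet"  # placeholder: directory that holds the ImageNet data
resize_size = 256
crop_size = 224

# Commonly used ImageNet channel statistics; assumed here because the
# original snippet leaves `normalize` undefined.
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

test_transform = transforms.Compose(
    [
        transforms.Resize(resize_size),
        transforms.CenterCrop(crop_size),
        transforms.ToTensor(),
        AddWatermark(crop_size),  # insert AddWatermark before normalize
        normalize,
    ]
)

# The validation split with the watermark transform applied is ImageNet-W.
imagenet_w = ImageNet(root, split="val", transform=test_transform)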
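
Once `imagenet_w` is built, a natural next step is to measure a model's top-1 accuracy on it. The helper below is a minimal sketch and not part of the `imagenet-w` package; the name `top1_accuracy` and its arguments are hypothetical, and it assumes the loader yields `(images, labels)` batches and that the model returns class logits.

```python
import torch


@torch.no_grad()
def top1_accuracy(model, loader, device="cpu"):
    """Return top-1 accuracy (in percent) of `model` over all batches in `loader`."""
    model.eval()
    model.to(device)
    correct, total = 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        # The predicted class is the index of the largest logit.
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return 100.0 * correct / total
```

For example, wrapping the dataset as `torch.utils.data.DataLoader(imagenet_w, batch_size=64)` and passing it to `top1_accuracy` would give the accuracy on ImageNet-W; the batch size here is only an illustrative choice.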

Requirements

pip install -r requirements.txt

UrbanCars Experiments

We construct UrbanCars, a new dataset with multiple shortcuts (i.e., background and co-occurring object), which facilitates the study of multi-shortcut learning in a controlled setting.

Generate UrbanCars Dataset

bash scripts/prepare_dataset_models/create_urbancars.sh

Train Shortcut Mitigation Methods on UrbanCars

Use the shell scripts in scripts/train_urbancars to run each method, e.g.:

bash scripts/train_urbancars/$METHOD.sh

where $METHOD should be replaced by method names listed in scripts/train_urbancars.
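
To see which values `$METHOD` can take without browsing the directory by hand, the script names can be listed programmatically. This is a small sketch that assumes every method corresponds to one `$METHOD.sh` file in `scripts/train_urbancars`, as the command above suggests; `list_methods` and `build_command` are hypothetical helpers, not part of this repository.

```python
from pathlib import Path


def list_methods(script_dir="scripts/train_urbancars"):
    """Return method names, i.e. the stems of the *.sh files in `script_dir`."""
    return sorted(p.stem for p in Path(script_dir).glob("*.sh"))


def build_command(method, script_dir="scripts/train_urbancars"):
    """Return the argument list that runs the training script for `method`."""
    return ["bash", str(Path(script_dir) / f"{method}.sh")]
```

The returned list can be passed to `subprocess.run(..., check=True)` to launch one method, or used in a loop over `list_methods()` to run them all in sequence.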


ImageNet Experiments

Prepare ImageNet and its out-of-distribution variants

See prepare_ImageNet.md

Prepare models for evaluating shortcut reliance

See prepare_checkpoints_for_eval.md

Evaluate state-of-the-art vision models' watermark shortcut reliance

PYTHONPATH=.:$PYTHONPATH python eval_shortcuts/eval_watermark_shortcut.py

Evaluate reliance on multiple shortcuts

PYTHONPATH=.:$PYTHONPATH python eval_shortcuts/eval_multiple_shortcuts.py

Training

We use last layer retraining for ImageNet experiments.

PYTHONPATH=.:$PYTHONPATH python imagenet_trainers/launcher.py --method ${METHOD} --amp --feature_extractor resnet50_erm --lr ${LR} [--wandb] [--slurm_partition ${SLURM_PARTITION}] [--slurm_job_name ${METHOD}_imagenet]

| method | architecture | IN-1k | IN-W Gap | Carton Gap | SIN Gap | IN-R Gap | IN-9 Gap | LR | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ERM | ResNet-50 | 76.39 | -25.40 | +30 | -69.43 | -56.22 | -5.19 | 1e-3 | model |
| Mixup | ResNet-50 | 76.17 | -24.87 | +34 | -68.18 | -55.79 | -5.60 | 1e-4 | model |
| CutMix | ResNet-50 | 75.90 | -25.78 | +32 | -69.31 | -56.36 | -5.65 | 1e-4 | model |
| Cutout | ResNet-50 | 76.40 | -25.11 | +32 | -69.39 | -55.93 | -5.35 | 1e-3 | model |
| AugMix | ResNet-50 | 76.23 | -23.41 | +38 | -68.51 | -54.91 | -5.85 | 1e-4 | model |
| SD | ResNet-50 | 76.39 | -26.03 | +30 | -69.42 | -56.36 | -5.33 | 1e-3 | model |
| WTM Aug | ResNet-50 | 76.32 | -5.78 | +14 | -69.31 | -56.22 | -5.34 | 1e-3 | model |
| TXT Aug | ResNet-50 | 75.94 | -25.93 | +36 | -63.99 | -53.24 | -5.66 | 1e-4 | model |
| BG Aug | ResNet-50 | 76.03 | -25.01 | +36 | -68.41 | -54.51 | -4.67 | 1e-4 | model |
| LfF | ResNet-50 | 76.35 | -26.19 | +36 | -69.34 | -56.02 | -5.61 | 1e-4 | model |
| JTT | ResNet-50 | 76.33 | -26.40 | +32 | -69.48 | -56.30 | -5.55 | 1e-2 | model |
| EIIL | ResNet-50 | 71.51 | -33.17 | +24 | -65.93 | -61.09 | -6.27 | 1e-4 | model |
| DebiAN | ResNet-50 | 76.33 | -26.40 | +36 | -69.37 | -56.29 | -5.53 | 1e-4 | model |
| LLE (ours) | ResNet-50 | 76.25 | -6.18 | +10 | -61.00 | -54.89 | -3.82 | 1e-3 | model |
| MAE + LLE (ours) | ViT-B | 83.68 | -2.48 | +6 | -58.78 | -44.96 | -3.70 | 1e-3 | model |
| MAE + LLE (ours) | ViT-L | 85.84 | -1.74 | +12 | -56.32 | -34.64 | -2.77 | 1e-3 | model |
| MAE + LLE (ours) | ViT-H | 86.84 | -1.11 | +28 | -55.69 | -30.95 | -2.35 | 1e-3 | model |
| SWAG + LLE (ours) | ViT-B | 85.37 | -2.50 | +8 | -60.92 | -28.37 | -3.19 | 1e-4 | model |
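
As a reading aid for the "Gap" columns, the sketch below computes a generic accuracy gap. It assumes that a gap is the accuracy on the shifted test set minus the accuracy on IN-1k, in percentage points, so that a negative value means a drop. That reading is an assumption; the exact metrics, including the class-specific Carton Gap, are defined by the evaluation scripts and the paper rather than by this sketch. The function names are hypothetical.

```python
def accuracy(correct_flags):
    """Return accuracy in percent from a sequence of per-sample booleans."""
    return 100.0 * sum(correct_flags) / len(correct_flags)


def accuracy_gap(in1k_correct, shifted_correct):
    """Return shifted-set accuracy minus IN-1k accuracy, in percentage points.

    Assumption: negative values indicate that the model loses accuracy
    when moving from IN-1k to the shifted test set.
    """
    return accuracy(shifted_correct) - accuracy(in1k_correct)
```

Under this assumed definition, a model that is 75% accurate on the original images and 25% accurate on the shifted ones would have a gap of -50 points.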

In our proposed Last Layer Ensemble (LLE) method, we also use edge detection for data augmentation, i.e., Edge Aug. The details of how to generate edge detection data on ImageNet and the checkpoints are in Edge_Aug.md.
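
The last layer retraining used for the ImageNet experiments can be pictured as freezing a pretrained feature extractor and updating only a linear classification head. The following is a minimal sketch of that general idea, not the code in `imagenet_trainers`; the function name, the SGD optimizer, and the default hyperparameters are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


def retrain_last_layer(feature_extractor, head, batches, lr=1e-3, epochs=1):
    """Train only `head` on top of a frozen `feature_extractor`.

    `batches` is any iterable of (images, labels) pairs.
    """
    # Freeze the backbone so that no gradients reach its parameters.
    for param in feature_extractor.parameters():
        param.requires_grad_(False)
    feature_extractor.eval()

    # Assumption: plain SGD; the repository may use a different optimizer.
    optimizer = torch.optim.SGD(head.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()

    head.train()
    for _ in range(epochs):
        for images, labels in batches:
            # Features are computed without tracking gradients.
            with torch.no_grad():
                features = feature_extractor(images)
            loss = criterion(head(features), labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return head
```

Because only the head receives gradient updates, the features could also be computed once and cached, which keeps retraining cheap compared with fine-tuning the full network.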

Evaluation

To evaluate the trained models, download the checkpoint from the table above and use its file path as ${PATH_TO_CHECKPOINT}:

PYTHONPATH=.:$PYTHONPATH python imagenet_trainers/launcher.py --method ${METHOD} --amp --feature_extractor resnet50_erm [--wandb] [--slurm_partition ${SLURM_PARTITION}] [--slurm_job_name ${METHOD}_imagenet] --evaluate --resume ${PATH_TO_CHECKPOINT}

<a name="CitingWhacAMole"></a>Citation

If you use the UrbanCars or ImageNet-W dataset, or compare against our proposed Last Layer Ensemble (LLE) method, please cite our paper:

@InProceedings{Li_2023_CVPR_Whac_A_Mole,
    author    = {Li, Zhiheng and Evtimov, Ivan and Gordo, Albert and Hazirbas, Caner and Hassner, Tal and Ferrer, Cristian Canton and Xu, Chenliang and Ibrahim, Mark},
    title     = {A Whac-a-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {20071-20082}
}

License

See LICENSE for details.

Attribution

<a href="https://www.flaticon.com/free-icons/whack-a-mole" title="whack a mole icons">The Whack-A-Mole icon is created by Flat Icons - Flaticon</a>