Image Obfuscation Benchmark

This repository contains the code to evaluate models on the image obfuscation benchmark, first presented in Benchmarking Robustness to Adversarial Image Obfuscations (Stimberg et al., 2023).

Dataset

The dataset consists of 22 obfuscations plus the Clean data. 19 of the obfuscations are training obfuscations and 3 are hold-out obfuscations. All images are center-cropped to 224 x 224 and saved as compressed JPEG images. Each obfuscation is applied to every image in the ILSVRC2012 dataset. The file_name, label and obfuscation hyper-parameters are stored with each image. The dataset can be loaded through the TensorFlow Datasets API. Each combination of train / validation and an obfuscation is its own split. For example, to load the validation split obfuscated with the StyleTransfer obfuscation, run:

import tensorflow_datasets as tfds

ds = tfds.load('obfuscated_imagenet', split='validation_StyleTransfer', data_dir='/path/to/extracted/dataset/')

where the splits must be present in the /path/to/extracted/dataset/obfuscated_imagenet/1.0.0 directory.

To load multiple obfuscations together, e.g. for training, use the tf.data.Dataset.sample_from_datasets function.
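
A minimal sketch of how several obfuscation splits could be mixed for training. It assumes the training splits follow the same naming pattern as the validation example above (`train_<Obfuscation>`), which is inferred rather than confirmed here. The helper names `split_names` and `load_mixture` are hypothetical and not part of this repository. TensorFlow is imported inside the function so that the split-name helper can be used on its own.

```python
def split_names(obfuscations, subset="train"):
    """Builds split names such as 'train_StyleTransfer'.

    The '<subset>_<Obfuscation>' pattern is an assumption based on the
    'validation_StyleTransfer' example.
    """
    return [f"{subset}_{name}" for name in obfuscations]


def load_mixture(obfuscations, data_dir, subset="train", seed=0):
    """Loads one dataset per obfuscation and samples across all of them."""
    # Imported lazily so that split_names() works without TensorFlow installed.
    import tensorflow as tf
    import tensorflow_datasets as tfds

    datasets = [
        tfds.load("obfuscated_imagenet", split=split, data_dir=data_dir)
        for split in split_names(obfuscations, subset)
    ]
    # With no weights given, every dataset is sampled with equal probability.
    return tf.data.Dataset.sample_from_datasets(datasets, seed=seed)
```

For example, `load_mixture(["Clean", "StyleTransfer"], "/path/to/extracted/dataset/")` would return a single dataset that draws elements at random from both splits. Older TensorFlow releases expose the same functionality as `tf.data.experimental.sample_from_datasets`.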

Obfuscation Examples

Clean	AdversarialPatches	BackgroundBlurComposition

ColorNoiseBlocks	ColorPatternOverlay	Halftoning

HighContrastBorder	IconOverlay	ImageOverlay

Interleave	InvertLines	LineShift

LowContrastTriangles	PerspectiveComposition	PerspectiveTransform

PhotoComposition	RotateBlocks	RotateImage

StyleTransfer	SwirlWarp	TextOverlay

Texturize	WavyColorWarp

Download

You can download the validation and train splits for all obfuscations below. To load them with the TensorFlow Datasets API as described above, you also need to download these two JSON files: dataset_info.json, features.json.

Obfuscation	Validation	Train
Clean	tar (1.2 GB)	tar (31 GB)
AdversarialPatches	tar (1.4 GB)	tar (36 GB)
BackgroundBlurComposition	tar (0.5 GB)	tar (12 GB)
ColorNoiseBlocks	tar (1.9 GB)	tar (48 GB)
ColorPatternOverlay	tar (1.8 GB)	tar (45 GB)
Halftoning	tar (2.4 GB)	tar (54 GB)
HighContrastBorder	tar (2.1 GB)	tar (55 GB)
IconOverlay	tar (1.7 GB)	tar (43 GB)
ImageOverlay	tar (1.2 GB)	tar (30 GB)
Interleave	tar (1.5 GB)	tar (38 GB)
InvertLines	tar (1.4 GB)	tar (35 GB)
LineShift	tar (1.5 GB)	tar (37 GB)
LowContrastTriangles	tar (0.9 GB)	tar (23 GB)
PerspectiveComposition	tar (1.1 GB)	tar (29 GB)
PerspectiveTransform	tar (0.4 GB)	tar (9.5 GB)
PhotoComposition	tar (1.2 GB)	tar (31 GB)
RotateBlocks	tar (1.5 GB)	tar (37 GB)
RotateImage	tar (1.0 GB)	tar (24 GB)
StyleTransfer	tar (1.3 GB)	tar (34 GB)
SwirlWarp	tar (1.2 GB)	tar (29 GB)
TextOverlay	tar (2.1 GB)	tar (55 GB)
Texturize	tar (1.3 GB)	tar (34 GB)
WavyColorWarp	tar (1.3 GB)	tar (33 GB)
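
After downloading, the archives have to be extracted and the two JSON files made available to TensorFlow Datasets. The following is a rough sketch and not part of this repository. It assumes that the downloads are plain `.tar` files collected in one directory and that they unpack into the `obfuscated_imagenet/1.0.0` layout mentioned above; the internal layout of the archives is not confirmed here. It also assumes that the two JSON files belong in that version directory, which is where TensorFlow Datasets normally keeps them. The helper names are hypothetical.

```python
import tarfile
from pathlib import Path

# The two metadata files named in the download instructions.
REQUIRED_METADATA = ("dataset_info.json", "features.json")


def extract_archives(download_dir, data_dir):
    """Extracts every .tar file found in download_dir into data_dir."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    for archive in sorted(Path(download_dir).glob("*.tar")):
        with tarfile.open(archive) as tar:
            tar.extractall(data_dir)


def missing_metadata(data_dir, version="1.0.0"):
    """Returns the metadata files that are absent from the version directory."""
    version_dir = Path(data_dir) / "obfuscated_imagenet" / version
    return [name for name in REQUIRED_METADATA
            if not (version_dir / name).is_file()]
```

Calling `missing_metadata("/path/to/extracted/dataset/")` after extraction returns an empty list when both JSON files are in place, which is a quick check to run before calling `tfds.load`.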

Usage Instructions

Installing

Download the eval dataset and extract it to a folder.

Clone this repository.

git clone https://github.com/google-deepmind/image_obfuscation_benchmark.git

Execute run.sh to create and activate a virtualenv, install all necessary dependencies and run a test program to ensure that you can import all the modules.

cd image_obfuscation_benchmark
sh image_obfuscation_benchmark/run.sh

Evaluating a model

Activate the virtualenv created by run.sh:

source /tmp/image_obfuscation_benchmark/image_obfuscation_benchmark/bin/activate

Then run the prediction script:

python3 -m image_obfuscation_benchmark.eval.predict \
--dataset_path=/path/to/the/downloaded/dataset/ \
--model_path=https://tfhub.dev/google/imagenet/resnet_v2_50/classification/1 \
--evaluate_obfuscation=Clean \
--normalization=zero_one \
--output_dir=/tmp/

This writes the predictions to /tmp/Clean.csv. Repeat this for every obfuscation, then run

python3 -m image_obfuscation_benchmark.eval.gather_results \
--output_dir=/tmp/

which will load all the predictions, calculate the metrics and save them to /tmp/metrics.csv.
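
Because the prediction step has to be repeated for each obfuscation, it can be convenient to script the loop. The sketch below is not part of the repository. It only reuses the module names and flags shown above, and it assumes that `--evaluate_obfuscation` accepts each name exactly as spelled in the download table. The function names are hypothetical.

```python
import subprocess
import sys

# Clean plus the 22 obfuscations, spelled as in the download table.
OBFUSCATIONS = [
    "Clean", "AdversarialPatches", "BackgroundBlurComposition",
    "ColorNoiseBlocks", "ColorPatternOverlay", "Halftoning",
    "HighContrastBorder", "IconOverlay", "ImageOverlay", "Interleave",
    "InvertLines", "LineShift", "LowContrastTriangles",
    "PerspectiveComposition", "PerspectiveTransform", "PhotoComposition",
    "RotateBlocks", "RotateImage", "StyleTransfer", "SwirlWarp",
    "TextOverlay", "Texturize", "WavyColorWarp",
]


def predict_command(obfuscation, dataset_path, model_path, output_dir,
                    normalization="zero_one"):
    """Builds the predict command line for a single obfuscation."""
    return [
        sys.executable, "-m", "image_obfuscation_benchmark.eval.predict",
        f"--dataset_path={dataset_path}",
        f"--model_path={model_path}",
        f"--evaluate_obfuscation={obfuscation}",
        f"--normalization={normalization}",
        f"--output_dir={output_dir}",
    ]


def evaluate_all(dataset_path, model_path, output_dir):
    """Runs predict for every obfuscation, then gathers the metrics."""
    for name in OBFUSCATIONS:
        # check=True stops the loop as soon as one run fails.
        subprocess.run(
            predict_command(name, dataset_path, model_path, output_dir),
            check=True)
    subprocess.run(
        [sys.executable, "-m",
         "image_obfuscation_benchmark.eval.gather_results",
         f"--output_dir={output_dir}"],
        check=True)
```

Using `sys.executable` keeps the subprocesses on the interpreter that runs the script, so the script should be started from inside the activated virtualenv.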

Training a model

We do not currently supply code to train models on the dataset, but it can easily be loaded with tensorflow_datasets into any pipeline.

Ethical Considerations

The specific obfuscations used in our benchmark may have the potential to fool automatic filters and therefore increase the amount of harmful content on digital platforms. To reduce this risk, we decided against releasing the code to create the obfuscations systematically and instead release only the precomputed dataset and the code to evaluate on it.

Citing this work

If you use this code (or any derived code) in your work, please cite the accompanying paper:

@misc{stimberg2023benchmarking,
      title={Benchmarking Robustness to Adversarial Image Obfuscations},
      author={Florian Stimberg and Ayan Chakrabarti and Chun-Ta Lu and Hussein Hazimeh and Otilia Stretcu and Wei Qiao and Yintao Liu and Merve Kaya and Cyrus Rashtchian and Ariel Fuxman and Mehmet Tek and Sven Gowal},
      year={2023},
      eprint={2301.12993},
      archivePrefix={arXiv},
}

License and Disclaimer

All software is licensed under the Apache License, Version 2.0 (Apache 2.0); you may not use this file except in compliance with the License. You may obtain a copy of the Apache 2.0 license at

https://www.apache.org/licenses/LICENSE-2.0

All non-code materials are licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0). You may obtain a copy of the CC BY-NC License at:

https://creativecommons.org/licenses/by-nc/4.0/legalcode

You may not use the non-code portions of this file except in compliance with the CC BY-NC License.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

This is not an official Google product.