Code for the AAAI 2021 paper "How Does Data Augmentation Affect Privacy in Machine Learning?"
Dependency
This code is tested with torch 1.5 and numpy 1.14.
We use the benchmark datasets CIFAR10 and CIFAR100. The program downloads the dataset automatically on the first run.
Training target model
We use a random seed to generate each individual transformation: each integer seed corresponds to a different transformation. The random seeds chosen during training can therefore be used to regenerate the augmented instances that the model was trained on.
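The seed-to-transformation correspondence described above can be illustrated with a minimal sketch. It is not the code in cifar_train.py: the function name `augment`, the padding size, and the choice of random crop plus horizontal flip are assumptions made for illustration only.

```python
import numpy as np

def augment(image, seed, pad=4):
    """Hypothetical helper: random crop + horizontal flip fully determined by `seed`."""
    rng = np.random.RandomState(seed)  # all randomness comes from this seeded generator
    h, w, _ = image.shape
    # Zero-pad the spatial dimensions, then crop back to the original size.
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="constant")
    top = rng.randint(0, 2 * pad + 1)
    left = rng.randint(0, 2 * pad + 1)
    out = padded[top:top + h, left:left + w]
    # Flip horizontally with probability 0.5.
    if rng.rand() < 0.5:
        out = out[:, ::-1]
    return out

# A dummy 32x32 RGB image standing in for a CIFAR sample.
image = (np.arange(32 * 32 * 3) % 256).astype(np.uint8).reshape(32, 32, 3)

seeds = list(range(10))                            # e.g. 10 augmented instances per image
train_views = [augment(image, s) for s in seeds]   # what training would see
replayed = [augment(image, s) for s in seeds]      # regenerated later from the same seeds
```

Because every random draw comes from a generator seeded with the integer, storing the seeds is enough to reproduce the augmented instances exactly, without storing the images themselves.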
The following command trains a ResNet110 model with 10 augmented instances for each image.
CUDA_VISIBLE_DEVICES=0 python cifar_train.py --arch resnet110 --aug_instances 10 --sess resnet110_N10
After the standard training procedure, we record the outputs of the trained model (loss and logits) in the results folder.
You can also train a WRN16-8 or a 2-layer ConvNet. The commands are listed below.
CUDA_VISIBLE_DEVICES=0 python cifar_train.py --arch wrn16_8 --aug_instances 10 --weight_decay 5e-4 --sess wrn16_8_N10
CUDA_VISIBLE_DEVICES=0 python cifar_train.py --arch convnet --aug_instances 10 --trainset_size 15000 --batchsize 256 --lr 0.01 --weight_decay 0. --sess smallconv_N10
Setting --aug_instances 0 trains the target model without data augmentation. You can train the above models on CIFAR100 by adding the --c100 flag.
Evaluating membership inference algorithms
We implement five MI algorithms in mi_attack.py. You can evaluate them simultaneously with a given session name.
python mi_attack.py --sess resnet110_N10 --aug_instances 10
python mi_attack.py --sess wrn16_8_c100_N10 --aug_instances 10 --c100
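For intuition about what a membership inference evaluation measures, here is a minimal sketch of a generic loss-threshold attack, a common baseline in the literature. It is not necessarily one of the five algorithms in mi_attack.py; the function name `loss_threshold_attack` and the toy loss values are hypothetical.

```python
import numpy as np

def loss_threshold_attack(member_losses, nonmember_losses):
    """Hypothetical baseline: predict 'member' when the loss is <= a threshold.

    Returns the threshold with the highest balanced accuracy and that accuracy.
    """
    member_losses = np.asarray(member_losses, dtype=float)
    nonmember_losses = np.asarray(nonmember_losses, dtype=float)
    # Every observed loss value is a candidate threshold (sorted ascending).
    candidates = np.unique(np.concatenate([member_losses, nonmember_losses]))
    best_thr, best_acc = candidates[0], 0.0
    for thr in candidates:
        tpr = np.mean(member_losses <= thr)    # members correctly flagged
        tnr = np.mean(nonmember_losses > thr)  # non-members correctly rejected
        acc = 0.5 * (tpr + tnr)
        if acc > best_acc:
            best_thr, best_acc = thr, acc
    return best_thr, best_acc

# Toy per-example losses: training examples tend to have lower loss.
member_losses = [0.01, 0.02, 0.05, 0.30]
nonmember_losses = [0.20, 0.80, 1.50, 2.10]
threshold, accuracy = loss_threshold_attack(member_losses, nonmember_losses)
print(threshold, accuracy)
```

Choosing the threshold on the same labeled data it is scored on gives an optimistic estimate of attack accuracy; the sketch only shows the mechanics of turning recorded losses into a membership decision.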
The --random_t flag evaluates our algorithms with augmented data that was not used in training.
python mi_attack.py --sess resnet110_N10 --aug_instances 10 --random_t
Citation
@inproceedings{yu2021how,
title={How Does Data Augmentation Affect Privacy in Machine Learning?},
author={Yu, Da and Zhang, Huishuai and Chen, Wei and Yin, Jian and Liu, Tie-Yan},
year = {2021},
booktitle = {Proc. of the AAAI Conference on Artificial Intelligence}
}