Awesome
Backdoor Attacks on CIFAR10
This is a simple Resnet for CIFAR implementation supporting the detection of backdoor attacks with spectral signatures.
train.py
trainseval.py
evals once (--loop
flag for infinite loop)config.json
has all the optionscompute_corr
performs the spectral signature detection
The simplest usage is to change the "data" section of the config.json file.
poison_method
can bepixel
,pattern
, orell
poison_eps
refers to how many corrupted images are addedclean_label
refers to the class of images to which a mark is addedtarget_label
refers to the label assigned to corrupted imagesposition
andcolor
are the parameters for the backdoor triggerpercentile
represents how many images to keep
The compute_corr.py
file will load the latest checkpoint from the given output
directory in the config file and perform the spectral signature detection.
The code will print (to stdout) the top singular values with and without
the corrupted inputs, as well as the number of corrupted images removed as
having a high score.
The model directory is then updated with a numpy file containing the indices
of the top scores, so that if the train file was run again, the model would
be trained without training inputs corresponding to the removed indices.