SPECTRE

Official implementation of SPECTRE: Defending Against Backdoor Attacks Using Robust Covariance Estimation.

Installation

Prerequisites

The project manages its Python dependencies with Poetry and uses Julia for the filtering code, so both need to be installed. A requirements.txt is provided as a fallback for use with pip or Anaconda.

Installation

poetry install
julia --project=. -e "using Pkg; Pkg.instantiate()"

Running an experiment

Experiments are named using a specific convention:

{model}-{trainer}-{source_label}{target_label}-{m}x{attack_type}{eps_times_n}

Example: name=r32p-sgd-94-1xp500

The files related to experiment $name are stored in the directory output/$name.
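As an illustration, the naming convention can be decomposed with a small regular expression. The field breakdown below (in particular, single-digit source and target labels) is an assumption inferred from the example name, not part of the official tooling:

```python
import re

# Hypothetical parser for the experiment-name convention described above.
# Assumes single-digit source/target labels, as in "r32p-sgd-94-1xp500".
NAME_RE = re.compile(
    r"^(?P<model>[^-]+)-(?P<trainer>[^-]+)-"
    r"(?P<source_label>\d)(?P<target_label>\d)-"
    r"(?P<m>\d+)x(?P<attack_type>[a-z]+)(?P<eps_times_n>\d+)$"
)

def parse_name(name):
    """Split an experiment name into its fields, or raise on a mismatch."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a valid experiment name: {name}")
    return match.groupdict()

print(parse_name("r32p-sgd-94-1xp500"))
# {'model': 'r32p', 'trainer': 'sgd', 'source_label': '9',
#  'target_label': '4', 'm': '1', 'attack_type': 'p', 'eps_times_n': '500'}
```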

Initial training

First we train a model on the poisoned dataset.

poetry run python train.py $name

This should save a PyTorch serialized model to output/$name/model.pth.

Compute hidden representations

Next we run the training data through the network and save the hidden representations to a file to be read later.

poetry run python rep_saver.py $name

This should save NumPy serialized arrays to output/$name/label_$label_reps.npy for $label from 0 to 9. Usually, we are only interested in the file corresponding to the target label.
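The saved arrays can then be read back with NumPy. The sketch below is purely illustrative: it fabricates a small representations file under the same naming scheme in a temporary directory (standing in for output/$name), and the shape and target label are made up:

```python
import os
import tempfile

import numpy as np

# Illustrative only: fabricate a representations file with the same naming
# scheme, then load it back the way a downstream script might.
target_label = 4                # placeholder; use your experiment's target label
out_dir = tempfile.mkdtemp()    # stands in for output/$name

# (n_samples, rep_dim) is invented here; the real shape depends on the network.
reps = np.random.randn(500, 64).astype(np.float32)
path = os.path.join(out_dir, f"label_{target_label}_reps.npy")
np.save(path, reps)

loaded = np.load(path)
print(loaded.shape)  # (500, 64)
```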

Run defences

We read the representations and execute the filters against them, producing three sample masks that specify which samples should be used for retraining.

julia --project=. run_filters.jl $name

This produces three mask files in output/$name/, one per filter.

Retrain the networks on the cleaned datasets

poetry run python train.py $name $mask_name

This reads the mask from output/$name/$mask_name.npy and trains the network from scratch on the resulting masked dataset.

Running against other attacks

For attacks not implemented here, you will need to obtain the hidden representations of the network in npy format yourself. Place them in a directory under output with an arbitrary name, as long as it ends in {eps_times_n}, which is needed to determine how many samples to remove. You can then pass that name to run_filters.jl.
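Since only the trailing digits of the directory name carry meaning here, recovering eps_times_n can be sketched as below; the example name "my-custom-attack-500" is invented:

```python
import re

def eps_times_n_from(name):
    """Recover eps_times_n from the trailing digits of a directory name.

    Assumption based on the text above: the filter only needs the final
    number to set its removal budget.
    """
    match = re.search(r"(\d+)$", name)
    if match is None:
        raise ValueError(f"name must end in eps_times_n: {name}")
    return int(match.group(1))

print(eps_times_n_from("my-custom-attack-500"))  # 500
print(eps_times_n_from("r32p-sgd-94-1xp500"))    # 500
```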