

From Hero to Zéroe: A Benchmark of Low-Level Adversarial Attacks

This repository provides code and data for the paper From Hero to Zéroe: A Benchmark of Low-Level Adversarial Attacks.


If you use the Zéroe benchmark, please use the latest version available here. References for the benchmark are:

Tasks (Datasets):


Batch Size: 28
Sequence Length: 256


RoBERTa's performance when attacked by Zéroe

Adversarial Training

Adversarial Training (leave-one-out)

1. Requirements

We use conda to set up our Python environment.

We froze our environment into the environment.yml file (further docs).

Restore it with the following command:

conda env create -f environment.yml

Some packages are not available in the conda repository and must be installed manually:

pip install transformers==2.5.1
pip install seqeval==0.0.12

The full requirements are given in requirements.txt. You can also install them via: pip install -r requirements.txt

conda install numpy pandas scikit-learn nltk torch fastprogress absl tqdm
conda install -c fastai fastprogress
conda install tensorflow-gpu==2.0.0  (if GPU is available else: `tensorflow==2.0.0`)  
pip install transformers==2.5.1
pip install seqeval==0.0.12
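After installing, it can be useful to confirm that the two pinned pip packages resolved to the versions listed above. The following is a minimal sketch, not part of the repository: the helper name check_pins is hypothetical, and it assumes Python 3.8+ for importlib.metadata.

```python
from importlib import metadata

# Pins taken from the pip commands above.
PINNED = {"transformers": "2.5.1", "seqeval": "0.0.12"}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, found)} for every pin that is not satisfied.

    `found` is None when the package is not installed.
    (Hypothetical helper, not part of the repository.)
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    # Print one line per package whose installed version differs from the pin.
    for name, (expected, found) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means both pins are satisfied in the active environment.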

2. code/models

Contains the models used in this work.


g2pp2g.py contains the model(s) used to generate the phonetic perturbations. The pretrained models for these perturbations can be found in models/g2p and models/p2g. They are loaded automatically if the TRAIN flags aren't specified. Therefore, to retrain the models, you need to enable those flags in the source code.

3. data + Attacks

To perturb the data, we preprocessed each dataset with all 10 of our perturbers and stored the results in data/task/{mode}_{perturber}_{level}.txt, e.g. data/datasets/tc/train_phonetic_high.txt. This naming scheme is important so that the experiments run seamlessly.
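The naming scheme can be sketched as a pair of small helpers. This is an illustration only: the function names are hypothetical and do not come from the repository, the task directory is passed in rather than assumed, and the parser assumes that neither the mode nor the level contains an underscore.

```python
from pathlib import PurePosixPath


def perturbed_path(task_dir, mode, perturber, level):
    """Build {task_dir}/{mode}_{perturber}_{level}.txt (hypothetical helper)."""
    return PurePosixPath(task_dir) / f"{mode}_{perturber}_{level}.txt"


def parse_perturbed_name(filename):
    """Split '{mode}_{perturber}_{level}.txt' back into its three fields.

    Splits on the first and the last underscore, so a perturber name
    may itself contain underscores or hyphens.
    """
    stem = PurePosixPath(filename).stem
    mode, rest = stem.split("_", 1)
    perturber, level = rest.rsplit("_", 1)
    return mode, perturber, level


# Reproduces the example path given in the text.
example = perturbed_path("data/datasets/tc", "train", "phonetic", "high")
print(example)
print(parse_perturbed_name(str(example)))
```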

To generate this data run:

python gen_datasets.py \
--task {task} \
--methods {attackers} \
--level {attack level} \
--indir {path_to_raw_data}

For example, to generate the perturbed data for SNLI with all attackers at perturbation level low, run:

python gen_datasets.py \
--task snli \
--methods all \
--level low \
--indir ./data
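To generate several levels in one go, the call can be assembled programmatically. The sketch below only builds and prints the command lines, using the four flags shown above; the helper name gen_datasets_cmd is hypothetical, and how --methods accepts more than one attacker is not specified here, so it is passed through as a single string.

```python
import shlex
import sys


def gen_datasets_cmd(task, methods, level, indir, python=sys.executable):
    """Return the argv list for one gen_datasets.py call (hypothetical helper)."""
    return [
        python, "gen_datasets.py",
        "--task", task,
        "--methods", methods,
        "--level", level,
        "--indir", indir,
    ]


if __name__ == "__main__":
    # "low" and "high" are the two levels that appear in this README.
    for level in ("low", "high"):
        cmd = gen_datasets_cmd("snli", "all", level, "./data")
        print(" ".join(shlex.quote(part) for part in cmd))
        # To actually run it: subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting issues if a path contains spaces.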

4. Run RoBERTa train/eval/predict (experiments)

The following describes how to train, evaluate, and predict with RoBERTa. The procedure is the same for all three tasks; you just need to substitute the corresponding run_task.py file.

For a detailed description of the command line flags, consult the respective Python file (e.g. run_tc.py).


Defense Mechanisms

Adversarial Training (e.g. with full-swap)

python run_tc.py  

Adversarial Training Leave-One-Out (e.g. with full-swap)

python run_tc.py  