# A Strong Baseline for Query Efficient Attacks in a Black Box Setting

This repository contains the source code for the research work described in our EMNLP 2021 paper:

*A Strong Baseline for Query Efficient Attacks in a Black Box Setting*
The attack jointly leverages the attention mechanism and locality-sensitive hashing (LSH) for word ranking. It is implemented in the TextAttack framework to ensure consistent comparison with other attack methods.
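As a rough standalone illustration of the LSH half of that idea (this is not the repository's implementation, and all names below are placeholders), random-hyperplane hashing groups similar word embeddings into buckets, so fewer substitute candidates need to be scored against the victim model:

```python
import numpy as np

def lsh_buckets(embeddings: np.ndarray, n_planes: int = 8, seed: int = 0):
    """Group word vectors by random-hyperplane LSH signatures.

    embeddings: (n_words, dim) matrix of word vectors (placeholder input).
    Returns a dict mapping each binary signature to the word indices that
    share it; words in one bucket are treated as similar candidates.
    """
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((n_planes, embeddings.shape[1]))
    # The sign of the projection onto each hyperplane gives one signature bit.
    signs = (embeddings @ planes.T) > 0
    buckets = {}
    for idx, bits in enumerate(signs):
        buckets.setdefault(tuple(bits), []).append(idx)
    return buckets

# Toy usage: hash 5 random "word embeddings" into buckets.
vecs = np.random.default_rng(1).standard_normal((5, 50))
for sig, words in lsh_buckets(vecs).items():
    print(sig, words)
```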
## Installation
- Clone the repository with the `--recursive` flag so that the TextAttack submodule is set up as well:

  ```bash
  git clone --recursive https://github.com/RishabhMaheshwary/query-attack.git
  ```
- Make sure `git lfs` is installed on your system; if it is not, refer to the git-lfs documentation.
- Run the commands below to download the pre-trained attention models:

  ```bash
  git lfs install
  git lfs pull
  ```
-
It is recommended to create a new conda environment to install all dependencies.
cd Textattack pip install -e . pip install allennlp==2.1.0 allennlp-models==2.1.0 pip install tensorflow pip install numpy==1.18.5
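As a quick sanity check (just a suggestion, not a repo script), confirm that the pinned dependencies import cleanly:

```python
# Import smoke test for the environment set up above.
import numpy as np
import tensorflow as tf
import allennlp
import textattack

print("numpy:", np.__version__)  # expect 1.18.5 as pinned above
print("tensorflow:", tf.__version__)
```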
## Running query-attack
- To attack the BERT model trained on IMDB using the WordNet search space, use the following command:

  ```bash
  textattack attack \
      --recipe lsh-with-attention-wordnet \
      --model bert-base-uncased-imdb \
      --num-examples 500 \
      --log-to-csv outputs/ \
      --attention-model attention_models/yelp/han_model_yelp
  ```
  Note: The attention model should be trained on a dataset different from the target model's, because in the black-box setting we do not have access to the target model's training data.
- To attack the LSTM model trained on Yelp using the WordNet search space, use the following command:

  ```bash
  textattack attack \
      --recipe lsh-with-attention-wordnet \
      --model lstm-yelp \
      --num-examples 500 \
      --log-to-csv outputs/ \
      --attention-model attention_models/imdb/han_model_imdb
  ```
- To attack the BERT model trained on MNLI using the HowNet search space, use the following command (a sketch for summarizing the CSV logs follows this list):

  ```bash
  textattack attack \
      --recipe lsh-with-attention-hownet \
      --model bert-base-uncased-mnli \
      --num-examples 500 \
      --log-to-csv outputs/ \
      --attention-model mnli
  ```
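Each run writes a per-example CSV into the directory passed to `--log-to-csv`. Below is a minimal summary sketch; it assumes pandas is available and that the log contains TextAttack's usual `result_type` and `num_queries` columns, and the file name is illustrative, so check what actually lands in `outputs/`:

```python
import pandas as pd

# Illustrative path: inspect outputs/ for the CSV textattack actually wrote.
df = pd.read_csv("outputs/attack_log.csv")

# Assumes the standard TextAttack result columns; verify against the header.
success = (df["result_type"] == "Successful").sum()
failed = (df["result_type"] == "Failed").sum()
print(f"Attack success rate: {success / (success + failed):.2%}")
print(f"Avg. queries per example: {df['num_queries'].mean():.1f}")
```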
The tables below show which arguments to pass to the `--model` and `--recipe` flags of the `textattack` command to attack BERT and LSTM models on the IMDB, Yelp and MNLI datasets across the various search spaces (a sketch that sweeps all combinations follows the tables).
| Model | `--model` flag |
|---|---|
| BERT-imdb | `bert-base-uncased-imdb` |
| BERT-yelp | `bert-base-uncased-yelp` |
| BERT-mnli | `bert-base-uncased-mnli` |
| LSTM-imdb | `lstm-imdb` |
| LSTM-yelp | `lstm-yelp` |
| LSTM-mnli | `lstm-mnli` |
| Search space | `--recipe` flag |
|---|---|
| WordNet | `lsh-with-attention-wordnet` |
| HowNet | `lsh-with-attention-hownet` |
| Embedding | `lsh-with-attention-embedding` |
| Embedding+LM | `lsh-with-attention-embedding-gen` |
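If you want to run every model/recipe pairing from the tables, a small driver script (a hypothetical convenience, not part of the repo) can shell out to the `textattack` CLI for each combination:

```python
import subprocess

# Hypothetical sweep over the flag combinations listed in the tables above.
MODELS = [
    "bert-base-uncased-imdb", "bert-base-uncased-yelp", "bert-base-uncased-mnli",
    "lstm-imdb", "lstm-yelp", "lstm-mnli",
]
RECIPES = [
    "lsh-with-attention-wordnet", "lsh-with-attention-hownet",
    "lsh-with-attention-embedding", "lsh-with-attention-embedding-gen",
]

for model in MODELS:
    for recipe in RECIPES:
        # NOTE: each run also needs an --attention-model trained on a dataset
        # different from the target's (see the note above); append that
        # argument before using this sketch.
        subprocess.run(
            ["textattack", "attack",
             "--recipe", recipe,
             "--model", model,
             "--num-examples", "500",
             "--log-to-csv", f"outputs/{model}-{recipe}/"],
            check=True,
        )
```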
To run the baselines reported in the paper, refer to the main TextAttack repository.
## Training attention models
- Install the additional dependencies:

  ```bash
  pip install gensim==3.8.3 torch==1.7.1+cu101
  ```
- The datasets used to train the attention models can be found here.
- Unzip the datasets and specify the dataset path in the `create_input_files.py` file.
- The models can then be trained using the commands below:

  ```bash
  python create_input_files.py
  python train.py
  python eval.py
  ```
The implementation for training the attention models is borrowed from here.
- For the NLI task, the attention weights are computed using the pre-trained decomposable attention model from the AllenNLP API.
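As a hedged sketch of that usage (the model ID and output keys are assumptions and may differ across `allennlp-models` versions), loading the pre-trained decomposable attention predictor looks roughly like this:

```python
# Sketch only: assumes allennlp==2.1.0 / allennlp-models==2.1.0 and that the
# "pair-classification-decomposable-attention-elmo" model ID is available.
from allennlp_models.pretrained import load_predictor

predictor = load_predictor("pair-classification-decomposable-attention-elmo")
result = predictor.predict(
    premise="A person is outdoors, on a horse.",
    hypothesis="A person is at a diner.",
)
print(result["label_probs"])  # entailment / contradiction / neutral scores
# The attack ranks words using the model's internal premise-hypothesis
# attention matrices, which are read from the model itself rather than
# from this predict() output.
```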