# Charmer: Revisiting Character-level Adversarial Attacks for Language Models
Official implementation in PyTorch of the ICML'24 paper "Revisiting Character-level Adversarial Attacks for Language Models".
## Requirements
```bash
conda create -n lmrobust python=3.8
conda install pytorch==2.0.0 torchvision==0.15.0 torchaudio==2.0.0 pytorch-cuda=11.7 -c pytorch -c nvidia
conda install -c anaconda jupyter
pip install -r requirements.txt
```
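As a quick sanity check after installation, a minimal snippet (not part of the repository, assuming the `lmrobust` environment is active) can confirm that the expected PyTorch build and CUDA runtime are visible:

```python
# Environment sanity check (hypothetical helper, not part of the repo).
import torch

print(torch.__version__)          # expected to start with 2.0.0
print(torch.version.cuda)         # expected 11.7 for the conda command above
print(torch.cuda.is_available())  # True if a GPU is visible
```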
The `baseline` directory includes auxiliary code not written by the authors.
## Sample scripts
To run Charmer on the TextAttack BERT, RoBERTa, and ALBERT models, run the per-dataset scripts in the `scripts` folder, e.g.:
```bash
cd scripts
bash run_sst.sh
```
This will produce a `.csv` file similar to:
```csv
original,perturbed,True,Pred_original,Pred_perturbed,success,Dist_char,Dist_token,similarity,time
Fears for T N pension after talks Unions representing workers at Turner Newall say they are 'disappointed' after talks with stricken parent firm Federal Mogul.,Fears for T E pension :fter talks Unions representing workers at Turner Newall say they are 'disappointed' after talks with stricken parent firm Federal Mogul.,2,2,3,True,2,4,0.9770777225494384,3.747948408126831
```
where each column is:

- `original`: Original sentence.
- `perturbed`: Sentence after the attack.
- `True`: True label of the sentence.
- `Pred_original`: Prediction of the model for the original sentence.
- `Pred_perturbed`: Prediction of the model for the attacked sentence.
- `success`: `True` if `Pred_original` != `Pred_perturbed`.
- `Dist_char`: Character-level Levenshtein distance between `original` and `perturbed`.
- `Dist_token`: Token-level Levenshtein distance between `original` and `perturbed`.
- `similarity`: Universal Sentence Encoder (USE) similarity between `original` and `perturbed`.
- `time`: Time needed to produce `perturbed`.
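
To inspect such a results file, a short pandas sketch (assuming the output is saved as `results.csv`; the actual filename produced by the scripts may differ) can summarize the attack:

```python
# Summarize a Charmer results file (sketch; filename is an assumption).
import pandas as pd

df = pd.read_csv("results.csv")

# Attack success rate: fraction of rows where the prediction changed.
# Compare against the string "True" so this works whether pandas parsed
# the column as bool or as text.
success_rate = (df["success"].astype(str) == "True").mean()

print(f"Attack success rate: {success_rate:.2%}")
print(f"Median char-level edit distance: {df['Dist_char'].median()}")
print(f"Mean USE similarity: {df['similarity'].mean():.3f}")
print(f"Mean attack time: {df['time'].mean():.2f}")
```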
Cite as:
```bibtex
@inproceedings{Abad2024Charmer,
  author = {Abad Rocamora, Elias and Wu, Yongtao and Liu, Fanghui and Chrysos, Grigorios G and Cevher, Volkan},
  title = {Revisiting Character-level Adversarial Attacks for Language Models},
  booktitle = {International Conference on Machine Learning (ICML)},
  year = {2024}
}
```