This repository contains code to reproduce results from the paper:
Detecting Textual Adversarial Examples through Randomized Substitution and Vote (UAI 2022)
Xiaosen Wang, Yifeng Xiong, Kun He
There are three datasets used in our experiments. Download the datasets and put them into the directories ./data/ag_news, ./data/imdb and ./data/yahoo_answers, respectively.
There are three dependencies for this project. Download the files glove.840B.300d.txt and counter-fitted-vectors.txt and put them into the directory ./data/vectors, then put the directory stanford-postagger-2018-10-16/ into the directory ./data/aux_files.
You can run get_data_and_dependencies.sh to get the test data:
bash get_data_and_dependencies.sh
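Before running any experiment it can help to confirm that the layout described above is in place. The following is a minimal sketch, not part of the repository: the paths are copied from the instructions above, while the helper name missing_paths and the two constant lists are hypothetical.

```python
from pathlib import Path

# Paths copied from the setup instructions; the check itself is only an illustration.
EXPECTED_DIRS = [
    "data/ag_news",
    "data/imdb",
    "data/yahoo_answers",
    "data/aux_files/stanford-postagger-2018-10-16",
]
EXPECTED_FILES = [
    "data/vectors/glove.840B.300d.txt",
    "data/vectors/counter-fitted-vectors.txt",
]


def missing_paths(root="."):
    """Return the expected directories and files that are absent under root."""
    root = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    # Run from the repository root; prints nothing when everything is present.
    for path in missing_paths():
        print("missing:", path)
```

Run from the repository root, the script lists whatever is still missing and prints nothing once the datasets and dependencies are in place.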
- ./model: Detailed code for the model architectures.
- ./utils: Helper functions for training models and processing data.
- ./adversary: Files for attack methods.
- ./data: Datasets and GloVe vectors.
- cnn_classifier.py, bert_classifier.py, robert_classifier.py: Training code for CNN, BERT and RoBERTa.
- cnn_attack.py: Attacking the CNN model.
- bert_attack.py: Attacking the BERT and RoBERTa models.
- build_embs.py: Generating the dictionary, embedding matrix and distance matrix.
- synonym_selector.py: Generating the synonym sets.
- detect_transfer.py: Converting adversarial examples through Randomized Substitution.
- detect_eval.py: Vote and Detection.
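To make the three artifacts listed for build_embs.py more concrete, here is a toy sketch of a dictionary, an embedding matrix and a distance matrix. It is not the code in build_embs.py: the function names, the zero row for words without a vector and the choice of Euclidean distance are all assumptions made for illustration, and the real script works on the GloVe and counter-fitted vectors rather than on hand-written tuples.

```python
import math


def build_vocab(sentences):
    """Map each distinct token to an integer id, in order of first appearance."""
    vocab = {}
    for sentence in sentences:
        for token in sentence.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def embedding_matrix(vocab, vectors, dim):
    """One row per vocabulary id; words without a vector get a zero row (assumption)."""
    matrix = [[0.0] * dim for _ in vocab]
    for word, idx in vocab.items():
        if word in vectors:
            matrix[idx] = list(vectors[word])
    return matrix


def distance_matrix(matrix):
    """Pairwise Euclidean distances between rows (the metric is an assumption)."""
    n = len(matrix)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(matrix[i], matrix[j])
            dist[i][j] = dist[j][i] = d
    return dist


if __name__ == "__main__":
    # Toy 2-dimensional vectors, made up for this example.
    toy_vectors = {"good": (1.0, 0.0), "great": (1.0, 1.0), "bad": (-2.0, 0.0)}
    vocab = build_vocab(["good movie", "great movie", "bad plot"])
    emb = embedding_matrix(vocab, toy_vectors, dim=2)
    dist = distance_matrix(emb)
    print(dist[vocab["good"]][vocab["great"]], dist[vocab["good"]][vocab["bad"]])
```

A distance matrix of this kind lets a synonym selector look up the nearest neighbours of a word by id without recomputing distances for every query.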
- Generating the dictionary, embedding matrix and distance matrix:
python build_embs.py --data_dir ./data/ --task_name ag_news
- Training and attacking the models:
For CNN:
python cnn_classifier.py --output_dir ./output/model_file/ag_news/cnn --data_dir ./data/ --task_name ag_news --max_seq_length 128 --do_train --do_eval --vGPU 0
python cnn_attack.py --output_dir ./output/model_file/ag_news/cnn --data_dir ./data/ --attack textfooler --task_name ag_news --max_seq_length 128 --max_candidate 50 --save_to_file ./output/adv_example/ag_news_cnn_textfooler --vGPU 0
For BERT:
python bert_classifier.py --output_dir ./output/model_file/ag_news/bert --bert_model bert-base-uncased --data_dir ./data/ --task_name ag_news --max_seq_length 128 --do_train --do_eval --vGPU 0
python bert_attack.py --data_dir ./data/ --task_name ag_news --attack textfooler --output_dir ./output/model_file/ag_news/bert/ --attack_batch 1000 --save_to_file ./output/adv_example/ag_news --bert_model bert-base-uncased --max_candidate 50 --max_seq_length 128 --vGPU 0
- Evaluating the detection performance:
python detect_transfer.py --task_name ag_news --data_dir ./data/ --votenum 25 --randomrate 0.6 --fixrate 0.02 --advfile ./output/adv_example/ag_news_cnn_textfooler.pkl --out_file ./output/transfer/ag_news_cnn_textfooler.pkl
python detect_eval.py --task_name ag_news --data_dir ./data/ --max_seq_length 128 --modeltype cnn --output_dir ./output/model_file/ag_news/cnn --eval_file ./output/transfer/ag_news_cnn_textfooler.pkl
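The two commands above correspond to the two stages named in the paper title: randomized substitution, then vote. The sketch below illustrates the general idea on a toy classifier and is not the code in detect_transfer.py or detect_eval.py. Its assumptions: the votes and rate parameters are meant to play the role of --votenum and --randomrate, --fixrate is not modeled, labels are combined by a simple majority vote, and toy_classify and TOY_SYNONYMS are made up for the example.

```python
import random
from collections import Counter

# Hypothetical synonym sets and classifier, used only to make the sketch runnable.
TOY_SYNONYMS = {"dreadful": ["awful", "terrible"], "awful": ["terrible"]}


def toy_classify(tokens):
    """Toy model: knows only two negative words, so 'dreadful' fools it."""
    return "neg" if {"awful", "terrible"} & set(tokens) else "pos"


def randomized_substitution(tokens, synonyms, rate, rng):
    """Replace each token that has synonyms by a random synonym with probability rate."""
    out = []
    for tok in tokens:
        candidates = synonyms.get(tok)
        if candidates and rng.random() < rate:
            out.append(rng.choice(candidates))
        else:
            out.append(tok)
    return out


def detect(tokens, classify, synonyms, votes=25, rate=0.6, seed=0):
    """Flag the input when the majority label over substituted copies differs
    from the label predicted for the input itself."""
    rng = random.Random(seed)
    original = classify(tokens)
    ballots = Counter(
        classify(randomized_substitution(tokens, synonyms, rate, rng))
        for _ in range(votes)
    )
    voted = ballots.most_common(1)[0][0]
    return voted != original, voted


if __name__ == "__main__":
    print(detect("the film is awful".split(), toy_classify, TOY_SYNONYMS))
    print(detect("the film is dreadful".split(), toy_classify, TOY_SYNONYMS))
```

In this toy setting a benign input keeps its label under substitution, while an input whose prediction hinges on one carefully chosen word tends to flip once that word is replaced. That mismatch is the signal used for detection.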
Questions and suggestions can be sent to [email protected].