The aim is to perform style transfer on text. We use the Yelp review dataset, which can be found in the data folder. The dataset consists of negative and positive reviews, and the goal is to transfer the style of a review from positive to negative and vice versa.
pip3 install -r requirements.txt
This code requires Python 3.
For Training Negative to Positive Style Transfer run:
python3 train.py --config yelp_config.json --bleu
For Training Positive to Negative Style Transfer run:
python3 train.py --config yelp_config2.json --bleu
This will reproduce the model on a dataset of Yelp reviews.
Checkpoints, logs, model outputs, and TensorBoard summaries are written to the config's working_dir.
See yelp_config.json for all of the training options.
For Negative to Positive
python inference.py --config yelp_config.json --checkpoint path/to/model.ckpt
For Positive to Negative
python inference.py --config yelp_config2.json --checkpoint path/to/model.ckpt
To run inference on new data, point the src_test and tgt_test fields in your config to the new files.
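As a rough sketch of how the config could be repointed without editing it by hand, the helper below copies a JSON config and replaces src_test and tgt_test wherever they appear. The exact layout of yelp_config.json is not described here, so the code assumes only that these two keys exist somewhere in the JSON tree. The function names `override_keys` and `write_inference_config`, and the file names in the usage comment, are hypothetical.

```python
import json


def override_keys(node, overrides):
    """Return a copy of `node` with every key listed in `overrides` replaced.

    Keys are matched at any nesting depth, because the layout of the
    config file is an assumption here, not something this sketch knows.
    """
    if isinstance(node, dict):
        return {
            key: overrides[key] if key in overrides else override_keys(value, overrides)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [override_keys(item, overrides) for item in node]
    return node


def write_inference_config(base_path, out_path, src_test, tgt_test):
    """Load a config, point src_test/tgt_test at new files, save a copy."""
    with open(base_path) as f:
        config = json.load(f)
    updated = override_keys(config, {"src_test": src_test, "tgt_test": tgt_test})
    with open(out_path, "w") as f:
        json.dump(updated, f, indent=2)
    return updated


# Hypothetical usage (file names are placeholders):
# write_inference_config("yelp_config.json", "my_config.json",
#                        "data/my_reviews.src", "data/my_reviews.tgt")
# The new file could then be passed to inference.py via --config.
```

Writing a separate config file leaves the training config untouched, so the original test split can still be reproduced.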
Given two pre-tokenized corpus files, use the scripts in tools/ to generate a vocabulary and an attribute vocabulary:
python tools/make_vocab.py [entire corpus file (src + tgt cat'd)] [vocab size] > vocab.txt
python tools/make_attribute_vocab.py vocab.txt [corpus src file] [corpus tgt file] [salience ratio] > attribute_vocab.txt
python tools/make_ngram_attribute_vocab.py vocab.txt [corpus src file] [corpus tgt file] [salience ratio] > attribute_vocab.txt
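To give an intuition for what the vocabulary size and salience ratio arguments control, here is a minimal sketch of a frequency-based vocabulary and a salience-based attribute vocabulary. It is not the code in tools/. It assumes whitespace tokenization, unigrams only, and a smoothed count ratio between the two corpora as the salience score; the function names and the `smoothing` parameter are hypothetical, and the actual scripts may compute salience differently.

```python
from collections import Counter


def count_tokens(lines):
    """Count whitespace-separated tokens over an iterable of lines."""
    return Counter(tok for line in lines for tok in line.split())


def build_vocab(lines, vocab_size):
    """Keep the `vocab_size` most frequent tokens in the combined corpus."""
    return [tok for tok, _ in count_tokens(lines).most_common(vocab_size)]


def salience(token, counts_a, counts_b, smoothing=1.0):
    """Smoothed ratio of a token's count in corpus A to its count in corpus B."""
    return (counts_a[token] + smoothing) / (counts_b[token] + smoothing)


def attribute_vocab(vocab, src_lines, tgt_lines, ratio, smoothing=1.0):
    """Keep vocabulary tokens that are strongly tied to either corpus.

    A token is kept when its salience toward the source or toward the
    target corpus reaches `ratio`.
    """
    src_counts = count_tokens(src_lines)
    tgt_counts = count_tokens(tgt_lines)
    markers = []
    for tok in vocab:
        toward_src = salience(tok, src_counts, tgt_counts, smoothing)
        toward_tgt = salience(tok, tgt_counts, src_counts, smoothing)
        if max(toward_src, toward_tgt) >= ratio:
            markers.append(tok)
    return markers


# Toy example with made-up sentences (not taken from the dataset).
negative = ["the food was terrible", "terrible service", "terrible terrible"]
positive = ["the food was great", "great service", "great great"]
vocab = build_vocab(negative + positive, vocab_size=10)
markers = attribute_vocab(vocab, negative, positive, ratio=3)
```

In this toy example only "terrible" and "great" pass the threshold, since every other token appears equally often in both corpora. A higher ratio keeps fewer, more style-specific tokens; a lower ratio keeps more.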