# Any2Some

Any2Some is a framework that aims to simplify the development of neural, end-to-end, data-to-text systems.

## Supported Models

The framework supports the following models:

### Monolingual

  1. BERT
  2. BART
  3. T5
  4. GPT-2
  5. Blenderbot

### Multilingual

  1. mBERT and BERTimbau
  2. mBART-50
  3. mT5
  4. GPorTuguese
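The models above are referenced by their Hugging Face Hub identifiers, passed verbatim to the `--tokenizer` and `--model` flags shown in the commands below. A small sketch of what those identifiers look like; only `facebook/bart-large` is taken from this README's own commands, the others are assumptions:

```bash
# Hypothetical Hub identifiers for some of the models listed above;
# only facebook/bart-large appears in this README's own commands.
for ckpt in facebook/bart-large facebook/mbart-large-50 google/mt5-small; do
  echo "--tokenizer $ckpt --model $ckpt"
done
```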

## Usage

### Training

```bash
if [ ! -d "env" ]; then
  virtualenv env
  . env/bin/activate
  pip3 install -r requirements.txt
else
  . env/bin/activate
fi
```

```bash
python3 train.py --tokenizer facebook/bart-large \
                 --model facebook/bart-large \
                 --src_train 'example/trainsrc.txt' \
                 --trg_train 'example/traintrg.txt' \
                 --src_dev 'example/devsrc.txt' \
                 --trg_dev 'example/devtrg.txt' \
                 --epochs 3 \
                 --learning_rate 1e-5 \
                 --batch_size 8 \
                 --early_stop 2 \
                 --max_length 180 \
                 --write_path bart \
                 --language portuguese \
                 --verbose \
                 --batch_status 16 \
                 --cuda
```
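The `--src_train`/`--trg_train` (and dev) arguments point to plain-text files. Judging by the example paths, the source and target files are assumed to be line-aligned, one instance per line; the triple-style content below is hypothetical, not taken from this README:

```bash
# Sketch of a line-aligned source/target pair (content is hypothetical):
# line i of the source file is assumed to map to line i of the target file.
printf 'Frida_Kahlo | birthPlace | Coyoacan\n' > trainsrc.txt
printf 'Frida Kahlo was born in Coyoacan.\n' > traintrg.txt
paste trainsrc.txt traintrg.txt
```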

### Evaluation

```bash
. env/bin/activate

if [ ! -d "results" ]; then
  mkdir results
fi
```

```bash
python3 evaluate.py --tokenizer facebook/bart-large \
                    --model bart/model \
                    --src_test 'example/testsrc.txt' \
                    --trg_test 'example/testtrg.txt' \
                    --batch_size 4 \
                    --max_length 180 \
                    --write_dir results \
                    --language portuguese \
                    --verbose \
                    --batch_status 16 \
                    --cuda
```
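After evaluation, the generated texts land under the directory given by `--write_dir`. The exact output file names are the framework's choice, so the inspection below is only a sketch:

```bash
# Sketch: sanity-check whatever evaluate.py wrote into results/
# (output file names are not documented here, hence the glob).
mkdir -p results
ls results | head -n 5
wc -l results/* 2>/dev/null || echo "no outputs yet"
```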