The T5 model was presented in [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/abs/1910.10683) by Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu.
The model is based on the Transformer architecture. The original code can be found in the [google-research/text-to-text-transfer-transformer](https://github.com/google-research/text-to-text-transfer-transformer) repository.
- Speed on a single NVIDIA V100 (16 GB) GPU

  | BatchSize            | 64             | 128            |
  |:--------------------:|:--------------:|:--------------:|
  | transformers_v4.12.0 | 9.5 samples/s  | OOM            |
  | above + fastseq      | 23.3 samples/s | 31.7 samples/s |
The model is `t5-base` from the Hugging Face Transformers model hub.
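For reference, the checkpoint can also be loaded directly through the Transformers Python API. The snippet below is a minimal illustration, not part of the benchmark; the `translate English to Romanian:` prefix is the standard T5 prompt for this task, and the generation settings are only examples.

```python
# Minimal sketch: load t5-base from the Hugging Face hub and translate one sentence.
# Generation settings here are illustrative, not the benchmark configuration.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

inputs = tokenizer(
    "translate English to Romanian: The weather is nice today.",
    return_tensors="pt",
)
outputs = model.generate(**inputs, num_beams=4, max_length=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```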
The task is English-to-Romanian translation on the WMT En-Ro dataset, which can be downloaded with these commands:

```bash
wget https://cdn-datasets.huggingface.co/translation/wmt_en_ro.tar.gz
tar -xzvf wmt_en_ro.tar.gz
export ENRO_DIR=${PWD}/wmt_en_ro
```

This should create a directory called `wmt_en_ro/` containing 6 files.
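To sanity-check the download, a short Python snippet such as the one below can print the number of validation pairs and one example. It only assumes the `val.source` and `val.target` file names used by the generation command later in this page.

```python
# Illustrative check of the extracted dataset (assumes val.source / val.target exist).
from pathlib import Path

data_dir = Path("wmt_en_ro")
src = (data_dir / "val.source").read_text(encoding="utf-8").splitlines()
tgt = (data_dir / "val.target").read_text(encoding="utf-8").splitlines()

print(f"{len(src)} validation pairs")
print("EN:", src[0])
print("RO:", tgt[0])
```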
The FastSeq speed numbers are obtained with the following command, where `BATCH_SIZE` is 64 or 128 as in the table above:

```bash
$ fastseq-generate-for-transformers \
    t5-base \
    wmt_en_ro/val.source \
    out.summary \
    --reference_path wmt_en_ro/val.target \
    --device cuda \
    --bs BATCH_SIZE \
    --fp16 \
    --score_path out.score \
    --task translation_en_to_ro \
    --postprocess_workers 3 \
    --no_repeat_ngram_size 3
```
The baseline speed numbers were obtained by running the Transformers v4.12.0 code.
For a complete code example, refer to the corresponding example file in the repository.
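That file is not reproduced here. As a rough sketch of programmatic use, FastSeq's documented usage pattern is to import `fastseq` before `transformers`, after which the usual `generate()` API picks up its optimizations; the model name, batch, and generation settings below are illustrative and chosen to mirror the CLI flags above (`--fp16`, `--device cuda`, `--no_repeat_ngram_size 3`).

```python
# Sketch: importing fastseq before transformers enables FastSeq's generation
# optimizations for the standard Transformers API (illustrative settings).
import fastseq  # noqa: F401  -- import before transformers per FastSeq's usage pattern
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base").cuda().half()

batch = tokenizer(
    ["translate English to Romanian: This is a test."],
    return_tensors="pt",
    padding=True,
).to("cuda")
outputs = model.generate(
    **batch, num_beams=4, max_length=64, no_repeat_ngram_size=3
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```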