Releases: minimaxir/gpt-2-simple
load_gpt2() improvements
- load_gpt2() in a fresh session is much faster and uses much less memory when loaded. (For the 117M model, the system stays under 2 GB of RAM, which is the critical point for cloud services.)
- start_tf_sess() now accepts a threads parameter, which is useful if you know exactly how many threads will be used.
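As a rough illustration of the two changes above, the sketch below starts a session with a fixed thread count and then loads a finetuned model into it. The helper name load_model, the run name, and the thread count are hypothetical; the keyword arguments threads and run_name are assumed to match the installed version of the package, so check them against your copy.

```python
def load_model(run_name="run1", threads=2):
    """Start a session with a fixed thread count and load a finetuned model.

    Hypothetical helper for illustration; not part of gpt-2-simple.
    """
    # Imported lazily so this file can be imported without TensorFlow installed.
    import gpt_2_simple as gpt2

    # Pass the thread count explicitly instead of relying on the default.
    sess = gpt2.start_tf_sess(threads=threads)

    # Load the checkpoint saved for this run (assumes it was finetuned earlier).
    gpt2.load_gpt2(sess, run_name=run_name)
    return sess
```

Calling load_model(run_name="run1", threads=2) in a fresh process is where the lower memory use noted above would apply.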
Fix CSV Finetuning
Number of CSV tokens was inadvertently doubled. (#25)
345M model support
- Support the 345M model (thanks to Neil Shepperd for the gradient checkpointing implementation!)
- Support model_name in the CLI, so the 345M model can be selected there
- Support run_name in the CLI
- Support .csv files as an input dataset to finetune (will parse the CSV as if it was done via encode_csv()).
- Fix one off issues (#21)
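The items above can be combined into one Python call, sketched here under some assumptions: the helper name finetune_on_csv, the run name, and the step count are hypothetical, and the keyword arguments (dataset, model_name, run_name, steps) are assumed to match the installed version of the package.

```python
def finetune_on_csv(csv_path, model_name="345M", run_name="csv_run",
                    steps=1000, download=True):
    """Finetune the 345M model on a single-column CSV file.

    Hypothetical helper for illustration; not part of gpt-2-simple.
    """
    # Imported lazily so this file can be imported without TensorFlow installed.
    import gpt_2_simple as gpt2

    if download:
        # Fetches the pretrained weights; pass download=False if already present.
        gpt2.download_gpt2(model_name=model_name)

    sess = gpt2.start_tf_sess()

    # A .csv dataset is passed directly; per the notes above it is parsed
    # as if it had gone through encode_csv() first.
    gpt2.finetune(sess,
                  dataset=csv_path,
                  model_name=model_name,
                  run_name=run_name,
                  steps=steps)
    return sess
```

A distinct run_name keeps the checkpoints of this run separate from those of other finetuning runs.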
Better restore and checkpoint behavior
CLI
More utility functions
- is_gpt2_downloaded: Check if the model is downloaded.
- encode_csv: Convert a CSV to a format suitable for GPT-2.
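To make the two utilities above more concrete, here is a standard-library sketch of the ideas behind them. It is not the library's code. It assumes the text sits in the first CSV column, that each row is wrapped in `<|startoftext|>` and `<|endoftext|>` delimiters, and that a model counts as downloaded when a few expected files exist. The function names, delimiter tokens, and file names are all assumptions made for illustration.

```python
import csv
import os

# Assumed delimiter tokens; the library's actual defaults may differ.
START_TOKEN = "<|startoftext|>"
END_TOKEN = "<|endoftext|>"


def encode_csv_sketch(csv_path, out_path, header=True):
    """Write the first column of each row to out_path, one wrapped text per line.

    Returns the number of rows written.
    """
    count = 0
    with open(csv_path, newline="", encoding="utf8") as src, \
            open(out_path, "w", encoding="utf8") as dst:
        reader = csv.reader(src)
        if header:
            next(reader, None)  # skip the header row if there is one
        for row in reader:
            if not row:
                continue  # ignore blank lines
            dst.write(START_TOKEN + row[0] + END_TOKEN + "\n")
            count += 1
    return count


def is_model_downloaded_sketch(
        model_dir,
        required_files=("hparams.json", "encoder.json", "vocab.bpe")):
    """Return True if every file in required_files exists in model_dir.

    The file list is an assumption, not the library's actual check.
    """
    return all(os.path.isfile(os.path.join(model_dir, name))
               for name in required_files)
```

Wrapping each row in explicit delimiters lets a finetuned model learn where one short text ends and the next begins.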
Initial release
v0.1 README + setup.py cleanup