Releases: minimaxir/gpt-2-simple

load_gpt2() improvements

05 May 22:51
  • load_gpt2() in a fresh session is now much faster and uses much less memory. For the 117M model, the system stays under 2 GB of RAM, which is the critical threshold for cloud services.
  • start_tf_sess() now accepts a threads parameter, which is useful if you know exactly how many threads will be used.

Fix CSV Finetuning

05 May 16:35

Fixed a bug where the number of CSV tokens was inadvertently doubled. (#25)

345M model support

05 May 05:39
583fdb0
  • Support the 345M model (thanks to Neil Shepperd for the gradient checkpointing implementation!)
  • Support model_name in the CLI, so the 345M model can be selected there
  • Support run_name in the CLI
  • Support .csv files as an input dataset to finetune (will parse the CSV as if it was done via encode_csv()).
  • Fix off-by-one issues (#21)

Better restore and checkpoint behavior

23 Apr 03:36
  • Fix off-by-one error where the checkpoint was saved a step early.
  • Fix issue where restore_from='fresh' used the counter from a previously trained checkpoint.
  • If restore_from='latest', training now runs for the specified number of steps, instead of training until the counter reaches the specified number of steps. (#13, #14)
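
The change in step counting described above can be sketched in plain Python. This is only an illustration of the described behavior, not the library's code: the function `plan_training` and its arguments `checkpoint_step` and `steps` are hypothetical names chosen for this example.

```python
def plan_training(checkpoint_step: int, steps: int, restore_from: str):
    """Return (start, stop) values of the step counter for one training run.

    Hypothetical helper: it only models the counting rules from the
    release notes, it does not train anything.
    """
    if restore_from == "fresh":
        # A fresh run ignores the counter of any earlier checkpoint.
        start = 0
    elif restore_from == "latest":
        # Resuming continues from the counter stored in the checkpoint.
        start = checkpoint_step
    else:
        raise ValueError("restore_from must be 'fresh' or 'latest'")
    # Train FOR `steps` more steps, rather than UNTIL the counter hits `steps`.
    stop = start + steps
    return start, stop


if __name__ == "__main__":
    print(plan_training(checkpoint_step=1000, steps=500, restore_from="latest"))
    print(plan_training(checkpoint_step=1000, steps=500, restore_from="fresh"))
```

Under the old "train until" reading, resuming from a counter of 1000 with 500 steps requested would have done no additional work; under the new reading it runs 500 more steps.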

CLI

21 Apr 17:20
cd257bb
  • Added a basic CLI.
  • Added an include_prefix parameter, which gives the option to exclude the input prefix from the output.
  • Improved regex for truncation.
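
To make the last two items concrete, here is a minimal sketch of prefix removal and regex-based truncation using only the standard library. It is not the library's implementation: the function `postprocess` is hypothetical, and the `prefix` and `truncate` arguments and the `<|endoftext|>` marker are assumptions made for the example. Only `include_prefix` is named in the notes above.

```python
import re


def postprocess(text: str, prefix: str = "", truncate: str = "",
                include_prefix: bool = True) -> str:
    """Optionally drop a leading prefix, then cut text at a marker.

    Hypothetical helper for illustration only.
    """
    if prefix and not include_prefix and text.startswith(prefix):
        # Exclude the input prefix from the returned text.
        text = text[len(prefix):]
    if truncate:
        # Lazily match everything up to the first occurrence of the marker.
        # re.escape keeps characters such as "|" from acting as regex syntax;
        # re.S lets "." match newlines so multi-line text is handled.
        match = re.search(r"(.*?)(?:" + re.escape(truncate) + r")", text, flags=re.S)
        if match:
            text = match.group(1)
    return text


if __name__ == "__main__":
    raw = "Hello world<|endoftext|>trailing text"
    print(postprocess(raw, prefix="Hello ", truncate="<|endoftext|>",
                      include_prefix=False))
```

If the marker never appears, the sketch returns the text unchanged rather than an empty string.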

More utility functions

20 Apr 17:43
ca5c4ac
  • is_gpt2_downloaded: Check if the model is downloaded.
  • encode_csv: Convert a CSV to a format suitable for GPT-2.
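
As a rough idea of what "a format suitable for GPT-2" can look like, the sketch below wraps each row of a single-column CSV in start and end markers. It is an assumption-laden illustration, not the body of encode_csv: the function `encode_rows`, its `header` flag, and the two token strings are chosen for this example.

```python
import csv
import io

# Assumed delimiter tokens for this sketch.
START_TOKEN = "<|startoftext|>"
END_TOKEN = "<|endoftext|>"


def encode_rows(csv_text: str, header: bool = True) -> str:
    """Wrap the first column of each CSV row in start/end tokens.

    Hypothetical helper for illustration only.
    """
    reader = csv.reader(io.StringIO(csv_text))
    if header:
        # Skip the header row if one is present.
        next(reader, None)
    # One delimited document per non-empty row.
    lines = [START_TOKEN + row[0] + END_TOKEN for row in reader if row]
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    sample = 'text\nfirst document\n"second, with a comma"\n'
    print(encode_rows(sample), end="")
```

Delimiting each row this way lets a model trained on the output learn where one document ends and the next begins.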

Initial release

19 Apr 00:19
v0.1

README + setup.py cleanup