345M model support

@minimaxir minimaxir released this 05 May 05:39
· 104 commits to master since this release
583fdb0
  • Support the 345M model (thanks to Neil Shepperd for the gradient checkpointing implementation!)
  • Support model_name in the CLI, so the 345M model can be selected from the command line
  • Support run_name in the CLI
  • Support .csv files as an input dataset for finetune (the CSV is parsed as if it had been passed through encode_csv()).
  • Fix one off issues (#21)
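
To illustrate the CSV handling mentioned above, here is a minimal sketch of the general idea: take the first column of each row and wrap it in start and end tokens so that each row is treated as a separate document. This is not the library's code. The helper name `wrap_csv_rows`, the token strings, and the header handling are assumptions made for illustration and may differ from what encode_csv() actually does.

```python
import csv
import io

# Assumed delimiter tokens; check the library source for the real defaults.
START_TOKEN = "<|startoftext|>"
END_TOKEN = "<|endoftext|>"


def wrap_csv_rows(csv_text, header=True):
    """Hypothetical helper: wrap the first column of each CSV row in tokens."""
    reader = csv.reader(io.StringIO(csv_text))
    if header:
        next(reader, None)  # skip the header row if one is present
    # Skip empty rows; keep only the first column of each remaining row.
    return [START_TOKEN + row[0] + END_TOKEN for row in reader if row]


sample = 'text\nfirst document\n"second, with a comma"\n'
encoded = wrap_csv_rows(sample)
print("\n".join(encoded))
```

Using the csv module rather than splitting on commas keeps quoted fields that contain commas or newlines intact, which matters for free-text datasets.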