picoGPT in J

J port of jaymody/picoGPT, "An unnecessarily tiny implementation of GPT-2 in NumPy."

Install

Download J (tested with J903 and base library 9.03.08; see the wiki for installation instructions). You will also need the convert/pjson addon:

NB. If nothing is printed, the addon is already installed
install 'convert/pjson'
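
The addon can then be loaded with require (gpt2.ijs presumably does this for you; the line below is only a quick check that the install worked):

NB. Should load silently if the addon is installed
require 'convert/pjson'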

Download models

Get the GPT-2 models you want by running the download.sh script (e.g. ./download.sh 124M). If you can't run it, first make a models/ directory and download the tokenizer files (vocab.json and merges.txt) into it.
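
For example, with curl (the URLs below point at the HuggingFace gpt2 repository and are an assumption; any copy of the standard GPT-2 tokenizer files will do):

# Tokenizer files shared by all model sizes
mkdir -p models
curl -L -o models/vocab.json https://huggingface.co/gpt2/resolve/main/vocab.json
curl -L -o models/merges.txt https://huggingface.co/gpt2/resolve/main/merges.txt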

Then download model.safetensors and config.json for the model you want and place them in the corresponding models/[model size] directory (e.g. models/124M).
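
For example, for the 124M model (again assuming the HuggingFace repositories; the larger checkpoints are published as gpt2-medium, gpt2-large, and gpt2-xl):

# Weights and hyperparameters for the 124M model
mkdir -p models/124M
curl -L -o models/124M/model.safetensors https://huggingface.co/gpt2/resolve/main/model.safetensors
curl -L -o models/124M/config.json https://huggingface.co/gpt2/resolve/main/config.json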

Usage

Run gpt2.ijs (e.g. jconsole gpt2.ijs). Then:

NB. Load model
model '124M'
NB. Generate 40 tokens by default
gen 'Alan Turing theorized that computers would one day become'

NB. Switch model
model '1558M'
NB. Generate 79 tokens. Assign the output to a variable to prevent it from
NB. being printed to the console twice.
out =. 79 gen 'The importance of nomenclature, notation, and language as tools of'

Notes

  • When the input length exceeds n_ctx, rather than throwing an exception, only the last n_ctx tokens are used (see the first sketch after this list).
  • Instead of a progress bar, tokens are printed as they're generated.
  • All calculations are done with 64-bit floats since J doesn't have 32-bit floats (not sure about 32-bit J, though).
  • The Safetensors format is used since it's easier to parse (see the second sketch after this list). This means checkpoints are downloaded from HuggingFace rather than OpenAI's Azure storage. Filenames are also different:
    • model.ckpt.* -> model.safetensors
    • hparams.json -> config.json
    • encoder.json -> vocab.json
    • vocab.bpe -> merges.txt
  • Thanks to karpathy/minGPT for having a good explanation of the BPE tokenizer.
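
The truncation mentioned in the first note can be sketched in a couple of lines (illustrative only; these are not the verbs gpt2.ijs actually uses):

NB. Keep at most the last n_ctx tokens of the prompt
n_ctx =. 1024
toks =. i. 2000
$ (- n_ctx <. # toks) {. toks   NB. 1024

And the Safetensors note comes down to the file layout being trivial: an 8-byte little-endian header length, a JSON header, then raw tensor bytes. A minimal sketch, assuming the standard library fread and the 124M paths from above:

NB. Read the whole file as bytes
data =. fread 'models/124M/model.safetensors'
NB. First 8 bytes: little-endian length of the JSON header
hlen =. 256 #. |. a. i. 8 {. data
NB. JSON header describing each tensor's dtype, shape, and byte offsets
hdr =. hlen {. 8 }. data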
