A Swift port of Andrej Karpathy's llm.c. The C code was ported with the changes required by differences between the two ecosystems, mainly file I/O and parallelization. The latter uses Grand Central Dispatch instead of OpenMP.
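As a rough illustration of that substitution, the sketch below shows how a loop that C code would mark with `#pragma omp parallel for` can be expressed with `DispatchQueue.concurrentPerform`. The function `scaledInParallel` and its per-element workload are hypothetical and not taken from this repository; the actual port may split its loops differently.

```swift
import Dispatch

/// Hypothetical example: multiplies every element of `input` by `factor`,
/// distributing the iterations over the available cores.
func scaledInParallel(_ input: [Float], by factor: Float) -> [Float] {
    var output = [Float](repeating: 0, count: input.count)
    output.withUnsafeMutableBufferPointer { buffer in
        // Local copy of the buffer pointer so the closure below does not
        // capture the inout parameter.
        let out = buffer
        // concurrentPerform returns only after all iterations have finished,
        // much like the implicit barrier at the end of an OpenMP parallel for.
        DispatchQueue.concurrentPerform(iterations: input.count) { i in
            // Each iteration writes to its own index, so no locking is needed.
            out[i] = input[i] * factor
        }
    }
    return output
}
```

Calling `scaledInParallel([1, 2, 3, 4], by: 2)` yields `[2.0, 4.0, 6.0, 8.0]`. For very cheap loop bodies like this one, handing each iteration a chunk of indices rather than a single index usually reduces scheduling overhead.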
- Clone llm.c and follow the instructions in the quick start (CPU) section of its README. This will get you the dataset, the tokens, the small GPT-2 model (124M) released by OpenAI, and two executables for testing and training.
- Clone this repository, cd into it, then build and run the executables for testing and training.
    git clone https://github.com/otabuzzman/llm.swift.git
    cd llm.swift

    # build for production
    xcodebuild -scheme llm.swift -configuration Release \
      SWIFT_ACTIVE_COMPILATION_CONDITIONS="$SWIFT_ACTIVE_COMPILATION_CONDITIONS LLMSWIFT_STANDALONE"

    # usage:
    #   test_gpt2 [ <llm.c folder> ]
    #   train_gpt2 [ <llm.c folder> ]
    ./test_gpt2 ../llm.c # assuming llm.c in sibling folder
    ./train_gpt2 ../llm.c
The samples.md file contains the output of llm.swift, captured from the first working version (without Metal) running on a 2015 MacBook Air. The output of llm.c (with OpenMP) is included for comparison.
Andrej Karpathy - (llm.c)
Copyright (c) 2024 Andrej Karpathy - MIT License
Metal implementation
James Thompson - (Metal implementation)
Copyright (c) 2024 James Thomson - MIT License
Adopted a concept shared by @regrettable-username in his Metal port llm.metal.