Issues: karpathy/build-nanogpt
Why does weights sharing have to be in that direction? #89 · opened Jan 16, 2025 by BetterWang · updated Jan 16, 2025
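For context on what this issue is asking about: the token-embedding matrix and the output projection are tied to a single parameter, and the question is which side of the assignment should be overwritten. A minimal sketch of the tying pattern, using GPT-2-style names and sizes rather than the repository's exact code:

```python
import torch.nn as nn

# Illustrative GPT-2 sized shapes.
vocab_size, n_embd = 50257, 768

wte = nn.Embedding(vocab_size, n_embd)               # token embeddings, (vocab_size, n_embd)
lm_head = nn.Linear(n_embd, vocab_size, bias=False)  # output head, weight also (vocab_size, n_embd)

# Weight sharing: after this line both modules point at one Parameter object,
# so gradients and updates flow through a single tensor. The direction of the
# assignment only decides whose original Parameter survives (here lm_head's).
wte.weight = lm_head.weight
assert wte.weight.data_ptr() == lm_head.weight.data_ptr()
```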
How Can I extract Last Layer Representation? #86 · opened Oct 31, 2024 by shantanu778 · updated Oct 31, 2024
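A hedged sketch of one way to pull the last-layer hidden states (the activations after the final LayerNorm, before the lm_head projection) from a GPT-2-style model. The module names transformer.wte / wpe / h / ln_f follow the nanoGPT naming convention and are assumptions about the model object:

```python
import torch

@torch.no_grad()
def last_layer_representation(model, idx):
    # idx: (B, T) tensor of token ids; returns (B, T, n_embd) hidden states.
    B, T = idx.size()
    pos = torch.arange(0, T, dtype=torch.long, device=idx.device)
    x = model.transformer.wte(idx) + model.transformer.wpe(pos)  # token + position embeddings
    for block in model.transformer.h:                            # transformer blocks
        x = block(x)
    return model.transformer.ln_f(x)                             # final LayerNorm, no lm_head
```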
Avoid tiktoken.decode panic on unknown tokens. #81 · opened Aug 26, 2024 by IggShaman · updated Aug 26, 2024
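One defensive pattern for the problem described here: if sampling can produce ids outside the tokenizer's vocabulary (for example when the model's vocab size is padded past 50257), filter them out before decoding. The tiktoken calls below are the real API; the safe_decode helper and the example ids are illustrative:

```python
import tiktoken

enc = tiktoken.get_encoding("gpt2")

def safe_decode(token_ids):
    # Drop ids the GPT-2 tokenizer does not know about instead of letting
    # enc.decode() raise from inside the Rust core.
    valid = [t for t in token_ids if 0 <= t < enc.n_vocab]
    return enc.decode(valid)

print(safe_decode([15496, 995, 50304]))  # 50304 >= enc.n_vocab, so it is skipped
```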
torch.compile-d models do not work with example generation and hellaswag eval #79 · opened Aug 26, 2024 by IggShaman · updated Aug 26, 2024
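For readers hitting the same thing: a common workaround is to keep a handle to the uncompiled module and route sampling / HellaSwag evaluation through it, so only the training step goes through the compiled graph. A self-contained sketch with a stand-in model, not the repository's GPT class:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 8), nn.GELU(), nn.Linear(8, 8))  # stand-in for GPT
uncompiled_model = model                 # keep a reference to the eager module
model = torch.compile(model)             # compiled wrapper used for training steps only

x = torch.randn(2, 8)
loss = model(x).sum()                    # training path: compiled
loss.backward()

with torch.no_grad():
    y = uncompiled_model(x)              # generation / eval path: eager, avoids graph breaks
```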
Cannot get the log file "log124M_40B/log.txt"? #47 · opened Jul 1, 2024 by dtdo90 · updated Jul 19, 2024
Text generation can use raw_model instead of model #56 · opened Jul 10, 2024 by sapphire008 · updated Jul 10, 2024
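The pattern this issue refers to: under DistributedDataParallel the training script keeps an unwrapped reference to the underlying module, and generation can use that same reference instead of the DDP wrapper. A minimal sketch of the unwrapping idiom; the generate() helper mentioned in the usage comment is hypothetical:

```python
from torch.nn.parallel import DistributedDataParallel as DDP

def unwrap(model):
    # DDP hides the user's module under .module; generation, checkpointing and
    # weight-decay configuration usually want the raw module underneath.
    return model.module if isinstance(model, DDP) else model

# usage sketch:
# raw_model = unwrap(model)
# tokens = generate(raw_model, prompt_tokens)   # hypothetical generate() helper
```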
Running codes on Windows issues #45 · opened Jun 30, 2024 by gerardaristizabalpla4 · updated Jul 9, 2024
Sharding the dataset not completing? #25 · opened Jun 14, 2024 by dustinwloring1988 · updated Jul 9, 2024
How to support padding in the train dataset for training? #49 · opened Jul 2, 2024 by mrhimanshu · updated Jul 9, 2024
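If padding is introduced, the loss usually needs to ignore the padded positions. A hedged sketch of the standard PyTorch mechanism, cross_entropy with ignore_index; the pad value of -1 and the tensor shapes are illustrative, not something the repository defines:

```python
import torch
import torch.nn.functional as F

B, T, vocab_size = 2, 5, 50257
logits = torch.randn(B, T, vocab_size)

# Targets padded with -1 where a sequence is shorter than T.
targets = torch.randint(0, vocab_size, (B, T))
targets[1, 3:] = -1  # pretend the second sequence only has 3 real tokens

# ignore_index makes the padded positions contribute nothing to the loss.
loss = F.cross_entropy(logits.view(-1, vocab_size), targets.view(-1), ignore_index=-1)
```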
Integrating GPT-2 with deepspeed Zero-1, Zero-2 and Zero-3 #48 · opened Jul 1, 2024 by Devadeut · updated Jul 8, 2024
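For anyone exploring the same integration: a minimal, hedged sketch of what wrapping a model with DeepSpeed ZeRO typically looks like. The config values are placeholders, the nn.Linear is a stand-in for the repo's GPT module, and this replaces the script's own DDP/optimizer setup rather than adding to it; it is meant to be run under the deepspeed launcher:

```python
import torch.nn as nn
import deepspeed

model = nn.Linear(8, 8)  # stand-in for the GPT module built elsewhere

# Illustrative config only; batch sizes, lr and ZeRO stage are placeholders.
ds_config = {
    "train_micro_batch_size_per_gpu": 16,
    "gradient_accumulation_steps": 4,
    "zero_optimization": {"stage": 1},     # 1, 2 or 3
    "bf16": {"enabled": True},
    "optimizer": {"type": "AdamW", "params": {"lr": 6e-4, "weight_decay": 0.1}},
}

engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)
# per step: loss = engine(x, y); engine.backward(loss); engine.step()
```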
Different inference results between flash attention and manually implemented attention appeared. #50 · opened Jul 2, 2024 by Jaeckel-d · updated Jul 2, 2024
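A small, self-contained way to see how close the two paths are: compare F.scaled_dot_product_attention against the manual softmax(QKᵀ/√d)V computation on the same random tensors. Exact bitwise equality is not expected, especially in fp16/bf16, which is usually what differences like the ones reported here come down to:

```python
import math
import torch
import torch.nn.functional as F

torch.manual_seed(0)
B, nh, T, hs = 2, 4, 16, 32                    # batch, heads, seq len, head size
q, k, v = (torch.randn(B, nh, T, hs) for _ in range(3))

# Fused / flash path (dispatches to an efficient kernel when available).
y_flash = F.scaled_dot_product_attention(q, k, v, is_causal=True)

# Manual path, as in a pre-flash implementation.
att = (q @ k.transpose(-2, -1)) / math.sqrt(hs)
mask = torch.tril(torch.ones(T, T, dtype=torch.bool))
att = att.masked_fill(~mask, float("-inf"))
y_manual = F.softmax(att, dim=-1) @ v

print(torch.max((y_flash - y_manual).abs()))   # small but usually nonzero
```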
Consider using torch.compile(model, fullgraph=True, mode="reduce-overhead") #6 · opened Jun 11, 2024 by lezcano · updated Jun 18, 2024
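For reference, the suggestion amounts to passing extra arguments to the existing compile call; whether fullgraph and CUDA-graph capture actually hold up here depends on graph breaks in the training loop, so treat this as an experiment rather than a drop-in. A sketch with a stand-in model:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 8), nn.GELU(), nn.Linear(8, 8))  # stand-in model

# fullgraph=True errors out on any graph break instead of silently falling back;
# mode="reduce-overhead" enables CUDA graphs to cut per-step launch overhead.
model = torch.compile(model, fullgraph=True, mode="reduce-overhead")
```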
Embeddings are initialized with std of 0.02 #18 · opened Jun 12, 2024 by eryk-mazus · updated Jun 13, 2024
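The initialization this issue is talking about, in sketch form: a GPT-2-style init function that hits both nn.Linear and nn.Embedding weights with a normal of std 0.02. The function name and the toy model are illustrative, not the repository's exact code:

```python
import torch
import torch.nn as nn

def init_weights(module):
    # GPT-2 style init: N(0, 0.02) for linear and embedding weights, zero bias.
    if isinstance(module, nn.Linear):
        torch.nn.init.normal_(module.weight, mean=0.0, std=0.02)
        if module.bias is not None:
            torch.nn.init.zeros_(module.bias)
    elif isinstance(module, nn.Embedding):
        torch.nn.init.normal_(module.weight, mean=0.0, std=0.02)

model = nn.Sequential(nn.Embedding(50257, 768), nn.Linear(768, 768))
model.apply(init_weights)
```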