Training script #31
Merged · +2,033 −38
Changes from 1 commit (150 commits total)
Commits:
5aa939a add llama2.c submodule (jannik-brinkmann)
27e7df4 rename submodule to avoid import errors (jannik-brinkmann)
f9dbff2 add llama2.c wrapper (jannik-brinkmann)
1b394e0 draft training.py (jannik-brinkmann)
9fba402 updated draft (jannik-brinkmann)
e11de88 Adding a Mamba Class (SrGonao)
6287121 Moving stuff to the correct place (SrGonao)
e42a8db Not ready but idea there (SrGonao)
6642256 updated training script (jannik-brinkmann)
b9cdc78 formatting (jettjaniak)
294e792 remove gitmodules (jettjaniak)
c95c28b moved llama2c submodule (jettjaniak)
27891b5 llama2c update (jettjaniak)
cd2c5f7 fix import (jettjaniak)
145f8aa remove unused files (jettjaniak)
fe834b0 rename training_old -> training (jettjaniak)
f7eacd7 Moved Mamba (SrGonao)
a88be1f Added type hinting (SrGonao)
0d356af Removed not needed file (SrGonao)
646b3e7 Removed compile, amp train fp32 (SrGonao)
b04975a fixing black and isort (SrGonao)
2b056ca add submodules to checkout in CI (jettjaniak)
23f8a55 pyproject.toml, moved isort cfg, excl. llama2c (jettjaniak)
f7cc6b7 isort: llama2c known_third_party (jettjaniak)
e863604 limit pytest to tests/ directory (jettjaniak)
a00425b Training_script_refactor (#54) (jaidhyani)
2aa2ea6 It's actually a script now (jaidhyani)
ce4a6ca lol copypasting (jaidhyani)
83496e2 cleanup (jaidhyani)
30a1b30 Adding support for config files (jaidhyani)
4098036 comments (jaidhyani)
96f2361 flag arguments take priority over config file values (jaidhyani) — see the config sketch after this list
584c55d comments (jaidhyani)
d79d50c gitignore .DS_Store file on macos (jaidhyani)
9e1e9d8 remove training.sh (jaidhyani)
27f5d43 meeting notes and tweaks (jaidhyani)
e542237 configurable device (jaidhyani)
0524030 Adding mamba implementation (SrGonao)
29e986e mamba hacks, please forgive me (jaidhyani)
d8831bd experimenting with cuda support in gh actions (jaidhyani)
95d534d welp, that didn't work (jaidhyani)
8716bf7 remove tokenized_chunks_dataset (jaidhyani)
7352eec separate batch ordering and torch seeds (jaidhyani)
c3e5ef7 remove mamba.py (jaidhyani)
7cb4ca7 refactoring (jaidhyani)
7213aa4 rm TODO (jaidhyani)
a8f7143 refactoring (jaidhyani)
e038f31 bughunt (jaidhyani)
59fce94 debugger config (jaidhyani)
734b92e typing improvements and bugfixes (jaidhyani)
2ed386c add support for "x.y.z = val" style config (jaidhyani) — see the config sketch after this list
c928d3b first steps towards Llama2HF support (jaidhyani)
54b095a more debugging stuff (jaidhyani)
fee0497 initial HF llama2 support (jaidhyani)
2665633 debug more (jaidhyani)
2f65438 Add support for preset model configs in script, specifying multiple c… (jaidhyani)
4c64774 bughunt (jaidhyani)
a9ac3dd fix beartype Callalble deprecation warning (jaidhyani)
3d7711a rm llamaconfig json accidentally added before (jaidhyani)
c4e69d2 asdf (jaidhyani)
656228c script tweaks (jaidhyani)
27fdc79 better gigaconfig defaults (jaidhyani)
398f1de debug config is now just another preset; better documentation for sub… (jaidhyani)
366b4b5 fix imports (jaidhyani)
d4a81e8 remove upload_tokens (jaidhyani)
5dc23e6 Whoops. I should probably test things more before pushing them. (jaidhyani)
2f1a0a4 cleanup (jaidhyani)
1f37228 script tweaks (jaidhyani)
adfd4b4 added support for prioritizing configs (jaidhyani)
e3b326c refactoring (config_utils) to support notebook use (jaidhyani)
551a8de fix Llama2ConfigData bug in gigaconfig (use default_factory) (jaidhyani)
cd9a5b1 make run_training return ModelTrainingState (jaidhyani)
ab19879 more config_utils (jaidhyani)
bc8b43d cleanup run_training script (jaidhyani)
859ae09 training_demo notebook (for colab) (jaidhyani)
fc3f021 static files tweak (jaidhyani)
70e82ee estimate_mfu for llama2hf (jaidhyani)
697f729 Don't break if model export not available (jaidhyani)
a8f7a4f 100k quick config (jaidhyani)
997ec3a torch.use_deterministic_algorithms for training (jaidhyani)
f9bd899 import Callable from collections.abc (jaidhyani)
89cee7c Move up torch.manual_seed before calling anything in torch (jaidhyani)
698365f add wandb to requirements (jaidhyani)
7f6c180 factor out training config package + wandb_config (jaidhyani)
662555d unused import (jaidhyani)
0699026 isort (jaidhyani)
4007b6a initial mamba support (jaidhyani)
594033e pip install wheel (jaidhyani)
cc010da pip install packaging (jaidhyani)
bfb28c1 come on, mamba_ssm, get it together (jaidhyani)
f85d015 requirements-nocuda.txt for gh actions (jaidhyani)
3c0010b Merge branch 'main' into training-script (jaidhyani)
4870d92 mv ModelTypes to constants (jaidhyani)
9eeb960 deprecate llama2c support (jaidhyani)
02ec1a1 clear out more llama2c stuff (jaidhyani)
6cb9d52 we still need max_seq_len (jaidhyani)
35cb7c4 factoring out optimizer params from config (jaidhyani)
5e10db2 fix broken test (jaidhyani)
af6c0db model_args overhaul (jaidhyani)
2a9d2c2 rm llama2c (jaidhyani)
1225d93 replace DataLoader (jaidhyani)
a9a791b run_dir to gigaconfig; output_run_dir; fix Generator type warning (jaidhyani)
10b1a36 save results when training is done (jaidhyani)
fbfeaa5 save step in results (jaidhyani)
5a54078 include architecture and priority in llama preset configs (jaidhyani)
19e779f Merge branch 'training-script' into mamba_dev (jaidhyani)
a98aa42 update training demo (jaidhyani)
53a6adf mamba expectedly imports correctly (jaidhyani)
0d69ede rm export_model (jaidhyani)
e24fec4 estimate_loss no longer depends on architecture (jaidhyani)
b9f682c add combine_configs (working towards frozen config for type safety) (jaidhyani)
13db6eb renaming/simplification (jaidhyani)
15983a3 model_config refactor to approach type safety + frozen dataclasses (jaidhyani)
ac8fb6a rm architectures.py (jaidhyani)
43dd1b2 new config system with type safety! (jaidhyani)
6a21f2c Support for optional config types (mamba and llama) (jaidhyani)
be2354f fix sample configs (jaidhyani)
c128f96 remove some unused model config args (jaidhyani)
28076ed remove unused mamba.py (jaidhyani)
384b140 I thought I already deleted this? (jaidhyani)
5f64802 rename to "initialize_model_training_state" (jaidhyani)
e4ecd21 Support for mandatory fields in run_training (jaidhyani)
a1a150d ModelTypes (jaidhyani)
ec04579 output_dir is output_dir (jaidhyani)
fb0c509 cleaner imports (jaidhyani)
129587d error if ModelConfig doesn't include config for chosen model type (jaidhyani)
ab208aa no-op renames for clarity (jaidhyani)
076cd8b log levels (jaidhyani)
be352fe shebang & chmod +x on run_training.py (jettjaniak)
10d718e renamed corpus dataset (jettjaniak)
77c15ef removed llama2c references from pyproject.toml (jettjaniak)
a8d03a8 removed .gitmodules (jettjaniak)
2d86063 removed scripts/upload_stories.py (jettjaniak)
bed480c test wandb_utils (jaidhyani)
f168310 no llama2c, no .view, no need for enforcing contigious tensors (jaidhyani)
6a67a02 Fix _unoptionalize (jaidhyani)
52745bc run_training.py --help when no args (jaidhyani)
9e759a3 script improvements: model-specific args moved to their own --help; f… (jaidhyani)
5dfdb94 rename llama to llama2 (jaidhyani)
12dc6c4 unused imports (jaidhyani)
a81f9e5 set run name from config file (jaidhyani)
63cb45b set default output_dir based on run_name (jaidhyani)
be3069d remove in-progress testing file added by mistake (jaidhyani)
d15f467 add huggingface config (jaidhyani)
9c27671 fix config json that got broken somehow (jaidhyani)
c259e97 save/load fix + huggingface uploading (jaidhyani)
0a04b99 fix test that broken when renaming llama to llama2 (jaidhyani)
083cb1b unused import (jaidhyani)
580d3c6 fix validation sampling (jaidhyani)
14dc55e remove eval_only (jaidhyani)
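Two of the config-related commits above name concrete behaviors: flag arguments take priority over config file values (96f2361), and configs can be updated with dotted "x.y.z = val" style keys (2ed386c). The sketch below is a hypothetical illustration of that merge logic, not the PR's actual code: `set_dotted` is an invented helper, and `combine_configs` only borrows its name from commit b9f682c; the repository's real config_utils may differ.

```python
from typing import Any


def set_dotted(config: dict[str, Any], dotted_key: str, value: Any) -> None:
    """Set config["x"]["y"]["z"] = value given the dotted key "x.y.z"."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})  # create intermediate dicts as needed
    node[leaf] = value


def combine_configs(
    file_values: dict[str, Any], flag_values: dict[str, Any]
) -> dict[str, Any]:
    """Merge config-file values with CLI flag values; flags win on conflict."""
    merged: dict[str, Any] = {}
    for dotted_key, value in file_values.items():
        set_dotted(merged, dotted_key, value)
    for dotted_key, value in flag_values.items():  # applied last, so flags take priority
        set_dotted(merged, dotted_key, value)
    return merged


if __name__ == "__main__":
    merged = combine_configs(
        {"model.hidden_dim": 256, "train.lr": 1e-3},
        {"train.lr": 3e-4},  # flag overrides the file's train.lr
    )
    print(merged)  # {'model': {'hidden_dim': 256}, 'train': {'lr': 0.0003}}
```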
Viewing changes from commit fe834b03e72f588bef0203f045e45bf0a620b451: rename training_old -> training
File renamed without changes.
The main training function: given a GigaConfig, it performs setup and runs the training loop. Most of the actual logic lives in train_step.
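For orientation, here is a minimal, hypothetical sketch of the shape that comment describes: run_training takes a config, does setup (seeding torch before any other torch call, per commit 89cee7c), iterates train_step, and returns a ModelTrainingState (per commit cd9a5b1). Everything else here — the class fields, the toy model, the loss — is an illustrative stand-in, not the PR's actual implementation.

```python
from dataclasses import dataclass

import torch


@dataclass
class GigaConfigSketch:
    """Stand-in for the PR's GigaConfig; these field names are illustrative only."""
    max_steps: int = 100
    lr: float = 1e-3
    torch_seed: int = 42
    device: str = "cpu"


@dataclass
class ModelTrainingStateSketch:
    """Stand-in for ModelTrainingState, the value run_training returns."""
    model: torch.nn.Module
    optimizer: torch.optim.Optimizer
    step: int = 0


def train_step(state: ModelTrainingStateSketch, config: GigaConfigSketch) -> None:
    # Most of the actual logic lives here: forward pass, loss, backward, optimizer step.
    x = torch.randn(8, 4, device=config.device)
    loss = state.model(x).pow(2).mean()  # toy objective, purely for illustration
    state.optimizer.zero_grad()
    loss.backward()
    state.optimizer.step()


def run_training(config: GigaConfigSketch) -> ModelTrainingStateSketch:
    # Setup: seed torch first, then build the model/optimizer state.
    torch.manual_seed(config.torch_seed)
    model = torch.nn.Linear(4, 1).to(config.device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=config.lr)
    state = ModelTrainingStateSketch(model=model, optimizer=optimizer)
    # The training loop itself just iterates train_step.
    for step in range(config.max_steps):
        state.step = step
        train_step(state, config)
    return state


if __name__ == "__main__":
    final_state = run_training(GigaConfigSketch())
    print(f"finished at step {final_state.step}")
```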