Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 54 changed files with 5,165 additions and 99 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
name: CI | ||
|
||
on: | ||
pull_request: | ||
branches: | ||
- main | ||
push: | ||
branches: | ||
- main | ||
jobs: | ||
check: | ||
name: Lint and check types | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v4 | ||
- name: Install uv | ||
uses: astral-sh/setup-uv@v4 | ||
- name: 'Set up Python' | ||
uses: actions/setup-python@v5 | ||
with: | ||
python-version-file: '.python-version' | ||
- name: Lint | ||
run: make lint | ||
- name: Check formatting | ||
run: uv run ruff format src tests --check | ||
- name: Check types | ||
run: make check_types | ||
|
||
test: | ||
name: Test Python ${{ matrix.python-version }} | ||
runs-on: ubuntu-latest | ||
strategy: | ||
matrix: | ||
python-version: | ||
- '3.10' | ||
- '3.11' | ||
- '3.12' | ||
steps: | ||
- uses: actions/checkout@v4 | ||
- name: Install uv and set the python version | ||
uses: astral-sh/setup-uv@v4 | ||
with: | ||
python-version: ${{ matrix.python-version }} | ||
- name: Set up Python | ||
run: uv python install | ||
- name: Install the project | ||
run: uv sync | ||
- name: Run unit tests | ||
run: make test_unit | ||
- name: Run integration tests | ||
run: make test_integration | ||
- name: Run e2e tests | ||
run: make test_e2e |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
io: | ||
name_model: 8k_7mi_360m_1d_s_pt1_ft | ||
output_dir: /data/output # <- set this! | ||
dataset_dir: /data/datasets/clevr # <- set this! | ||
dataset_id: clevr | ||
dataset_args: | ||
target_mode: a | ||
qiqa_loss_mask: [0.0, 0.0, 0.0, 1] | ||
answer_categorical: true | ||
resize_to_w: 62 | ||
resize_to_h: 41 | ||
crop_h_perc: 0.1 | ||
crop_w_perc: 0.1 | ||
eom_token_id: 129 | ||
som_text_token_id: 130 | ||
som_image_token_id: 131 | ||
downsample_channels: null | ||
shift_channels_start: null | ||
num_models_to_save: 5 | ||
validate_amount: 100 | ||
log_train_loss_amount: 1000 | ||
description: >- | ||
Describe this! | ||
params: | ||
num_tokens: 256 | ||
pad_token_id: 128 | ||
input_seq_len: 8192 | ||
seq_lens: [8192] | ||
hidden_dims: [1024] | ||
num_layers: [54] | ||
train_checkpoint_chunks: null | ||
block: | ||
d_state: 128 | ||
d_conv: 4 | ||
expand: 2 | ||
headdim: 64 | ||
dropout: 0.1 | ||
patch_pos_emb_type: null | ||
train: | ||
target_elements: 7_000_000 | ||
target_elements_strategy: batch | ||
batch_size: 6 | ||
max_eval_steps: 1000 | ||
shuffle_train: true | ||
learning_rate: 0.0001 | ||
gradient_clipping: 0.5 | ||
gradient_accumulate_every: 48 | ||
resume: | ||
checkpoint_file: /path-to-checkpoint.pth | ||
next_batch_index: 0 | ||
next_epoch_index: 0 | ||
migrate_embeddings: false | ||
rename_modules: true | ||
resumed_from: null |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
io: | ||
name_model: 8k_30b_360m_1d_t | ||
output_dir: /data/output # <- set this! | ||
dataset_dir: /data/datasets/pg19 # <- set this! | ||
dataset_id: pg19 | ||
num_models_to_save: 5 | ||
validate_amount: 100 | ||
log_train_loss_amount: 1000 | ||
params: | ||
num_tokens: 256 | ||
pad_token_id: 0 | ||
input_seq_len: 8192 | ||
seq_lens: [8192] | ||
hidden_dims: [1024] | ||
num_layers: [42] | ||
train_checkpoint_chunks: null | ||
block: | ||
attn_head_dims: 64 | ||
attn_num_heads: 16 | ||
attn_use_rot_embs: true | ||
attn_dropout: 0 | ||
use_flash_attn: true | ||
patch_pos_emb_type: fixed | ||
train: | ||
target_elements: 30_000_000_000 | ||
target_elements_strategy: sequence | ||
batch_size: 4 | ||
shuffle_train: false | ||
learning_rate: 0.001 | ||
gradient_clipping: 1 | ||
gradient_accumulate_every: 12 |
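The `batch_size` and `gradient_accumulate_every` values in this config combine into an effective batch size per optimizer step. The following is a minimal sketch of that arithmetic, assuming `gradient_accumulate_every` counts micro-batches accumulated before each optimizer step and ignoring any scaling by the number of devices; neither assumption is confirmed by this diff, and the helper name `effective_batch_size` is hypothetical.

```python
def effective_batch_size(batch_size: int, gradient_accumulate_every: int) -> int:
    """Sequences contributing to one optimizer step on a single device.

    Assumption: gradients from `gradient_accumulate_every` micro-batches are
    summed before the optimizer steps (not confirmed by the diff).
    """
    return batch_size * gradient_accumulate_every


# Values copied from the config above (name_model: 8k_30b_360m_1d_t).
batch_size = 4
gradient_accumulate_every = 12
input_seq_len = 8192

sequences_per_step = effective_batch_size(batch_size, gradient_accumulate_every)
elements_per_step = sequences_per_step * input_seq_len

print(sequences_per_step)  # 48
print(elements_per_step)   # 393216
```

Under the same assumption, the two-stage configs in this commit (`batch_size: 6`, `gradient_accumulate_every: 8`) also come out at 48 sequences per step, which suggests the values were chosen to keep the effective batch size constant across architectures.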
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
io: | ||
name_model: 8k_30b_360m_2d_ss | ||
output_dir: /data/output # <- set this! | ||
dataset_dir: /data/datasets/pg19 # <- set this! | ||
dataset_id: pg19 | ||
num_models_to_save: 5 | ||
validate_amount: 100 | ||
log_train_loss_amount: 1000 | ||
params: | ||
num_tokens: 256 | ||
pad_token_id: 0 | ||
input_seq_len: 8192 | ||
seq_lens: [1024, 8] | ||
hidden_dims: [1024, 1024] | ||
num_layers: [28, 24] | ||
train_checkpoint_chunks: null | ||
block: | ||
d_state: 128 | ||
d_conv: 4 | ||
expand: 2 | ||
headdim: 64 | ||
patch_pos_emb_type: null | ||
train: | ||
target_elements: 30_000_000_000 | ||
target_elements_strategy: sequence | ||
batch_size: 6 | ||
shuffle_train: false | ||
learning_rate: 0.001 | ||
gradient_clipping: 1 | ||
gradient_accumulate_every: 8 |
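In the configs in this commit, the entries of `seq_lens` multiply out to `input_seq_len`: `[8192]` for the single-stage models and `[1024, 8]` for the two-stage ones. The sketch below checks that relationship before launching a run. It assumes the product rule is an intended constraint, which the diff does not state; `check_seq_lens` is a hypothetical helper, not part of the repository.

```python
from math import prod


def check_seq_lens(input_seq_len: int, seq_lens: list[int]) -> bool:
    """Return True when the per-stage lengths multiply out to the input length.

    Assumption: each stage splits the input into patches, so the product of
    all stage lengths should equal the full sequence length.
    """
    return prod(seq_lens) == input_seq_len


# Values copied from the configs in this commit.
single_stage_ok = check_seq_lens(8192, [8192])
two_stage_ok = check_seq_lens(8192, [1024, 8])

# A deliberately inconsistent example: 1024 * 4 = 4096, not 8192.
broken_ok = check_seq_lens(8192, [1024, 4])

print(single_stage_ok, two_stage_ok, broken_ok)
```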
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
io: | ||
name_model: 8k_30b_360m_2d_st | ||
output_dir: /data/output # <- set this! | ||
dataset_dir: /data/datasets/pg19 # <- set this! | ||
dataset_id: pg19 | ||
num_models_to_save: 5 | ||
validate_amount: 100 | ||
log_train_loss_amount: 1000 | ||
params: | ||
num_tokens: 256 | ||
pad_token_id: 0 | ||
input_seq_len: 8192 | ||
seq_lens: [1024, 8] | ||
hidden_dims: [1024, 1024] | ||
num_layers: [25, 21] | ||
train_checkpoint_chunks: null | ||
block: | ||
- d_state: 128 | ||
d_conv: 4 | ||
expand: 2 | ||
headdim: 64 | ||
patch_pos_emb_type: null | ||
- attn_head_dims: 64 | ||
attn_num_heads: 16 | ||
attn_use_rot_embs: true | ||
attn_dropout: 0 | ||
use_flash_attn: true | ||
patch_pos_emb_type: fixed | ||
train: | ||
target_elements: 30_000_000_000 | ||
target_elements_strategy: sequence | ||
batch_size: 6 | ||
shuffle_train: false | ||
learning_rate: 0.001 | ||
gradient_clipping: 1 | ||
gradient_accumulate_every: 8 |
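This config is the only one in the commit where `block` is a list, with one entry per stage (a state-space block followed by an attention block). A quick consistency check over the per-stage lists might look like the sketch below. It assumes `block` is nested under `params` (the flattened diff has lost the YAML indentation) and that a single mapping for `block` means the same block type is reused for every stage; `stage_count_mismatches` is a hypothetical helper.

```python
def stage_count_mismatches(params: dict) -> list[str]:
    """List the keys whose number of entries differs from len(seq_lens)."""
    n_stages = len(params["seq_lens"])
    problems = []
    for key in ("hidden_dims", "num_layers"):
        if len(params[key]) != n_stages:
            problems.append(key)
    block = params["block"]
    # Assumption: a single mapping is shared by all stages, so only a list
    # of blocks needs to match the number of stages.
    if isinstance(block, list) and len(block) != n_stages:
        problems.append("block")
    return problems


# Values copied from the config above (name_model: 8k_30b_360m_2d_st).
params = {
    "seq_lens": [1024, 8],
    "hidden_dims": [1024, 1024],
    "num_layers": [25, 21],
    "block": [
        {"d_state": 128, "d_conv": 4, "expand": 2, "headdim": 64,
         "patch_pos_emb_type": None},
        {"attn_head_dims": 64, "attn_num_heads": 16, "attn_use_rot_embs": True,
         "attn_dropout": 0, "use_flash_attn": True,
         "patch_pos_emb_type": "fixed"},
    ],
}

mismatches = stage_count_mismatches(params)
print(mismatches)  # []
```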
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
This folder contains example experiment configurations in YAML format. Pass one to the launch script, which starts training via torchrun: | ||
|
||
```sh | ||
bash src/mblm/scripts/train_launch.sh <config.yaml> | ||
``` |