c4 dataset
ahmeda14960 committed Apr 23, 2024
1 parent a98c70b commit 451173c
Showing 1 changed file with 2 additions and 5 deletions.
7 changes: 2 additions & 5 deletions config/llama2_7b_continued.yaml
@@ -1,9 +1,6 @@
 data:
-  cache_dir: "gs://levanter-data/tokenized/git-llama2/"
-  train_urls:
-    - gs://levanter-data/pile-domains/github/{00..29}.jsonl.zst
-  validation_urls:
-    - gs://levanter-data/pile-domains/github/val.jsonl.zst
+  id: allenai/c4
+  cache_dir: "gs://levanter-data/tokenized/llama2_c4/"
   tokenizer: "meta-llama/Llama-2-70b-hf"
 model:
   type: llama
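
The removed `train_urls` entry uses a `{00..29}` range to name 30 shards in one line. As a minimal sketch of what that shorthand stands for, assuming the usual bash-style brace-range semantics, the helper below expands such a pattern into explicit URLs. The function name `expand_range` is hypothetical and is not taken from Levanter; how Levanter itself resolves these patterns is not shown in this commit.

```python
import re


def expand_range(pattern: str) -> list[str]:
    """Expand the first bash-style numeric range, e.g. {00..29}, in a string.

    Hypothetical helper for illustration only; handles one ascending range.
    """
    match = re.search(r"\{(\d+)\.\.(\d+)\}", pattern)
    if match is None:
        # No range present: the pattern already names a single file.
        return [pattern]
    start, end = match.group(1), match.group(2)
    # Zero-pad only when an endpoint is written with a leading zero (as in 00).
    padded = any(len(s) > 1 and s.startswith("0") for s in (start, end))
    width = max(len(start), len(end)) if padded else 0
    prefix, suffix = pattern[: match.start()], pattern[match.end():]
    return [
        f"{prefix}{str(i).zfill(width)}{suffix}"
        for i in range(int(start), int(end) + 1)
    ]


urls = expand_range("gs://levanter-data/pile-domains/github/{00..29}.jsonl.zst")
print(len(urls))
print(urls[0])
print(urls[-1])
```

With the change in this commit, that explicit shard list is no longer needed in the config, since the data source is given by the single `id: allenai/c4` entry.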