Commit

WIP
ahmeda14960 committed Nov 20, 2024
1 parent 824b63c commit 02fd529
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion examples/count_tokens.py
@@ -2,4 +2,6 @@

a = JaggedArrayStore.open("gs://marin-us-central2/tokenized/dolma/algebraic-stack-cc00cf/train/input_ids", dtype=int)

-a.data_size
+a.data_size
+
+150,849,275
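
Note on the added line: a bare `150,849,275` in a Python file parses as the tuple `(150, 849, 275)`, not as an integer. A minimal sketch of how the figure could be recorded instead, assuming it is the value observed for `a.data_size`. `token_count` is a hypothetical stand-in for that lookup, so the snippet runs without access to the GCS bucket:

```python
# Hypothetical stand-in for `a.data_size`; the value is the figure noted in the commit.
token_count = 150_849_275

# The line as committed is a tuple expression, not an integer literal.
as_written = 150,849,275

# Format the integer with thousands separators for display.
formatted = f"{token_count:,}"
print(formatted)
```

Underscore digit separators keep the literal readable while leaving it a single `int`; the `,` format specifier reproduces the comma-grouped form for output only.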
