Commit

add token counter
ahmeda14960 committed Nov 18, 2024
1 parent 03aa29d commit 824b63c
Showing 1 changed file with 5 additions and 0 deletions.
5 changes: 5 additions & 0 deletions examples/count_tokens.py
@@ -0,0 +1,5 @@
from levanter.store import JaggedArrayStore

a = JaggedArrayStore.open("gs://marin-us-central2/tokenized/dolma/algebraic-stack-cc00cf/train/input_ids", dtype=int)

print(a.data_size)  # a bare expression prints nothing when run as a script
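
For readers unfamiliar with jagged arrays, the sketch below illustrates why a single size lookup can serve as a token counter. It is not Levanter's implementation: `ToyJaggedArray` is a hypothetical in-memory stand-in, and it assumes the common layout in which all rows are concatenated into one flat buffer alongside an offsets list, so the total element count is simply the last offset.

```python
from dataclasses import dataclass, field


@dataclass
class ToyJaggedArray:
    """Hypothetical stand-in for a jagged store: flat data plus row offsets."""

    data: list = field(default_factory=list)
    offsets: list = field(default_factory=lambda: [0])

    def append(self, row):
        # Rows of any length are concatenated into one flat buffer.
        self.data.extend(row)
        # Record where this row ends, which is also where the next one starts.
        self.offsets.append(len(self.data))

    @property
    def num_rows(self):
        return len(self.offsets) - 1

    @property
    def data_size(self):
        # Total number of stored elements, read from the final offset
        # without touching the data buffer itself.
        return self.offsets[-1]

    def row(self, i):
        return self.data[self.offsets[i]:self.offsets[i + 1]]


store = ToyJaggedArray()
for doc in ([101, 7, 9], [42], [5, 6, 7, 8]):  # three "tokenized documents"
    store.append(doc)

total_tokens = store.data_size
print(total_tokens)
```

Under this layout the count comes from metadata alone, which is presumably why the script above can report a token total for a large remote dataset without reading the token data.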
