Commit

Add sample.py
finetunej authored Apr 17, 2023
1 parent cc8a748 commit 04fde42
Showing 1 changed file with 13 additions and 0 deletions.
13 changes: 13 additions & 0 deletions sample.py
@@ -0,0 +1,13 @@
import sentencepiece as spm

s = spm.SentencePieceProcessor(model_file='novelai.model')

text = "The quick brown fox jumps over the goblin."

print("Text:", text)

print("Token IDs:", s.encode(text))
# Token IDs: [541, 1939, 6573, 22820, 22734, 712, 336, 34477, 49230]

print("Readable tokens:", s.encode(text, out_type=str))
# Readable tokens: ['The', '▁quick', '▁brown', '▁fox', '▁jumps', '▁over', '▁the', '▁goblin', '.']
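
The readable tokens above use '▁' (U+2581) to mark where a space preceded a piece. As a minimal sketch of how that marker maps back to plain text, the snippet below rebuilds the sentence from the piece list copied out of the comment in sample.py. It does not load `novelai.model`. `join_pieces` is a hypothetical helper introduced here for illustration and is not part of the commit. It also assumes every piece is ordinary text with no special or byte-level tokens.

```python
# Pieces copied from the "Readable tokens" comment in sample.py.
pieces = ['The', '▁quick', '▁brown', '▁fox', '▁jumps', '▁over', '▁the', '▁goblin', '.']

# '▁' (U+2581) is the marker SentencePiece writes in place of a space.
WORD_BOUNDARY = "\u2581"


def join_pieces(pieces):
    """Hypothetical helper: concatenate pieces and turn each marker back into a space."""
    return "".join(pieces).replace(WORD_BOUNDARY, " ")


restored = join_pieces(pieces)
print("Restored text:", restored)
```

With the model file available, `s.decode(s.encode(text))` should perform the equivalent round trip on token IDs. The manual join here is only meant to show what the marker stands for.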
