fix make_dummy_tokenizer
okdshin committed Nov 19, 2023
1 parent 2fac127 commit 4a0ef75
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions make_dummy_tokenizer.py
@@ -1,2 +1,4 @@
 import sentencepiece as spm
+from pathlib import Path
+Path("./dummy_tokenizer").mkdir(exist_ok=True)
 spm.SentencePieceTrainer.train(input="dummy_file", model_prefix='dummy_tokenizer/tokenizer', vocab_size=51200, byte_fallback=True)
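The change above creates the output directory before training, since the trainer writes `<model_prefix>.model` and `<model_prefix>.vocab` and `model_prefix` here points inside `dummy_tokenizer/`. A minimal sketch of the same idea, generalized so the directory is derived from the prefix rather than hard-coded; `ensure_output_dir` is a hypothetical helper, not part of this commit or of sentencepiece, and it uses only the standard library:

```python
import tempfile
from pathlib import Path


def ensure_output_dir(model_prefix: str) -> Path:
    """Create the directory that will hold <model_prefix>.model / .vocab.

    Hypothetical helper (not in the commit): derives the directory from the
    prefix instead of repeating the "dummy_tokenizer" name in two places.
    """
    out_dir = Path(model_prefix).parent
    # parents=True also creates missing intermediate directories;
    # exist_ok=True makes repeated runs a no-op instead of an error.
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        prefix = str(Path(tmp) / "dummy_tokenizer" / "tokenizer")
        first = ensure_output_dir(prefix)
        second = ensure_output_dir(prefix)  # safe to call again
        print(first == second, first.is_dir())
```

In the script itself this would be called as `ensure_output_dir('dummy_tokenizer/tokenizer')` just before `spm.SentencePieceTrainer.train(...)`. Note that the committed `mkdir(exist_ok=True)` omits `parents=True`, which is sufficient there because `./dummy_tokenizer` sits directly under the working directory.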
