llama : implement Unigram tokenizer needed by T5 and FLAN-T5 model families (#5763)

* llama : add T5 model architecture, tensors and model header parameters

* llama : add implementation of Unigram tokenizer with SentencePiece-like text normalization using precompiled charsmap

---------

Co-authored-by: Stanisław Szymczyk <[email protected]>
fairydreaming and sszymczy authored Jun 25, 2024
1 parent e6bf007 commit 6fcbf68
Showing 4 changed files with 587 additions and 39 deletions.