Commit

Release v1.1.1
belladoreai committed Jun 24, 2023
1 parent 9d97ab5 commit cb34d22
Showing 2 changed files with 4 additions and 3 deletions.
5 changes: 3 additions & 2 deletions README.md
@@ -6,7 +6,8 @@ Intended use case is calculating token count accurately on the client-side.

<a href="https://belladoreai.github.io/llama-tokenizer-js/example-demo/build/">Click here for demo</a>

-Features:
+## Features
+
- Easy to use: 0 dependencies, code and data baked into a single file.
- Compatible with most LLaMA-based models (see [Compatibility](#compatibility))
- Optimized running time: tokenize a sentence in roughly 1ms, or 2000 tokens in roughly 20ms.
Expand Down Expand Up @@ -52,7 +53,7 @@ llamaTokenizer.decode([1, 15043, 3186, 29991])
> 'Hello world!'
```

-Special use case: decode only selected individual tokens, without including beginning of prompt token and preceeding space:
+Note that special "beginning of sentence" token and preceding space are added by default when encoded (and correspondingly expected when decoding). These affect token count. There may be some use cases where you don't want to add these. You can pass additional boolean parameters in these use cases. For example, if you want to decode an individual token:

```
llamaTokenizer.decode([3186], false, false)
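
The README paragraph added in this hunk describes two boolean parameters that default to on. As a rough illustration of how such flags could change the decoded text, here is a toy sketch. It does not use llama-tokenizer-js: `toyDecode`, `toyVocab`, and the parameter names are hypothetical, and the four id-to-piece mappings are inferred from the `'Hello world!'` example above rather than read from the real vocabulary. Whether the library handles each case identically is not confirmed by this diff.

```javascript
// Toy stand-in, NOT the library: four vocabulary entries inferred from the README example.
const toyVocab = new Map([
  [1, "<s>"],        // beginning-of-sentence marker
  [15043, "▁Hello"], // "▁" marks a leading space, SentencePiece style
  [3186, "▁world"],
  [29991, "!"],
]);

function toyDecode(tokenIds, addBosToken = true, addPrecedingSpace = true) {
  // Assumption: with addBosToken set, the first id is the BOS token and is skipped.
  const ids = addBosToken ? tokenIds.slice(1) : tokenIds;
  const text = ids
    .map((id) => {
      if (!toyVocab.has(id)) throw new Error(`unknown toy token id: ${id}`);
      return toyVocab.get(id);
    })
    .join("")
    .replaceAll("▁", " ");
  // Assumption: with addPrecedingSpace set, the first character is the added space and is dropped.
  return addPrecedingSpace ? text.slice(1) : text;
}

console.log(JSON.stringify(toyDecode([1, 15043, 3186, 29991]))); // "Hello world!"
console.log(JSON.stringify(toyDecode([3186], false, false)));    // " world"
console.log(JSON.stringify(toyDecode([3186])));                  // ""
```

The last call shows why the extra parameters matter in this sketch: with the defaults left on, a lone token is treated as the beginning-of-sentence token and discarded, so nothing is decoded.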
2 changes: 1 addition & 1 deletion package.json
@@ -1,6 +1,6 @@
{
"name": "llama-tokenizer-js",
-"version": "1.1.0",
+"version": "1.1.1",
"description": "JS tokenizer for LLaMA-based LLMs",
"main": "llama-tokenizer.js",
"scripts": {