Commit

Fix trailing ws
mathijshenquet committed Aug 22, 2024
1 parent 0c5baa1 commit 0d198bb
Showing 3 changed files with 5 additions and 5 deletions.
2 changes: 1 addition & 1 deletion examples/server/README.md
@@ -533,7 +533,7 @@ With input 'á' (utf8 hex: C3 A1) on tinyllama/stories260k
```json
{
"tokens": [
-{"id": 198, "piece": [195]}, // hex C3
+{"id": 198, "piece": [195]}, // hex C3
{"id": 164, "piece": [161]} // hex A1
]
}
```
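The `piece` arrays in the README example hold raw byte values rather than text, because a single token may cover only part of a multi-byte UTF-8 character. As a minimal sketch of where 195 and 161 come from, the helper below lists the bytes of a string as unsigned integers. The name `utf8_byte_values` is hypothetical and is not part of the llama.cpp server code.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (an assumption for illustration, not llama.cpp API):
// returns every byte of `s` as an unsigned value in the range 0..255.
std::vector<int> utf8_byte_values(const std::string & s) {
    std::vector<int> out;
    out.reserve(s.size());
    // Iterate as unsigned char so bytes >= 0x80 do not come out negative.
    for (unsigned char c : s) {
        out.push_back(static_cast<int>(c));
    }
    return out;
}
```

Called on the two-byte string C3 A1 (the UTF-8 encoding of 'á'), this returns 195 followed by 161, matching the two `piece` entries above.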
6 changes: 3 additions & 3 deletions examples/server/tests/features/server.feature
@@ -104,15 +104,15 @@ Feature: llama.cpp server
Then tokens begin with BOS
Given first token is removed
Then tokens can be detokenized

Scenario: Tokenize with pieces
When tokenizing with pieces:
"""
-What is the capital of Germany?
+What is the capital of Germany?
"""
Then tokens are given with pieces

Scenario: Models available
Given available models
Then 1 models are supported
2 changes: 1 addition & 1 deletion examples/server/utils.hpp
@@ -603,7 +603,7 @@ static bool is_valid_utf8(const std::string & str) {
bytes += 3;
} else if ((*bytes & 0xF8) == 0xF0) {
// 4-byte sequence (11110xxx 10xxxxxx 10xxxxxx 10xxxxxx)
-if (end - bytes < 4 || (bytes[1] & 0xC0) != 0x80 ||
+if (end - bytes < 4 || (bytes[1] & 0xC0) != 0x80 ||
(bytes[2] & 0xC0) != 0x80 || (bytes[3] & 0xC0) != 0x80)
return false;
bytes += 4;
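
The hunk above shows only the 4-byte branch of `is_valid_utf8`. For context, here is a self-contained sketch of the same lead-byte and continuation-byte technique covering all four sequence lengths. It is an illustration under assumptions, not the code in `utils.hpp`: the name `looks_like_utf8` is hypothetical, and the 1-, 2-, and 3-byte branches are written by analogy with the visible 4-byte branch. Like that branch, it checks structure only and does not reject overlong encodings or surrogate code points.

```cpp
#include <string>

// Hypothetical structural UTF-8 check (an assumption for illustration).
// Verifies that every lead byte is followed by the right number of
// continuation bytes of the form 10xxxxxx.
bool looks_like_utf8(const std::string & str) {
    const unsigned char * bytes = reinterpret_cast<const unsigned char *>(str.data());
    const unsigned char * end   = bytes + str.size();

    while (bytes < end) {
        int len;
        if (*bytes <= 0x7F) {
            len = 1;                          // 0xxxxxxx
        } else if ((*bytes & 0xE0) == 0xC0) {
            len = 2;                          // 110xxxxx 10xxxxxx
        } else if ((*bytes & 0xF0) == 0xE0) {
            len = 3;                          // 1110xxxx 10xxxxxx 10xxxxxx
        } else if ((*bytes & 0xF8) == 0xF0) {
            len = 4;                          // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        } else {
            return false;                     // stray continuation or invalid lead byte
        }
        if (end - bytes < len) {
            return false;                     // sequence truncated at end of input
        }
        for (int i = 1; i < len; ++i) {
            if ((bytes[i] & 0xC0) != 0x80) {
                return false;                 // expected a continuation byte
            }
        }
        bytes += len;
    }
    return true;
}
```

With this sketch, the complete sequence C3 A1 passes, while the lone byte C3 fails because its continuation byte is missing. That is the situation the README example illustrates: each of the two tokens carries half of 'á', so neither piece is valid UTF-8 on its own.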