Commit

convert_hf_to_gguf: rwkv tokenizer: Don't escape sequences manually
Signed-off-by: Molly Sophia <[email protected]>
MollySophia committed Aug 12, 2024
1 parent bcf29ef commit 9ba8fb6
Showing 1 changed file with 1 addition and 3 deletions.
4 changes: 1 addition & 3 deletions convert_hf_to_gguf.py
@@ -2723,9 +2723,7 @@ def set_vocab(self):
  token = token.encode("utf-8") if isinstance(token, str) else token
  assert isinstance(token, bytes)
  assert len(token) == token_len
- token_text: str = ""
- for b in token:
-     token_text += f"\\x{b:02x}"
+ token_text: str = str(token)[2:-1]
  tokens.append(token_text.encode("utf-8"))
  toktypes.append(gguf.TokenType.NORMAL)
  remainder = vocab_size - len(tokens)
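
To make the effect of this change concrete, here is a minimal sketch that compares the two ways of building `token_text`. The helper names `escape_manually` and `escape_with_repr` are hypothetical and exist only for this illustration; they are not part of `convert_hf_to_gguf.py`. The removed loop wrote every byte as a `\xNN` escape, while `str(token)[2:-1]` takes Python's `bytes` repr and strips the leading `b'` and the trailing quote, so printable ASCII bytes stay readable and only the remaining bytes are escaped.

```python
def escape_manually(token: bytes) -> str:
    """Old behavior: write every byte as a \\xNN escape."""
    token_text = ""
    for b in token:
        token_text += f"\\x{b:02x}"
    return token_text


def escape_with_repr(token: bytes) -> str:
    """New behavior: use Python's bytes repr, minus the b'...' wrapper."""
    return str(token)[2:-1]


if __name__ == "__main__":
    # Sample tokens chosen for illustration only: plain ASCII,
    # a multi-byte UTF-8 sequence, and ASCII with a control character.
    for sample in (b"abc", b"\xe4\xbd\xa0", b"a\n"):
        print(f"{sample!r}: manual={escape_manually(sample)} repr={escape_with_repr(sample)}")
```

The two helpers agree on non-ASCII bytes but differ on printable ASCII and on bytes that have a short escape such as `\n`, so this sketch suggests the change alters the stored token text rather than only simplifying the code.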