Yes, this is a known limitation of the approach taken by LM Format Enforcer. I will look into how the outlines PR works and see if we can adapt its approach. If anyone wants to take a crack at it, they are more than welcome :)
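The rough idea, as I understand it from the outlines discussion, would be to match candidate tokens against the remaining UTF-8 bytes of the allowed string rather than against decoded text, so a token that ends mid-character is still accepted. A minimal sketch of that direction (names are illustrative, not our actual internals):

```python
# Hypothetical sketch, not LM Format Enforcer's actual code or API: filter
# the vocabulary at the byte level, so tokens that cover only part of a
# multi-byte character (such as an emoji) remain legal during decoding.
def allowed_next_tokens(target: str,
                        generated: list[int],
                        token_bytes: dict[int, bytes]) -> list[int]:
    """token_bytes maps each token id to the raw UTF-8 bytes it emits."""
    emitted = b"".join(token_bytes[t] for t in generated)
    remaining = target.encode("utf-8")[len(emitted):]
    # A token stays allowed if its bytes are a non-empty prefix of what is
    # left of the target, even when they end mid-character.
    return [tid for tid, bs in token_bytes.items()
            if bs and remaining.startswith(bs)]
```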
On Thu, Jun 27, 2024 at 3:56 AM milesial wrote:
Hi, using version 0.10.3 and the llama3 tokenizer with vLLM, I can't seem to constrain generation to emojis:
```
curl --request POST \
  --url http://localhost:8000/v1/chat/completions \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "meta-llama/Meta-Llama-3-8B-Instruct",
    "messages": [
      {
        "content": "",
        "role": "user"
      }
    ],
    "guided_decoding_backend": "lm-format-enforcer",
    "guided_choice": ["🐈"],
    "temperature": 0.0,
    "top_p": 0.7,
    "max_tokens": 100,
    "stream": false
  }'
```

This fails with:

```
[ERROR] Unknown LMFormatEnforcer Problem. Prefix: ''
```
Even though the tokenizer supports it:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
tok.encode("🐈")
# [128000, 9468, 238, 230]
```
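I think what's happening is that each of those ids (after the 128000 BOS token) carries only a fragment of the emoji's four UTF-8 bytes, so no single token decodes to a complete character. A quick check (decoding each id alone should print replacement characters):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
for tid in tok.encode("🐈", add_special_tokens=False):  # [9468, 238, 230]
    # Each id maps to a slice of the emoji's UTF-8 bytes, so decoding it
    # in isolation yields an invalid partial sequence ("�").
    print(tid, repr(tok.decode([tid])))
```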
It might be related to multi-token characters; outlines had to deal with similar issues: dottxt-ai/outlines#738