Bug: Why doesn't llamafile remove end tokens like <|eot_id|> or <end_of_turn>? #630

Open · jeezrick opened this issue Nov 15, 2024 · 2 comments

@jeezrick

Contact Details

jeezricklovelife@gmail.com

What happened?

I'm using llamafile through a Python API. Both models I tested retain the end token in the response string, and I have to remove it manually. Is this my mistake? For example:

        if self.model_string == "LLaMA_CPP":  # why doesn't llamafile remove the end token?
            self.response_str = self.response_str.replace("<|eot_id|>", "")
        if self.model_string == "gemma-2b-it":
            self.response_str = self.response_str.replace("<end_of_turn>", "")
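
A more general version of this workaround is sketched below. The helper name strip_end_token and the END_TOKENS list are my own, not part of llamafile; the idea is just to strip whichever known end-of-turn token appears at the tail of a response:

    # Known end-of-turn tokens from common chat templates (Llama 3, Gemma, ChatML).
    # This list is an assumption; extend it for other models as needed.
    END_TOKENS = ("<|eot_id|>", "<end_of_turn>", "<|im_end|>", "</s>")

    def strip_end_token(text: str) -> str:
        """Remove one trailing end-of-turn token, plus surrounding whitespace."""
        text = text.rstrip()
        for token in END_TOKENS:
            if text.endswith(token):
                return text[: -len(token)].rstrip()
        return text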

Version

llamafile v0.8.4

What operating system are you seeing the problem on?

Linux

Relevant log output

model_gemma("I have a head of broccoli, and a cabbage. How many fruits do I have?")

output:

'You have **zero** fruits! 🥦 🥬 \n\nBroccoli and cabbage are both vegetables, not fruits. \n<end_of_turn>'
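
Applying the hypothetical strip_end_token helper sketched above would drop the trailing token from that output:

    >>> strip_end_token('You have **zero** fruits! 🥦 🥬 \n\nBroccoli and cabbage are both vegetables, not fruits. \n<end_of_turn>')
    'You have **zero** fruits! 🥦 🥬 \n\nBroccoli and cabbage are both vegetables, not fruits.'
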
@yusufsyaifudin

I also encountered this issue and ended up going back to Ollama (I wanted to use llamafile because it has a tokenize/detokenize API).

Here are the reproducible steps:

ollama pull llama3.2:3b-instruct-q5_K_M
./llamafile -m /Users/username/.ollama/models/blobs/sha256-05fc42664a9311c427413f9bf2077bd5ee7d59d6a5a034d54fc738f93976d065 --server --nobrowser

Then call the chat completions API:

curl --location 'http://127.0.0.1:8080/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "stream": false,
  "messages": [
    {
        "role": "system",
        "content": "You'\''re a helpful assistant!"
    },
    {
        "role": "user",
        "content": "Why sky is blue?"
    }
  ],
  "temperature": 0.1,
  "cache_prompt": true
}'

This returns:

{
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "The sky appears blue to us because of a phenomenon called Rayleigh scattering. Here's a simplified explanation:\n\n1. **Sunlight and its components**: When sunlight enters Earth's atmosphere, it's made up of different colors, which are a result of the different wavelengths of light. These colors include red, orange, yellow, green, blue, indigo, and violet.\n2. **Scattering by tiny molecules**: The atmosphere is filled with tiny molecules of gases like nitrogen (N2) and oxygen (O2). When sunlight hits these molecules, it scatters in all directions.\n3. **Shorter wavelengths scatter more**: The smaller wavelengths of light, like blue and violet, are scattered more than the longer wavelengths, like red and orange. This is because the smaller molecules are more effective at scattering the shorter wavelengths.\n4. **Our eyes perceive the scattered light**: As the scattered light reaches our eyes, we see the sky as blue because our eyes are most sensitive to the blue and violet wavelengths. The scattered light is more intense in the blue and violet parts of the spectrum, making the sky appear blue to us.\n5. **The blue color we see is a result of the scattering**: The blue color we see is not actually the color of the light itself, but rather the result of the scattering of sunlight by the tiny molecules in the atmosphere.\n\nThis is why the sky appears blue during the daytime, especially in the direction of the sun. At sunrise and sunset, the light has to travel through more of the atmosphere, which scatters the shorter wavelengths even more, making the sky appear more red or orange.\n\nI hope that helps you understand why the sky is blue!<|eot_id|>",
                "role": "assistant"
            }
        }
    ],
    "created": 1732441733,
    "id": "chatcmpl-PeSIXi0WMsbggQ4INjtfqtkHXY1qq8cD",
    "model": "unknown",
    "object": "chat.completion",
    "usage": {
        "completion_tokens": 340,
        "prompt_tokens": 26,
        "total_tokens": 366
    }
}
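
For reference, here is the same request from Python, as a sketch using the third-party requests library, with the endpoint and payload copied from the curl call above:

    import requests

    # Same request as the curl command above, against llamafile's
    # OpenAI-compatible /v1/chat/completions endpoint.
    resp = requests.post(
        "http://127.0.0.1:8080/v1/chat/completions",
        json={
            "stream": False,
            "messages": [
                {"role": "system", "content": "You're a helpful assistant!"},
                {"role": "user", "content": "Why sky is blue?"},
            ],
            "temperature": 0.1,
            "cache_prompt": True,
        },
    )
    content = resp.json()["choices"][0]["message"]["content"]
    print(content.endswith("<|eot_id|>"))  # True here, matching the response above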

[PS] 05fc42664a9311c427413f9bf2077bd5ee7d59d6a5a034d54fc738f93976d065 is the digest of the llama3.2:3b-instruct-q5_K_M model.

I am using llamafile 0.8.16.

@pawel665j commented Nov 24, 2024 via email
