llama_get_logits_ith: invalid logits id -1 error when embedding=True
Expected Behavior
When using llama-cpp-python with a Qwen2 model, chat completion should work normally regardless of whether the embedding parameter is enabled.
Current Behavior
The model works fine when `embedding=False`, but throws the error `llama_get_logits_ith: invalid logits id -1, reason: no logits` when `embedding=True`.
Working Code Example
```python
from llama_cpp import Llama

# This works fine
llm = Llama(
    model_path="./models/qwen2-0_5b-instruct-q8_0.gguf",
    chat_format="chatml",
    verbose=False
)

messages = [
    {"role": "system", "content": "Summarize this text for me: You are an assistant who creates short stories."},
    {"role": "user", "content": "Long ago, in a peaceful village, a little girl named Leah loved watching the stars at night..."}
]

response = llm.create_chat_completion(messages=messages)

# Works successfully:
'''{'id': 'chatcmpl-17ca45ef-d13b-425a-96be-7631e3b9a7f4', 'object': 'chat.completion', 'created': 1730125699, 'model': './models/qwen2-0_5b-instruct-q8_0.gguf', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': 'This text is a short story about a little girl named Leah who loves watching the stars at night. One day, she noticed a particularly bright star that seemed to wink at her, and she made a wish to become friends with the star. This star spirit helped Leah take her on a magical adventure among the stars, and she visited countless constellations and stardust rivers.'}, 'logprobs': None, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 145, 'completion_tokens': 76, 'total_tokens': 221}}'''
```
Error Reproduction
```python
from llama_cpp import Llama

# This causes an error
llm = Llama(
    model_path="./models/qwen2-0_5b-instruct-q8_0.gguf",
    chat_format="chatml",
    verbose=False,
    embedding=True  # Only difference is enabling embedding
)

messages = [
    {"role": "system", "content": "Summarize this text for me: You are an assistant who creates short stories."},
    {"role": "user", "content": "Long ago, in a peaceful village, a little girl named Leah loved watching the stars at night..."}
]

llm.create_chat_completion(messages=messages)
# Error: llama_get_logits_ith: invalid logits id -1, reason: no logits

embeddings = llm.create_embedding("Hello, world!")
# This works normally:
'''{'object': 'list', 'data': [{'object': 'embedding', 'embedding': [[0.9160200953483582, 5.090432167053223, 1.487088680267334, ......'''
```
Environment Info
Python version: 3.10
llama-cpp-python version: latest
Model: Qwen2-0.5B-Chat (GGUF format)
Steps to Reproduce
1. Install llama-cpp-python
2. Download the Qwen2-0.5B-Chat GGUF model
3. Run the error reproduction code above with `embedding=True`
Additional Context
The error only occurs when:
- The `embedding` parameter is set to `True`
- The chat completion functionality is used
The model works fine for chat completion when embedding=False, suggesting this might be related to how the embedding functionality is implemented for this specific model.
I was getting this same error with a Qwen2.5-14b finetune and spent a few hours searching for the answer. It became obvious to me that this was a regression in the llama-cpp codebase, and it may have been addressed recently. Not sure if llama-cpp-python has received upstream patches yet or not, but this may be fixed in the future.
For now, I've resorted to using an embedding-specific model with SentenceTransformer, but I'd ideally love to get embeddings and generations from the same model to save on memory.
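Another workaround that stays within llama-cpp-python is to load the same GGUF file twice: one instance with `embedding=False` for chat completions and one with `embedding=True` for embeddings, so the generation path keeps its logits. This is only a sketch (untested here, reusing the model path from the report), and it does not solve the memory problem, since both instances hold their own copy of the weights:

```python
from llama_cpp import Llama

# Hypothetical two-instance workaround: the chat instance never enables
# embedding mode, so create_chat_completion keeps working; the embedding
# instance is used only for create_embedding. Note this doubles memory use.
MODEL_PATH = "./models/qwen2-0_5b-instruct-q8_0.gguf"

chat_llm = Llama(model_path=MODEL_PATH, chat_format="chatml", verbose=False)
embed_llm = Llama(model_path=MODEL_PATH, embedding=True, verbose=False)

# Chat completion works because embedding mode is off for this instance
response = chat_llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}]
)

# Embeddings come from the dedicated embedding instance
vector = embed_llm.create_embedding("Hello, world!")
```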