I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
Get a response from model
Current Behavior
llama_get_logits_ith: invalid logits id -1, reason: no logits
[1] 58786 segmentation fault python ragPhiGguf.py --model_path Phi-3.5-mini-instruct-Q4_K_M.gguf --query
/opt/homebrew/Cellar/[email protected]/3.13.0_1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/multiprocessing/resource_tracker.py:276: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown: {'/loky-58786-dm8v3d6t'}
warnings.warn(
Environment and Context
Physical (or virtual) hardware you are using, e.g. for Linux:
Apple M2 Pro (arm64)
Operating System, e.g. for Linux:
macOS
SDK version, e.g. for Linux:
Python 3.13.0
GNU Make 3.81
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
This program built for i386-apple-darwin11.3.0
g++ --version
Apple clang version 16.0.0 (clang-1600.0.26.4)
Target: arm64-apple-darwin24.1.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
Failure Information (for bugs)
segmentation fault
Steps to Reproduce
I was trying to build a RAG script using Phi-3.5-mini-instruct-Q4_K_M.gguf, and I am using the default chat template:
Available chat formats from metadata: chat_template.default
Using gguf chat template: {% for message in messages %}{% if message['role'] == 'system' and message['content'] %}{{'<|system|>
' + message['content'] + '<|end|>
'}}{% elif message['role'] == 'user' %}{{'<|user|>
' + message['content'] + '<|end|>
'}}{% elif message['role'] == 'assistant' %}{{'<|assistant|>
' + message['content'] + '<|end|>
'}}{% endif %}{% endfor %}{% if add_generation_prompt %}{{ '<|assistant|>
' }}{% else %}{{ eos_token }}{% endif %}
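For reference, the template above can be rendered on its own with jinja2 to inspect the exact prompt string the model receives (a minimal sketch; the message contents here are illustrative, not from my script):

```python
from jinja2 import Template

# The Phi-3.5 chat template from the GGUF metadata, inlined verbatim.
PHI_TEMPLATE = (
    "{% for message in messages %}"
    "{% if message['role'] == 'system' and message['content'] %}"
    "{{'<|system|>\n' + message['content'] + '<|end|>\n'}}"
    "{% elif message['role'] == 'user' %}"
    "{{'<|user|>\n' + message['content'] + '<|end|>\n'}}"
    "{% elif message['role'] == 'assistant' %}"
    "{{'<|assistant|>\n' + message['content'] + '<|end|>\n'}}"
    "{% endif %}{% endfor %}"
    "{% if add_generation_prompt %}{{ '<|assistant|>\n' }}"
    "{% else %}{{ eos_token }}{% endif %}"
)

messages = [
    {"role": "system", "content": "Context: ..."},
    {"role": "user", "content": "What is RAG?"},
]
prompt = Template(PHI_TEMPLATE).render(
    messages=messages, add_generation_prompt=True, eos_token="<|endoftext|>"
)
print(prompt)
```

With `add_generation_prompt=True` the rendered string ends in `<|assistant|>\n`, which is what cues the model to generate.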
def generate_response(
    self, query: str, context_docs: List[Dict], max_tokens: int = 1000
) -> str:
    """
    Generate a response using retrieved documents as context, formatted
    with the GGUF chat template.

    Args:
        query (str): User query
        context_docs (List[Dict]): Retrieved context documents
        max_tokens (int): Maximum tokens to generate

    Returns:
        Generated response
    """
    # Construct context from documents
    context_texts = []
    for doc in context_docs:
        context_texts.append(
            f"Article ID: {doc.get('article_id', 'N/A')}\n{doc.get('text', '')}"
        )
    context = "\n\n".join(context_texts)
    # Ensure context fits within the context window
    context = self.truncate_text(context)
    # Construct GGUF chat template prompt
    prompt = (
        "{% for message in messages %}"
        "{% if message['role'] == 'system' and message['content'] %}"
        "{{'<|system|>\n' + message['content'] + '<|end|>\n'}}"
        "{% elif message['role'] == 'user' %}"
        "{{'<|user|>\n' + message['content'] + '<|end|>\n'}}"
        "{% elif message['role'] == 'assistant' %}"
        "{{'<|assistant|>\n' + message['content'] + '<|end|>\n'}}"
        "{% endif %}{% endfor %}"
        "{% if add_generation_prompt %}"
        "{{ '<|assistant|>\n' }}"
        "{% else %}{{ eos_token }}{% endif %}"
    )
    template = Template(prompt)
    # Render the template with the actual chat history
    messages = [
        {"role": "system", "content": f"Context:\n{context}"},
        {"role": "user", "content": query},
    ]
    formatted_prompt = template.render(
        messages=messages, add_generation_prompt=True, eos_token="<|endoftext|>"
    )
    # Generate response
    try:
        response = self.llm(
            formatted_prompt,
            max_tokens=max_tokens,
            stop=["<|end|>"],
            echo=False,
        )
        return response["choices"][0]["text"].strip()
    except Exception as e:
        self.logger.error(f"Response generation error: {e}")
        return f"Error generating response: {str(e)}"
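One thing I considered: `llama_get_logits_ith: invalid logits id -1` can show up when the rendered prompt fills the entire context window, leaving no room for generation. A defensive length check before calling the model might look like this (a hedged sketch; `tokenize` and `n_ctx` stand in for the model's real tokenizer and context size, shown here with a toy whitespace tokenizer so the snippet is self-contained):

```python
def check_prompt_fits(tokenize, n_ctx: int, prompt: str, max_tokens: int) -> bool:
    """Return True if the tokenized prompt plus the requested
    generation budget fits inside the model's context window."""
    n_prompt = len(tokenize(prompt))
    return n_prompt + max_tokens <= n_ctx

# Illustrative stand-in tokenizer: whitespace split. A real script would
# use the model's own tokenizer instead.
toy_tokenize = str.split

print(check_prompt_fits(toy_tokenize, n_ctx=8, prompt="a b c d", max_tokens=2))        # fits
print(check_prompt_fits(toy_tokenize, n_ctx=8, prompt="a b c d e f g", max_tokens=2))  # does not fit
```

If the check fails, the context passed to `truncate_text` needs to be shortened further (or `n_ctx` raised when loading the model) rather than calling the model and risking the crash.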