Cannot return a value from models.llamacpp
#1281
Here is a solution to the problem in the form of a PR, which is undocumented. I found this with the help of Sonnet 3.5. Could somebody please reply to this and explain why the provided examples in Outlines work for you? Thanks.

Title: Fix llama.cpp cleanup and generator usage pattern

Description

This PR addresses two issues: the llama.cpp cleanup error and the generator usage pattern.
Changes

Changed from the problematic pattern in ex5.py:

```python
# DON'T DO THIS - Will cause cleanup issues
m = models.llamacpp(
    repo_id="M4-ai/TinyMistral-248M-v2-Instruct-GGUF",
    filename="TinyMistral-248M-v2-Instruct.Q4_K_M.gguf",
)
```

To the proper cleanup pattern in ex6.py:

```python
import os
from outlines import models, generate
from dotenv import load_dotenv
import torch

load_dotenv()
hf_token = os.getenv("HF_API_TOKEN")
if not hf_token:
    raise ValueError("Hugging Face API token not found. Please set HF_API_TOKEN in .env file.")


def create_model_and_generate(prompt: str):
    # Create model directly
    model = models.llamacpp(
        repo_id="M4-ai/TinyMistral-248M-v2-Instruct-GGUF",
        filename="TinyMistral-248M-v2-Instruct.Q4_K_M.gguf",
    )
    try:
        # Create generator without parameters
        generator = generate.text(model)
        # Generate response with parameters at call time
        response = generator(
            prompt,
            max_tokens=100,
            temperature=0.7,
        )
        return response
    finally:
        # Ensure model cleanup
        del model


# Use the function
prompt = "Write a short story about a cat."
result = create_model_and_generate(prompt)
print(result)
```

Key Improvements
Testing

The code has been tested and no longer produces the NoneType error during cleanup.

Documentation Updates

This pattern follows the Outlines documentation for structured generation (1), which shows that parameters should be passed during generation, not during generator creation.

Fixes #[issue_number]
My goal is to use the Outlines library on my local machine WITHOUT accessing the internet. So far, that is proving quite difficult. Could you please provide a working example with Llama.cpp using regex, for example? Or with

Here is the full code:
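(The code the commenter attached did not survive extraction.) As an illustration of the kind of example being requested, here is a minimal, hedged sketch assuming the same pre-1.0 models/generate API used elsewhere in this thread, and assuming the GGUF file is already present in the local Hugging Face cache so no network access is needed:

```python
import os

# Assumption: the GGUF file is already cached locally. HF_HUB_OFFLINE tells
# huggingface_hub to stay off the network; set it before the hub library is
# imported, hence before importing outlines.
os.environ["HF_HUB_OFFLINE"] = "1"

from outlines import models, generate

model = models.llamacpp(
    repo_id="M4-ai/TinyMistral-248M-v2-Instruct-GGUF",
    filename="TinyMistral-248M-v2-Instruct.Q4_K_M.gguf",
)

# Constrain output to an arbitrary illustrative pattern (a phone-number-like string).
generator = generate.regex(model, r"\(\d{3}\) \d{3}-\d{4}")
print(generator("Give me a phone number: ", max_tokens=20))
```

Depending on the installed version, it may also be possible to construct a llama_cpp.Llama from a local model_path and wrap it for Outlines, which avoids the Hugging Face cache entirely.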
Describe the issue as clearly as possible:
Consider the following code:
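(The reproduction snippet did not survive extraction. Based on the description below — three ways of calling models.llamacpp, of which only the third triggers the error — it presumably looked roughly like this reconstruction, not necessarily the author's exact code:)

```python
from outlines import models

# Case 1: call without using the return value -> no error reported
models.llamacpp(
    repo_id="M4-ai/TinyMistral-248M-v2-Instruct-GGUF",
    filename="TinyMistral-248M-v2-Instruct.Q4_K_M.gguf",
)

# Case 2: print the return value -> no error reported
print(models.llamacpp(
    repo_id="M4-ai/TinyMistral-248M-v2-Instruct-GGUF",
    filename="TinyMistral-248M-v2-Instruct.Q4_K_M.gguf",
))

# Case 3: assign the return value -> error during cleanup
model = models.llamacpp(
    repo_id="M4-ai/TinyMistral-248M-v2-Instruct-GGUF",
    filename="TinyMistral-248M-v2-Instruct.Q4_K_M.gguf",
)
```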
An error is produced when assigning the return value of models.llamacpp to the variable model (the third case above). The error does not occur when the return value is not assigned, or when it is printed. Is this expected behavior? If it is, it is not documented. Furthermore, the test suite (test_integration_llamacpp.py) does not appear to test for this case. How do I proceed?

Suggestion: in the documented examples, could you state the version of Outlines each example is meant to work with? Breaking changes occur on a regular basis, which might prevent examples from being reproduced easily. Thanks.
Steps/code to reproduce the bug:
Expected result:
Error message:
Outlines/Python version information:
Version information
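(The version output did not survive extraction.) As a sketch, one way to collect the requested information — not the reporter's actual output:

```python
import sys
from importlib.metadata import version

print("Python:", sys.version)
print("outlines:", version("outlines"))
print("llama-cpp-python:", version("llama-cpp-python"))
```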
Context for the issue:
I cannot reproduce the documented examples, or some of the testing modules.