The latest version kills python kernel with LlamaGrammar #1623

Closed · 4 tasks done
yamikumo-DSD opened this issue Jul 25, 2024 · 8 comments · Fixed by #1649
Labels
bug Something isn't working

Comments

@yamikumo-DSD commented Jul 25, 2024

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior and Current Behavior

The latest version of llama-cpp-python kills python kernel with LlamaGrammar.
I ran the following code:

from llama_cpp import Llama, LlamaGrammar
model = Llama(model_path="ggufs/Meta-Llama-3-8B-Instruct.Q5_K_M.gguf", verbose=False) # Model doesn't matter.
grammar = LlamaGrammar.from_string('root ::= "a"+')
model("hello", max_tokens=10, grammar=grammar)

When I ran it, the Python kernel died immediately for no apparent reason. The kernel does not die when LlamaGrammar is not used.
Because this behavior did not occur until recently (it still worked a few days ago), I suspect my recent update of the llama-cpp-python module introduced the problem.

What I tried:

  1. Build the latest code myself -> kernel died
  2. Build the code from the latest release myself -> kernel died
  3. Re-install the wheel of the latest release -> no problem

My experiment suggests the problem may come from the llama.cpp backend rather than being llama-cpp-python's fault.
In any case, I'd like to know whether other people are experiencing this bug.

Environment

OS: macOS Sonoma
Processor: Apple M2 Max, 64 GB RAM
Python version: 3.11

@handshape

I'm experiencing something similar on x86_64 with CUDA since 0.2.84 - segfault when calling LlamaGrammar through the JSON-schema binding. No trouble without the grammar constraint. Rolling back to 0.2.83 fixes it.
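
(For anyone trying to reproduce that path: the JSON-schema binding ultimately builds a LlamaGrammar, so the crash is essentially the same as the OP's repro. A minimal sketch assuming LlamaGrammar.from_json_schema, with a placeholder model path and schema that are not from the original report:)

import json
from llama_cpp import Llama, LlamaGrammar

# Placeholder schema; any JSON schema routed through the grammar path should do.
schema = {"type": "object", "properties": {"answer": {"type": "string"}}}
grammar = LlamaGrammar.from_json_schema(json.dumps(schema))

model = Llama(model_path="model.gguf", verbose=False)  # placeholder path
# Segfaults on 0.2.84; runs fine after rolling back to 0.2.83.
model("hello", max_tokens=10, grammar=grammar)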

@yamikumo-DSD (Author) commented Jul 29, 2024

> I'm experiencing something similar on x86_64 with CUDA since 0.2.84 - segfault when calling LlamaGrammar through the JSON-schema binding. No trouble without the grammar constraint. Rolling back to 0.2.83 fixes it.

Which do you think is to blame, llama-cpp-python or native llama.cpp?
Oddly, I see no issue about this on llama.cpp's issue page, despite the growing demand for function calling. (So is it environment dependent? But we both have the problem, on CUDA and Metal respectively.)
The versions of these modules really matter, with new models popping up every week, and I don't want to be stuck on old versions.

@axel7083

> I ran the following code:

I can reproduce the same error: after running the code you provided, the program stops.

The scenario where the problem occurs is described in detail in #1636 (along with the hardware/software specifications).

Rolling back to 0.2.82 seems to fix the problem: the provided code executes without a crash on the previous version.

@yamikumo-DSD (Author)

> I can reproduce the same error: after running the code you provided, the program stops.
>
> The scenario where the problem occurs is described in detail in #1636 (along with the hardware/software specifications).
>
> Rolling back to 0.2.82 seems to fix the problem: the provided code executes without a crash on the previous version.

I found that recent llama.cpp split the main source file (llama.cpp) into several parts:

llama-grammar.cpp
llama-sampling.cpp
llama-vocab.cpp

These source files are present in the latest release but cannot be found in releases from a week ago.
Though I'm not familiar with the binding method between C++ and Python, I suspect some internal change in llama.cpp causes a mismatch between the C++-implemented binaries and the Python side.
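
(For context on why such a mismatch would kill the process outright rather than raise an exception: llama-cpp-python declares llama.cpp's C API through ctypes. An illustrative sketch, assuming a libllama shared library is present; the declaration shown is a simplified placeholder, not the project's actual binding code:)

import ctypes

# Load the compiled llama.cpp shared library (name is platform dependent).
lib = ctypes.CDLL("libllama.dylib")

# The bindings declare each C function's signature on the Python side,
# roughly like this (placeholder declaration for illustration):
lib.llama_n_ctx.restype = ctypes.c_uint32
lib.llama_n_ctx.argtypes = [ctypes.c_void_p]

# If an upstream refactor changes a signature or struct layout while the
# Python-side declaration stays the same, calls still go through with a
# mismatched ABI; with pointers involved that corrupts memory and kills
# the whole process (segfault) instead of raising a Python exception.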

@handshape

Looks like work is underway: #1637

@abetlen added the bug label on Aug 1, 2024
@ProfessorAO

Has anyone found a solution for this?

@ProfessorAO

Getting the same error, both in Colab and on my personal machine. In VS Code it shows up as an OSError or WinError.

@handshape

> Getting the same error, both in Colab and on my personal machine. In VS Code it shows up as an OSError or WinError.

Pretty sure this PR addresses it, pending the next release: #1649
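
(Until that release lands, a quick way to confirm which version you're actually running, and whether you're still on an affected build:)

import llama_cpp

# The crash was reported starting with 0.2.84; 0.2.83 and earlier are fine.
print(llama_cpp.__version__)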
