Bug: Unable to load grammar from json.gbnf example #7991

Closed
vecorro opened this issue Jun 18, 2024 · 10 comments
Labels
bug-unconfirmed, medium severity (used to report medium severity bugs in llama.cpp, e.g. malfunctioning features that are still usable), stale

Comments

@vecorro

vecorro commented Jun 18, 2024

What happened?

I have tried to load the json.gbnf grammar example but haven't been able to do so. The following code is not working.

from llama_cpp.llama import Llama, LlamaGrammar
import httpx
grammar_text = httpx.get("https://raw.githubusercontent.com/ggerganov/llama.cpp/master/grammars/json.gbnf").text
grammar = LlamaGrammar.from_string(grammar_text)

This throws the following error:

ValueError: from_string: error parsing grammar file: parsed_grammar.rules is empty

I'm not sure if the problem resides in the grammar definition file or in the LlamaGrammar class. The problem shows up when I use the .from_file method as well.

Name and Version

Ubuntu 22.04
Python 3.11 (Anaconda)
llama_cpp_python 0.2.78

What operating system are you seeing the problem on?

Linux

Relevant log output

parse: error parsing grammar: expecting ')' at {4}) # escapes
  )* "\"" ws

number ::= ("-"? ([0-9] | [1-9] [0-9]{0,15})) ("." [0-9]+)? ([eE] [-+]? [0-9] [1-9]{0,15})? ws

# Optional space: by convention, applied in this grammar after literal chars when allowed
ws ::= | " " | "\n" [ \t]{0,20}


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[14], line 4
      2 import httpx
      3 grammar_text = httpx.get("https://raw.githubusercontent.com/ggerganov/llama.cpp/master/grammars/json.gbnf").text
----> 4 grammar = LlamaGrammar.from_string(grammar_text)

File ~/miniconda3/envs/llama-cpp/lib/python3.11/site-packages/llama_cpp/llama_grammar.py:71, in LlamaGrammar.from_string(cls, grammar, verbose)
     69 parsed_grammar = parse(const_char_p(grammar))  # type: parse_state
     70 if parsed_grammar.rules.empty():
---> 71     raise ValueError(
     72         f"{cls.from_string.__name__}: error parsing grammar file: parsed_grammar.rules is empty"
     73     )
     74 if verbose:
     75     print(f"{cls.from_string.__name__} grammar:", file=sys.stderr)

ValueError: from_string: error parsing grammar file: parsed_grammar.rules is empty

@vecorro added the bug-unconfirmed and medium severity labels Jun 18, 2024
@matteoserva
Contributor

(I am not a developer)
It looks like a problem in a downstream project.
I suggest opening an issue there:
https://github.com/abetlen/llama-cpp-python

@jabberjabberjabber

jabberjabberjabber commented Jun 21, 2024

Look at the older versions of the json.gbnf file, find the one that ggerganov made, and use that.

@C0deMunk33

Just ran into this same problem. The older file works; the compiler doesn't seem to like the {4} part of it. I also reverted to the latest version before this change.

@TheMrCodes

Same problem here. *, ?, and + work for repetition, but curly-bracket forms such as {0,5}, {4}, or {1,16} do not.
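
To check whether a given grammar uses the curly-bracket repetition forms described above, a small scanner can list them. This is a minimal sketch rather than anything from llama.cpp or llama-cpp-python: the function name `find_brace_repetitions` is hypothetical, and it assumes string literals and character classes do not span multiple lines.

```python
def find_brace_repetitions(grammar: str) -> list:
    """Return (line_number, operator_text) for each {m}, {m,} or {m,n}
    found outside string literals, character classes, and comments."""
    hits = []
    line, i, n = 1, 0, len(grammar)
    while i < n:
        ch = grammar[i]
        if ch == "\n":
            line += 1
            i += 1
        elif ch == "#":
            # Skip a comment up to (not including) the end of the line.
            while i < n and grammar[i] != "\n":
                i += 1
        elif ch in '"[':
            # Skip a quoted literal or a character class, honouring
            # backslash escapes such as \" or \].
            close = '"' if ch == '"' else "]"
            i += 1
            while i < n and grammar[i] != close:
                i += 2 if grammar[i] == "\\" else 1
            i += 1
        elif ch == "{":
            end = grammar.find("}", i)
            if end == -1:
                break  # unterminated brace: stop scanning
            hits.append((line, grammar[i:end + 1]))
            i = end + 1
        else:
            i += 1
    return hits
```

Calling `find_brace_repetitions(grammar_text)` on the grammar from the report would list each line where such an operator appears. An empty list suggests the failure has another cause.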

@TheMrCodes

OK, after some quick debugging, it seems to be a problem with the llama-cpp-python library.
They translated the parsing logic into Python code, and that code doesn't support repetition with curly brackets.
Reference: https://github.com/abetlen/llama-cpp-python/blob/01bddd669ca1208f1844ce8d0ba9872532641c9d/llama_cpp/llama_grammar.py#L837
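
Until a parser understands curly-bracket repetition, one possible workaround is to rewrite those operators using only `*` and `?` before handing the grammar over. The sketch below is an illustration under stated assumptions, not code from either project: `expand_brace_repetitions` and `_expand` are hypothetical names, and only quoted literals, character classes, parenthesized groups, and rule names are recognized as the repeated item.

```python
import re
from typing import Optional

_NAME = re.compile(r"[A-Za-z0-9_-]+")


def _expand(item: str, lo: int, hi: Optional[int]) -> str:
    """Spell item{lo,hi} with plain sequences, '*' and nested '?' groups."""
    parts = [item] * lo
    if hi is None:  # open-ended form {lo,}
        parts.append(item + "*")
    else:
        if hi < lo:
            raise ValueError("upper bound is smaller than lower bound")
        optional = ""
        for _ in range(hi - lo):
            optional = f"({item} {optional})?" if optional else f"({item})?"
        if optional:
            parts.append(optional)
    if not parts:
        raise ValueError("a {0} repetition is not handled by this sketch")
    return parts[0] if len(parts) == 1 else "(" + " ".join(parts) + ")"


def expand_brace_repetitions(grammar: str) -> str:
    out = []       # output pieces, joined at the end
    last = None    # (start, end) indices in `out` of the most recent item
    groups = []    # indices in `out` of currently open "(" pieces
    i, n = 0, len(grammar)
    while i < n:
        ch = grammar[i]
        if ch == "#":  # copy a comment through to the end of the line
            j = grammar.find("\n", i)
            j = n if j == -1 else j
            out.append(grammar[i:j])
            i, last = j, None
        elif ch in '"[':  # literal or character class, copied verbatim
            close = '"' if ch == '"' else "]"
            j = i + 1
            while j < n and grammar[j] != close:
                j += 2 if grammar[j] == "\\" else 1
            out.append(grammar[i:j + 1])
            i, last = j + 1, (len(out) - 1, len(out))
        elif ch == "(":
            groups.append(len(out))
            out.append(ch)
            i, last = i + 1, None
        elif ch == ")":
            out.append(ch)
            i, last = i + 1, (groups.pop(), len(out))
        elif ch == "{":
            if last is None:
                raise ValueError("repetition operator without a preceding item")
            j = grammar.index("}", i)
            lo_s, sep, hi_s = grammar[i + 1:j].partition(",")
            lo = int(lo_s) if lo_s.strip() else 0
            hi = lo if not sep else (int(hi_s) if hi_s.strip() else None)
            start, end = last
            # Replace the item (and any whitespace after it) with its expansion.
            out[start:] = [_expand("".join(out[start:end]), lo, hi)]
            i, last = j + 1, None
        elif ch.isspace():  # whitespace does not end the current item
            out.append(ch)
            i += 1
        elif (m := _NAME.match(grammar, i)):  # rule name
            out.append(m.group())
            i, last = m.end(), (len(out) - 1, len(out))
        else:  # '|', '::=', '*', '+', '?' and anything else
            out.append(ch)
            i, last = i + 1, None
    return "".join(out)
```

With this, `[0-9a-fA-F]{4}` becomes four copies of the character class in sequence, and a bounded form like `[ \t]{0,2}` becomes nested optional groups. The rewritten text could then be passed to `LlamaGrammar.from_string` in place of the original. Whether that parses depends on the installed version, so treat it as something to try rather than a confirmed fix.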

@TheMrCodes

I also tested my grammar file with the llama.cpp CLI, and it works as expected.

@TheMrCodes

Library Issue Reference: abetlen/llama-cpp-python#1547

@HanClinto
Collaborator

> Just ran into this same problem, the older file works, the compiler doesn't seem to like the {4} part of it. I also reverted to the latest before this change

Support for discrete repetition operators was only added about 3 weeks ago in #6640 -- so I'm curious to know where exactly the mismatch is at

@taellinglin

Has this issue been solved? I'd really like to pass my grammar file as an argument to the API request. Is there a specific way to format it?
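
On the formatting question: assuming it refers to llama.cpp's bundled HTTP server, the `/completion` endpoint takes the grammar as a plain string in the `grammar` field of the JSON body. The sketch below only builds such a request. The helper name `build_completion_request`, the default `http://localhost:8080` address, and the `n_predict` value are assumptions for illustration.

```python
import json
import urllib.request


def build_completion_request(prompt: str, grammar_text: str,
                             base_url: str = "http://localhost:8080",
                             n_predict: int = 128) -> urllib.request.Request:
    """Build a POST request for the server's /completion endpoint.

    The grammar is sent as the raw GBNF text, not as a file path.
    """
    payload = {
        "prompt": prompt,
        "n_predict": n_predict,
        "grammar": grammar_text,
    }
    return urllib.request.Request(
        f"{base_url}/completion",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

A grammar file would be read into a string first, for example with `pathlib.Path("json.gbnf").read_text()`, and that string passed as `grammar_text`. Because `json.dumps` escapes quotes and backslashes, the grammar needs no manual escaping.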

@github-actions github-actions bot added the stale label Sep 1, 2024
Contributor

This issue was closed because it has been inactive for 14 days since being marked as stale.
