Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GBNF Validator Program #5948

Merged
merged 3 commits into from
Apr 4, 2024
Merged

Conversation

HanClinto
Copy link
Collaborator

This adds a program to do validation of arbitrary example input files against a GBNF grammar file. I built this to help me in some other experiments that I'm doing with using GBNF to guarantee syntactically-correct queries when doing text-to-SQL, and I needed a utility to help me validate my grammar.

I originally started building this tool in an external language (Python), but found myself re-implementing the parser structure, and I figured that it would be safer to use the implementation within llama.cpp itself to ensure 1-to-1 behavior matching, so switched to C++ instead of Python.

Usage is:

✗ ./gbnf-validator grammars/chess.gbnf chess-sample.txt
Input string is valid according to the grammar.

✗ ./gbnf-validator grammars/chess.gbnf chess-sample-range.txt
Error: Input string does not match the grammar at index 24
Row: 3, Column: 5
Context:
4 e5\n2. Nf3 Nc6\n3. f9 a3\n
                      ^

✗ ./gbnf-validator grammars/chess.gbnf chess-sample-comment.txt
Error: Input string does not match the grammar at index 39
Row: 4, Column: 1
Context:
\n3. d4 d5\n4. d3 Nd7\n# Good move!\n5. e7 e
                       ^

I'm not sold that I did it in a very good way. I had to expose a lot of the grammar functions and structures via the internal API, and it feels like it might be a little too hacked together. That said, it feels like a useful tool as-is, and I at least wanted to submit it for review to get some feedback.

@HanClinto
Copy link
Collaborator Author

I've been using this program the past several days, and while it has been invaluable for me to debug my grammars, there are a number of flaws in it.

In hindsight, I should not have developed it this way at all, and instead oriented everything around sampling.cpp. I will attempt to rewrite this program. Please do not merge as-is.

@cebtenzzre cebtenzzre marked this pull request as draft March 12, 2024 19:18
@HanClinto HanClinto force-pushed the feature_gbnf_validator branch from 2878dd7 to cb63990 Compare March 16, 2024 03:01
@HanClinto
Copy link
Collaborator Author

Okay. Program is now re-written. This implementation feels much cleaner and concise to me, and overall I'm pretty happy with how things are structured.

It gives errors when there are grammar parsing errors:

./gbnf-validator grammars/chess_bad.gbnf chess-sample.txt
parse: error parsing grammar: Undefined rule identifier 'nonpawn'
main: failed to parse grammar

Or if it is missing a root node:

./gbnf-validator grammars/chess_noroot.gbnf chess-sample.txt
main: grammar does not contain a 'root' symbol

If the grammar is good, then it checks the input file to look for errors. If there are bad characters, then it highlights them in red:

./gbnf-validator grammars/chess.gbnf chess-sample-range.txt
Input string is invalid according to the grammar.
Error: Unexpected character '9' at position 24

Input string:
1. e4 e5
2. Nf3 Nc6
3. f9 a3

(in real program, it uses text highlighting to mark the invalid characters in red)
image

If the file has EOF before the grammar is fully parsed, then in reports that too:

./gbnf-validator grammars/chess.gbnf chess-sample-short.txt
Input string is invalid according to the grammar.
Error: Unexpected end of input at position 18

Input string:
1. e4 e5
2. Nf3 Nc%

If everything is happy, then it reports as-follows:

./gbnf-validator grammars/chess.gbnf chess-sample.txt
Input string is valid according to the grammar.

If there are suggestions or requests, please let me know!

I'm sorry I wasn't able to implement this without any changes to the core, but I tried to keep them to a minimum. I hope that this could eventually work towards a foundation of building some human-readable end-to-end integration tests to validate the grammar engine, but I figured I would save that for later.

@HanClinto HanClinto force-pushed the feature_gbnf_validator branch from cb63990 to 7320bf1 Compare March 16, 2024 03:20
Copy link
Owner

@ggerganov ggerganov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should look into exposing a llama_grammar_context API in llama.h and integrating the grammar parsing functionality in the core library. For now we can merge this as an intermediate stage

examples/gbnf-validator/gbnf-validator.cpp Outdated Show resolved Hide resolved
@ochafik
Copy link
Collaborator

ochafik commented Mar 19, 2024

It may be valuable to expose the validator as an internal API too, e.g. to write e2e tests for the JSON -> grammar conversion (see #5978), for instance checking that some non-schema-compliant JSON is rejected by the converted grammars.

@HanClinto
Copy link
Collaborator Author

It may be valuable to expose the validator as an internal API too, e.g. to write e2e tests for the JSON -> grammar conversion (see #5978), for instance checking that some non-schema-compliant JSON is rejected by the converted grammars.

That's an interesting idea.

There have been a couple of recent PRs (#6004, #5950) that have expanded the "core" capabilities of the grammar parser for validation, so now the piece of code that validates the grammar is pretty short:

    // Parse the GBNF grammar
    auto parsed_grammar = grammar_parser::parse(grammar_str.c_str());

    // will be empty (default) if there are parse errors
    if (parsed_grammar.rules.empty()) {
        fprintf(stderr, "%s: failed to parse grammar\n", __func__);
        return 1;
    }

    // Ensure that there is a "root" node.
    if (parsed_grammar.symbol_ids.find("root") == parsed_grammar.symbol_ids.end()) {
        fprintf(stderr, "%s: grammar does not contain a 'root' symbol\n", __func__);
        return 1;
    }

This code is copied nearly verbatim from llama_sampling_init() in sampling.cpp:
https://github.com/ggerganov/llama.cpp/blob/master/common/sampling.cpp#L3

The main reason I didn't use that function in my program is that I didn't want to bother with generating a full sampling context, but in hindsight maybe it wouldn't be so bad.

Regardless, if in your server you're generating sampling contexts, then you should (as of those recent PRs) have enough information to at least do basic validation of your grammars.

Would that be enough for your needs, or would something additional be helpful?

@ochafik
Copy link
Collaborator

ochafik commented Mar 20, 2024

There have been a couple of recent PRs (#6004, #5950) that have expanded the "core" capabilities of the grammar parser for validation, so now the piece of code that validates the grammar is pretty short:

Thanks for the snippet, actually I've already added validation to #5978 tests (see verify_expectations_parseable in test-json-schema-to-grammar.cpp), but this PR (cool job btw!) will allow going one step further, e.g. testing that the grammar generated for the json schema {"type": "string", "pattern": "[a-z]{2,4}"} parses "aa" and "llam" but not "llama", without having to scrutinize the grammar itself (someone already kindly pointed out I had the wrong rule for dates for instance, my mental grammar interpreter is buggy 🫣)

@HanClinto
Copy link
Collaborator Author

Thanks for the snippet, actually I've already added validation to #5978 tests (see verify_expectations_parseable in test-json-schema-to-grammar.cpp), but this PR (cool job btw!) will allow going one step further, e.g. testing that the grammar generated for the json schema {"type": "string", "pattern": "[a-z]{2,4}"} parses "aa" and "llam" but not "llama", without having to scrutinize the grammar itself (someone already kindly pointed out I had the wrong rule for dates for instance, my mental grammar interpreter is buggy 🫣)

Aaah, that makes more sense to me. Yes, that sounds really great!

I'm not sure the best way to integrate this code into the core -- I think my original intention was for my llama_sample_grammar_string() to be a parallel to llama_sampling_sample_impl(), but in hindsight I think I diverged a bit more. I'm very open to suggestions for where this should eventually move.

I'm finishing up the requested changes (moving away from streams to fread and fprintf) and should have an updated version of this PR soon.

@HanClinto HanClinto force-pushed the feature_gbnf_validator branch from 084bfa4 to f5c7582 Compare March 21, 2024 17:42
@HanClinto
Copy link
Collaborator Author

Changes are in place -- ready for re-review.

@HanClinto HanClinto marked this pull request as ready for review March 22, 2024 14:06
@HanClinto
Copy link
Collaborator Author

@ggerganov I think I might have accidentally had this marked as draft before, which might have blocked it from getting merged.

Is there anything more I should do before this can be merged in?

@ggerganov ggerganov merged commit 9b84ae1 into ggerganov:master Apr 4, 2024
46 checks passed
@ggerganov
Copy link
Owner

I didn't merge it because it was draft and wasn't sure if more things were planned. All good now!

tybalex pushed a commit to rubra-ai/tools.cpp that referenced this pull request Apr 17, 2024
* Revising GBNF validator program to be much simpler.

* Changing from streams to using cstdio

* Adding final newline character.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants