llama : move vocab, grammar and sampling into separate files #8508
Are we mitigating breakages this time?
This is a first step in partitioning
Looks good for a first step, it will probably need more work to completely decouple the different components. Some notes:
- `llama_get_vocab`, `llama_get_sampling` are unused and probably should be removed
- `llm_load_vocab` should eventually be moved to `llama-vocab.cpp`
- The symbols in `unicode.h` and `unicode-data.h` should have a `llama_` prefix, or be moved to a namespace
- `LLAMA_API_INTERNAL` in `llama.h` should be removed, and tests should include the private headers instead
I think it's reasonable if you can avoid breakages

I started implementing that in #8643. Will look to merge this PR in the meantime to avoid resolving bigger conflicts
…ov#8508)
* llama : move sampling code into llama-sampling ggml-ci
* llama : move grammar code into llama-grammar ggml-ci
* cont ggml-ci
* cont : pre-fetch rules
* cont ggml-ci
* llama : deprecate llama_sample_grammar
* llama : move tokenizers into llama-vocab ggml-ci
* make : update llama.cpp deps [no ci]
* llama : redirect external API to internal APIs ggml-ci
* llama : suffix the internal APIs with "_impl" ggml-ci
* llama : clean-up
Some refactoring attempts, mainly trying to reorganize the `llama` code to prepare for #5214 and #5215

API Changes:
- `llama_sample_grammar` -> `llama_grammar_sample`
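One way such a rename can be kept non-breaking is to leave the old name in place as a deprecated wrapper that forwards to the new one. The block below is only a minimal sketch of that idea, not the code from this PR: `grammar`, `candidate`, `grammar_sample` and `sample_grammar` are hypothetical stand-ins for the real llama.cpp types and functions, and the "grammar" is reduced to a single banned token so the example stays short.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical stand-ins, NOT the real llama.cpp types.
struct grammar   { int banned_token; };
struct candidate { int token; float logit; };

// New name: masks out every candidate the (toy) grammar rejects.
void grammar_sample(const grammar * g, std::vector<candidate> & cands) {
    assert(g != nullptr);
    for (auto & c : cands) {
        if (c.token == g->banned_token) {
            c.logit = -std::numeric_limits<float>::infinity();
        }
    }
}

// Old name: kept so existing callers still compile, but flagged at compile time.
// It only forwards to the new function.
[[deprecated("use grammar_sample instead")]]
void sample_grammar(const grammar * g, std::vector<candidate> & cands) {
    grammar_sample(g, cands);
}
```

Callers of the old name keep working and get a compiler warning pointing at the replacement, which leaves room to remove the wrapper in a later release.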
Summary:
- Move `llama_vocab` to `llama-vocab.h/.cpp`
- Move tokenizers from `llama.cpp` to `llama-vocab.cpp`
- Move `llama_sample_` implementation to `llama-sampling.h/.cpp`
- Move `llama_grammar_` implementation to `llama-grammar.h/.cpp`
TODO:
- `Makefile` header deps
- Suffix the internal APIs with `_impl` for consistency

The reason for this change is to be able to more easily distinguish public from private calls and not rely on function overloads. For example: `llama.cpp:llama_set_rng_seed` -> `llama-sampling.cpp:llama_set_rng_seed_impl`

The reason for redirecting the external API to the internal one in this way is to decouple `llama-sampling.cpp` from `llama_context`
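The redirection pattern described above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the actual llama.cpp code: `sampling_state`, `context`, `set_rng_seed` and `set_rng_seed_impl` are hypothetical names standing in for the sampling state, `llama_context`, and the public/internal function pair.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical stand-in for the sampling state; not the real llama.cpp struct.
struct sampling_state {
    std::mt19937 rng;
};

// Hypothetical stand-in for llama_context, which owns the sampling state.
struct context {
    sampling_state sampling;
};

// Internal ("_impl") function: it only needs the sampling state,
// so the sampling code never has to know the context type.
void set_rng_seed_impl(sampling_state * smpl, uint32_t seed) {
    assert(smpl != nullptr);
    smpl->rng.seed(seed);
}

// Public entry point: unpacks the context and forwards to the internal function.
// The distinct "_impl" name makes it obvious which one is the private call,
// without relying on overload resolution.
void set_rng_seed(context * ctx, uint32_t seed) {
    assert(ctx != nullptr);
    set_rng_seed_impl(&ctx->sampling, seed);
}
```

With this split, only the thin public wrapper touches the context type; the internal function could live in a separate translation unit that includes nothing but the sampling header.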
Conflicting PRs:

Follow-up PRs:
- Rename `_internal` suffixes to `_impl`