Bug: Unintended behavior in llama_decode_internal when cparams.embeddings is True and cparams.pooling_type is LLAMA_POOLING_TYPE_NONE #8956
Comments
Hmm, I just added support for pooling type none in llama-embedding and from what I see it works just fine. I think …
Thank you! I think your change in #8960 came just 15 hours after I posted this. As of your change and using llama-embedding I can now run e.g. … Interestingly, for this model, running …
@hitzelc Indeed I confirm that it still fails for gemma-2. This model is a bit special, as it has some additional tensor operations between "result_norm" and "result_output" because of logits soft-capping. That causes the code that looks for the "result_norm" tensor at a fixed position near the end of the graph to miss it. This little patch should fix the problem (it searches the whole graph for that tensor).
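A minimal, self-contained sketch of that idea (simplified stand-in types rather than the actual ggml structures; this is not the patch itself, only the search-the-whole-graph approach):

```cpp
#include <cstring>
#include <vector>

// Toy stand-ins for ggml's tensor/graph types, just to illustrate the lookup strategy.
struct toy_tensor {
    const char * name;
};

struct toy_cgraph {
    std::vector<toy_tensor *> nodes;
};

// Instead of assuming "result_norm" sits at a fixed offset from the end of the
// graph, walk the nodes back to front and match by name.
static toy_tensor * find_by_name(const toy_cgraph & gf, const char * name) {
    for (auto it = gf.nodes.rbegin(); it != gf.nodes.rend(); ++it) {
        if (std::strcmp((*it)->name, name) == 0) {
            return *it;
        }
    }
    return nullptr;
}
```

With a by-name search, the extra soft-capping nodes that gemma-2 inserts after "result_norm" no longer shift the embeddings tensor out of the expected slot.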
Nice, working for me as well, and I very much appreciate your explaining the difference between models. Thank you kindly.
In llama_decode_internal, it is currently the case that if cparams.embeddings is true but cparams.pooling_type is LLAMA_POOLING_TYPE_NONE (0), the following will always result in a failed assertion:
llama.cpp/src/llama.cpp, lines 14619 to 14625 at commit 6afd1a9
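Paraphrasing the gist of those lines (a toy reconstruction with stand-in types, not the verbatim source): when cparams.embeddings is set, the embeddings tensor is taken from a fixed position at the tail of the graph and its name is asserted, but with LLAMA_POOLING_TYPE_NONE the graph never contains a "result_embd_pooled" node, so the assertion cannot pass:

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Toy stand-in for a graph node; the real code works on ggml tensors.
struct toy_tensor {
    const char * name;
};

int main() {
    // Tail of the graph as built when pooling_type is NONE: the unpooled
    // embeddings live in "result_norm" and no "result_embd_pooled" node exists.
    std::vector<toy_tensor> nodes = { {"result_norm"}, {"result_output"} };

    // Grab the tensor at the tail of the graph and require the pooled name,
    // mirroring the shape of the assertion that fires in llama_decode_internal.
    const toy_tensor & embd = nodes.back();
    assert(std::strcmp(embd.name, "result_embd_pooled") == 0 && "missing embeddings tensor");

    return 0;
}
```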
I noticed this bool:
const bool embd_pooled = cparams.embeddings && cparams.pooling_type != LLAMA_POOLING_TYPE_NONE;
is set just a few lines above, so I'm thinking there might be some intended behavior for when no pooling strategy is selected?
Or perhaps there should simply be a warning or a default pooling strategy forced here?
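For the sake of discussion, a hypothetical sketch of what forcing a default (with a warning) could look like; the enum values come from llama.h, but the function itself is made up for illustration and is not part of llama.cpp:

```cpp
#include <cstdio>

#include "llama.h"  // for enum llama_pooling_type and LLAMA_POOLING_TYPE_*

// Hypothetical helper, not actual llama.cpp code: if embeddings are requested
// with no pooling strategy, warn and fall back to mean pooling instead of
// letting a later assertion fail.
static enum llama_pooling_type resolve_pooling(bool embeddings, enum llama_pooling_type requested) {
    if (embeddings && requested == LLAMA_POOLING_TYPE_NONE) {
        std::fprintf(stderr, "%s: embeddings requested with LLAMA_POOLING_TYPE_NONE, "
                             "falling back to LLAMA_POOLING_TYPE_MEAN\n", __func__);
        return LLAMA_POOLING_TYPE_MEAN;
    }
    return requested;
}
```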