server: error handling #5776
Can you migrate this change to latest master? As mentioned in #5882 (comment), we may want the error response to be compatible with OpenAI format. Could you work on that? Thanks.
FYI, I don't want to return an error in this case, to match OpenAI behavior: it still returns an embedding even for an empty prompt. This problem is fixed in the refactor PR, and the server can now return an embedding even with an empty prompt (no need to add a space, because the BOS token is now added).
Done.
Changed
Removed the error on empty prompt. However, my fix was mostly for
❯ curl -sS --data '{"n_predict":10, "prompt":""}' http://127.0.0.1:8080/completion | jq .content
""
I requested 10 tokens and it returned nothing. Does it mean that for
Yeah, it seems BOS is not added for an empty prompt. It's more likely a bug in the detokenize function.
Should be fixed in: #5953
It uses exceptions. The main pro is that stack unwinding is really convenient. The main con is that a developer must not forget to de-init everything if needed before throwing the exception.
I prefer to not use exceptions, so introducing `llama_error` is not OK. For handling errors, the functions should return error codes or `bool`.
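For comparison, here is a minimal sketch of the exception-free style requested above, where a function reports failure through its return value and the caller propagates it explicitly. All names here (`example_status`, `check_grammar`, `load_grammar`) are hypothetical and chosen only for illustration; none of them are part of llama.cpp.

```cpp
#include <string>

// Hypothetical status codes; not taken from the llama.cpp code base.
enum class example_status { ok, empty_grammar };

// Human-readable text for each status code.
static const char * status_description(example_status s) {
    switch (s) {
        case example_status::ok:            return "ok";
        case example_status::empty_grammar: return "grammar must not be empty";
    }
    return "unknown error";
}

// Returns a status code instead of throwing; `out` is written only on success.
static example_status check_grammar(const std::string & grammar, std::string & out) {
    if (grammar.empty()) {
        return example_status::empty_grammar;
    }
    out = grammar;
    return example_status::ok;
}

// The caller checks the code and passes the failure up as a bool plus a message.
static bool load_grammar(const std::string & grammar, std::string & err) {
    std::string parsed;
    const example_status st = check_grammar(grammar, parsed);
    if (st != example_status::ok) {
        err = status_description(st);
        return false;
    }
    return true;
}
```

The trade-off is the one described in the comment above: nothing unwinds automatically, so every caller has to check the result, but cleanup before returning stays explicit.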
I see. So this whole PR does not make any sense then.
@z80maniac don't worry, I'll propose an approach in one of my next PRs. However, I'll need your help because it's quite difficult for me to work on the frontend code. The frontend changes you introduced in this PR are still usable though.
I don't really care about the web UI, but I think that the error handling there requires major changes. In my PR I just kind of hacked the solution so it does not crash at least, but the whole error handling in the web UI is really weird, IMHO. The error is passed as a string, not valid JSON, then JSON is kind of awkwardly extracted from it, but the real JSON is actually inside that JSON, so it gets decoded again... And we end up with this: `result.error = JSON.parse(JSON.parse(result.error).content).error;` As I said, I don't really care about the web UI, and don't even use it, so I just left all this logic intact (just wrote it differently), but maybe something needs to be done here. Or maybe it all works as intended, I don't know. Didn't really look too deep into this.
This is a rough draft of a possible implementation of error handling for `server`.
Example of an error returned by the server:
`id` - error type ID for API clients, `description` - text for humans. The `description` argument in the constructor is mandatory to force a developer to write some description for errors. Maybe arbitrary `json` data can be saved too in a separate field for more detailed info, but there was no need for it as of now.
`server` returned error 404 on errors, but that prevented actual JSON from being returned. Instead it just returned a string `File Not Found`. Now the 404 error is removed. Something probably needs to be done here.
While these exceptions will also affect `main` and other programs, it shouldn't change much in terms of backwards compatibility, because these programs will just crash as usual, but with a different exception. An example of `main` crashing:
Currently the following error situations are handled (chosen to represent various places where the error can occur):
Empty grammar (error outside the server code)
Invalid grammar (error outside the server code)
Empty prompt (error while parsing the params)
Invalid request JSON (error before starting a task)
Regarding the "empty prompt" case. I know that it was "fixed" in #5733, but IMHO that was not a proper fix. If an API client passes an empty string, the `server` should not assume that the client actually wanted a space character. An API client can pass this character by itself if needed.
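The exception-based design described above could look roughly like the following sketch. It assumes an exception type with an `id` and a mandatory `description`, caught at the request boundary and turned into a JSON body. The response shape, the `grammar.empty` ID, and the helpers `json_escape`, `error_to_json`, `parse_grammar` and `handle_request` are assumptions made for illustration, not the PR's actual code.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical sketch of an error type carrying an ID and a description.
class llama_error : public std::runtime_error {
public:
    // Neither argument has a default, so a description must always be given.
    llama_error(const std::string & id, const std::string & description)
        : std::runtime_error(description), id_(id) {}

    const std::string & id() const { return id_; }
    std::string description() const { return what(); }

private:
    std::string id_;
};

// Minimal JSON string escaping: quotes, backslashes, control characters.
static std::string json_escape(const std::string & s) {
    std::string out;
    for (unsigned char c : s) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (c < 0x20) {
                    char buf[7];
                    std::snprintf(buf, sizeof(buf), "\\u%04x", c);
                    out += buf;
                } else {
                    out += static_cast<char>(c);
                }
        }
    }
    return out;
}

// Assumed response shape: {"error":{"id":"...","description":"..."}}
static std::string error_to_json(const llama_error & e) {
    return "{\"error\":{\"id\":\"" + json_escape(e.id()) +
           "\",\"description\":\"" + json_escape(e.description()) + "\"}}";
}

// Stand-in for code outside the server that rejects an empty grammar.
static std::string parse_grammar(const std::string & grammar) {
    if (grammar.empty()) {
        throw llama_error("grammar.empty", "grammar must not be empty");
    }
    return grammar;
}

// Stand-in for a request handler: the error is caught at the boundary
// and returned as JSON instead of crashing the server.
static std::string handle_request(const std::string & grammar) {
    try {
        return "{\"grammar\":\"" + json_escape(parse_grammar(grammar)) + "\"}";
    } catch (const llama_error & e) {
        return error_to_json(e);
    }
}
```

Under these assumptions, `handle_request("")` yields an error object with both fields, while a program that does not catch `llama_error` terminates with an uncaught exception, which matches the note about `main` crashing as usual.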