[x] I reviewed the Discussions and have a new bug or useful enhancement to share.
Feature Description
I'd like the server to have a /health endpoint.
Motivation
Basically this is for Docker, so that I can automatically restart the server when it fails and recover, given that llama.cpp will crash the process in many cases.
Possible Implementation
/health as a GET that returns 200 if the server is healthy and, obviously, times out if the server isn't available. The best I can do right now is to query props.
Also, the final Docker image needs to have curl or wget installed, and the documentation should be updated to show how to use the docker-compose health check functionality for this.
I'd also like /completion and the async version to return a 429 error instead of 404 when the server is busy, since 429 is easily retried whereas 404 means "not found" and is thus terminal.
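To make the intended behaviour concrete, here is a rough sketch of the probe a client or container health check could run against such an endpoint. It assumes the server exposes GET /health as proposed above. The function name `is_healthy` and the 2-second default timeout are my own placeholders, not anything the server defines.

```python
import http.client
import urllib.error
import urllib.request


def is_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True only if GET {base_url}/health answers with HTTP 200.

    Assumes the proposed /health endpoint exists; any other outcome
    (timeout, refused connection, non-200 status) counts as unhealthy.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # URLError covers HTTP error statuses and refused connections;
        # OSError covers socket timeouts; HTTPException covers malformed replies.
        return False
```

A health check script could call `is_healthy("http://localhost:8080")` (the port is an assumption) and exit non-zero when it returns False, which is what a Docker health check needs in order to mark the container unhealthy.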
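To illustrate why the status code matters to callers, here is a minimal sketch of client-side retry logic, assuming the server returned 429 when busy as requested above. The helper `call_with_retry`, its parameters, and the backoff values are hypothetical; `send` stands in for whatever function actually issues the /completion request and returns the HTTP status code.

```python
import time
from typing import Callable

RETRYABLE = {429}  # "busy" is worth retrying; 404 and other codes are treated as final


def call_with_retry(
    send: Callable[[], int],
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns a non-retryable status or attempts run out."""
    status = send()
    for attempt in range(1, attempts):
        if status not in RETRYABLE:
            break
        # Exponential backoff: base_delay, then 2x, 4x, ... between tries.
        sleep(base_delay * 2 ** (attempt - 1))
        status = send()
    return status
```

With a 404, this loop gives up after the first call, which is exactly the problem: a generic client has no way to tell "busy, try again" from "this endpoint does not exist".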
There's an issue with llama-server.Dockerfile: curl is not installed in the final runtime layer, so the health check cannot be run from inside the image. A PR is open to fix it: #8693