
Healthcheck endpoint? #4746

Closed
JohnGalt1717 opened this issue Jan 3, 2024 · 3 comments · Fixed by #5548
Labels
enhancement New feature or request

Comments

JohnGalt1717 commented Jan 3, 2024

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  - [x] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  - [x] I carefully followed the README.md.
  - [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  - [x] I reviewed the Discussions, and have a new bug or useful enhancement to share.

Feature Description

I'd like the server to have a /health endpoint.

Motivation

This is primarily for Docker: since llama.cpp will crash the process in many cases, a health check would let Docker detect the failure, restart the server automatically, and recover.

Possible Implementation

A GET /health endpoint that returns 200 when the server is healthy, and simply times out when the server isn't available. The best I can do right now is /props.

Also, the final Docker container needs to have curl or wget installed in it, and the documentation should be updated to show how to use docker-compose's healthcheck functionality for this.
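A minimal docker-compose sketch of what that could look like; the image name, port, and /health path are assumptions here, not something the project ships:

```yaml
# Hypothetical sketch: wire the requested /health endpoint into docker-compose's
# healthcheck so Docker restarts the container when the server stops responding.
# Requires curl to be present inside the image (see the comment below about the Dockerfile).
services:
  llama-server:
    image: ghcr.io/ggerganov/llama.cpp:server   # assumed image name
    ports:
      - "8080:8080"
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
      interval: 30s
      timeout: 5s
      retries: 3
    restart: unless-stopped
```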

I'd also like /completion (and its async variant) to return a 429 error instead of 404 when the server is busy: a 429 is easily retried, while a 404 means "not found" and is therefore terminal.
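A sketch of the client-side distinction being requested here (the function names and backoff values are illustrative, not part of any llama.cpp API): 429 signals "back off and retry", while 404 is terminal.

```python
# Hypothetical client-side retry policy: 429 (Too Many Requests) is transient,
# so a /completion request that gets one is worth retrying after a backoff;
# 404 means "not found" and should never be retried.
RETRYABLE_STATUSES = {429}

def should_retry(status: int) -> bool:
    """Return True if a request with this HTTP status code is worth retrying."""
    return status in RETRYABLE_STATUSES

def retry_delays(attempts: int, base: float = 0.5) -> list[float]:
    """Exponential backoff schedule (in seconds) a client could use between retries."""
    return [base * (2 ** i) for i in range(attempts)]
```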

JohnGalt1717 added the enhancement (New feature or request) label on Jan 3, 2024
Huge commented Jan 10, 2024

#4853 seems related, no clue why that PR is "closed".

Celarye commented Jan 11, 2024

These have been added through #4881 instead.

bsquizz (Contributor) commented Jul 25, 2024

There's an issue with llama-server.Dockerfile: curl is not installed in the final runtime layer, so the health check cannot be run from inside the image. PR open to fix it: #8693
