
server : add some missing env variables #9116

Merged
merged 3 commits into ggerganov:master on Aug 27, 2024

Conversation

ngxson (Collaborator) commented Aug 21, 2024

Continuation of #9105

I forgot LLAMA_ARG_HOST and LLAMA_ARG_PORT.

As a nice-to-have, LLAMA_ARG_HF_REPO and LLAMA_ARG_MODEL_URL are also added. Although they are not used by the HF inference endpoint, they will be useful if someone wants to deploy llama.cpp to stateless/serverless platforms like Heroku or Google Cloud Run.
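For illustration, here is a minimal sketch of an env-only launch on such a platform; the values and the model URL are assumptions for this example, not something the PR prescribes:

```sh
# Sketch: configure llama-server entirely through environment variables,
# the way a stateless platform would inject config. Values are illustrative.
export LLAMA_ARG_HOST=0.0.0.0       # listen on all interfaces in the container
export LLAMA_ARG_PORT=8080          # e.g. mirror the port the platform assigns
export LLAMA_ARG_MODEL_URL=https://example.com/model.gguf  # hypothetical model URL
./llama-server                      # no CLI flags needed; config comes from env
```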


@ngxson ngxson requested a review from ggerganov August 21, 2024 09:57
@github-actions github-actions bot added the examples, devops (improvements to build systems and github actions), and server labels Aug 21, 2024
Nexesenex (Contributor)

This overall feature is very useful!
Would it be possible to add params.rope_scaling_type and the other rope-related parameters?

ngxson (Collaborator, Author) commented Aug 24, 2024

@Nexesenex Currently we can't pass an enum as an environment variable, so for now I can't add rope_scaling_type.

The hacky solution is to duplicate the parsing code from gpt_params_find_arg, but I don't feel it's worth doing. There will probably be a follow-up refactoring PR in the future to bring more variables to the env.
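For anyone attempting the workaround in the meantime, a rough sketch of what that duplicated mapping could look like; the variable name `LLAMA_ARG_ROPE_SCALING_TYPE` and the simplified enum are assumptions for illustration (the real definitions live in llama.h and common/common.cpp):

```cpp
#include <cstdlib>
#include <string>

// Simplified stand-in for the rope scaling enum (illustrative, not the real one).
enum rope_scaling : int { ROPE_NONE, ROPE_LINEAR, ROPE_YARN };

// Read a hypothetical env variable and duplicate the string -> enum mapping
// that gpt_params_find_arg performs for the corresponding CLI flag.
static void rope_scaling_from_env(rope_scaling & out) {
    const char * val = std::getenv("LLAMA_ARG_ROPE_SCALING_TYPE"); // hypothetical name
    if (!val) {
        return; // keep the default when the variable is unset
    }
    const std::string s(val);
    if      (s == "none")   out = ROPE_NONE;
    else if (s == "linear") out = ROPE_LINEAR;
    else if (s == "yarn")   out = ROPE_YARN;
}
```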

Nexesenex (Contributor)

@ngxson I tried and ran into the same problem, hence my request.
Thanks for the hacky hint! I will try to implement it myself for the time being.

@ngxson ngxson merged commit a77feb5 into ggerganov:master Aug 27, 2024
52 checks passed
dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this pull request Oct 29, 2024
* server : add some missing env variables

* add LLAMA_ARG_HOST to server dockerfile

* also add LLAMA_ARG_CONT_BATCHING
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024
Labels: devops (improvements to build systems and github actions), examples, server