server : simplify logic for empty prompts #5953

Merged: 1 commit into master from gg/server-empty-prompt-logic, Mar 9, 2024

ggerganov
Owner

ref #5776

Handle BOS-only prompts correctly now:

curl -sS --data '{"n_predict":10, "prompt":""}' http://127.0.0.1:8080/completion | jq .content

" Question: Let d(c) = 4"

@ggerganov mentioned this pull request Mar 9, 2024
@ggerganov merged commit 9674aaf into master Mar 9, 2024
58 of 61 checks passed
@ggerganov deleted the gg/server-empty-prompt-logic branch March 9, 2024 10:34
@z80maniac
Contributor

Just to clarify, the server is supposed to generate something for an empty prompt only if it's passed as a string?

For example, passing an empty array as prompt still generates nothing:

❯ curl -sS --data '{"prompt": [], "n_predict": 4}' http://127.0.0.1:8080/completion | jq .content
""

The docs say:

> If the prompt is a string or an array with the first element given as a string, a bos token is inserted in the front like main does.

But I'm not sure how to interpret it, and whether it has something to do with handling an empty prompt. Maybe some clarifications are needed in the README.
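The two requests compared in this thread can also be reproduced from a script. Below is a minimal sketch, assuming a server is listening at `http://127.0.0.1:8080` as in the `curl` calls above. The helper names `build_completion_request` and `fetch_content` are hypothetical, and the `Content-Type` header is an assumption (the `curl` commands do not set one explicitly).

```python
import json
import urllib.request


def build_completion_request(prompt, n_predict, base_url="http://127.0.0.1:8080"):
    """Build a POST request for /completion with the fields used in this thread."""
    body = json.dumps({"prompt": prompt, "n_predict": n_predict}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/completion",
        data=body,
        # Assumption: the curl examples do not set this header explicitly.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_content(request):
    """Send the request and return the "content" field (what `jq .content` prints)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["content"]
```

With a server running, `fetch_content(build_completion_request("", 4))` and `fetch_content(build_completion_request([], 4))` would correspond to the empty-string and empty-array cases discussed here.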

@ggerganov
Owner Author

Good point - I opened a PR to try to clarify: #5957

Please review

@z80maniac
Contributor

Seems fine (except for a typo I mentioned there). However, that PR only explains when BOS is added, but the original question remains. Passing an empty array as prompt generates nothing. In this case BOS is not added, but it's not clear from the docs that this will result in an empty response (unless I'm missing something).

On the other hand, maybe it's better to wait until API errors are implemented, then return an error if the inference cannot succeed because of an empty prompt. Just defaulting to an empty response is a little bit counter-intuitive, IMHO.

@ggerganov
Owner Author

> Passing an empty array as prompt generates nothing. In this case BOS is not added, but it's not clear from the docs that this will result in an empty response (unless I'm missing something).

An empty array does not satisfy the first of the 3 requirements listed, because there is no string as the first element:

  • The prompt is a string or an array with the first element given as a string

So it should not prefix a BOS token.
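For illustration only, here is a minimal Python sketch of that first requirement. The server itself is C++, the function name `first_requirement_met` is hypothetical, and the other two requirements are not reproduced in this thread, so they are not modelled.

```python
def first_requirement_met(prompt):
    """Return True if the prompt is a string, or a list whose first element is a string.

    This models only the first of the listed requirements; it is a sketch,
    not the server's actual implementation.
    """
    if isinstance(prompt, str):
        return True
    # An empty list has no first element, so it cannot satisfy the requirement.
    return isinstance(prompt, list) and len(prompt) > 0 and isinstance(prompt[0], str)
```

Under this reading, `first_requirement_met("")` is true while `first_requirement_met([])` is false, which is consistent with no BOS token being prefixed for an empty array.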

@ggerganov
Owner Author

Ah, you mean it's not clear that for an empty set of tokens, we return an empty response. Yes, this can be clarified better.

hazelnutcloud pushed a commit to hazelnutcloud/llama.cpp that referenced this pull request Mar 10, 2024
NeoZhangJianyu pushed a commit to NeoZhangJianyu/llama.cpp that referenced this pull request Mar 12, 2024
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024