server : simplify logic for empty prompts #5953

ggerganov · 2024-03-09T09:46:58Z

Handle BOS-only prompts correctly now:

curl -sS --data '{"n_predict":10, "prompt":""}' http://127.0.0.1:8080/completion | jq .content

" Question: Let d(c) = 4"

z80maniac · 2024-03-09T11:29:59Z

Just to clarify, the server is supposed to generate something for an empty prompt only if it's passed as a string?

For example, passing an empty array as prompt still generates nothing:

❯ curl -sS --data '{"prompt": [], "n_predict": 4}' http://127.0.0.1:8080/completion | jq .content
""

The docs say:

> If the prompt is a string or an array with the first element given as a string, a bos token is inserted in the front like main does.

But I'm not sure how to interpret it, and whether it has something to do with handling an empty prompt. Maybe some clarifications are needed in the README.

ggerganov · 2024-03-09T11:56:04Z

Good point - I opened a PR to try to clarify: #5957

Please review

z80maniac · 2024-03-09T13:32:02Z

Seems fine (except for a typo I mentioned there). However, that PR only explains when BOS is added, but the original question remains. Passing an empty array as prompt generates nothing. In this case BOS is not added, but it's not clear from the docs that this will result in an empty response (unless I'm missing something).

On the other hand, maybe it's better to wait until API errors are implemented, and then return an error if the inference cannot succeed because of an empty prompt. Just defaulting to an empty response is a little bit counter-intuitive, IMHO.
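
To make the suggestion concrete, here is a rough sketch of the "return an error instead of an empty response" idea. Everything in it is hypothetical: the function name `respond_to_tokens`, the shape of the error object, and the message are assumptions for illustration only, and the server's actual error format is not described in this thread.

```python
from typing import Dict, List, Union

Response = Dict[str, Union[str, Dict[str, str]]]


def respond_to_tokens(tokens: List[int]) -> Response:
    """Hypothetical handler: reject an empty token list instead of returning empty content."""
    if not tokens:
        # Nothing to evaluate, so say so explicitly rather than returning "".
        return {"error": {"message": "prompt is empty: no tokens to evaluate"}}
    # Placeholder for real inference, which is out of scope for this sketch.
    return {"content": "<generated text>"}


print(respond_to_tokens([]))
print(respond_to_tokens([1, 2, 3]))
```

With this shape a client can tell "the model produced nothing" apart from "there was nothing to run", which is the ambiguity raised above.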

ggerganov · 2024-03-09T13:50:27Z

> Passing an empty array as prompt generates nothing. In this case BOS is not added, but it's not clear from the docs that this will result in an empty response (unless I'm missing something).

An empty array does not satisfy the first of the 3 requirements listed, because it has no first element, so there is no string in that position:

> The prompt is a string or an array with the first element given as a string

So it should not prefix a BOS token
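
For illustration, here is a minimal Python sketch of that first requirement only. The helper name `first_bos_condition` is hypothetical and is not part of llama.cpp; the other two requirements are not quoted in this thread, so they are deliberately left out.

```python
from typing import List, Union

# A prompt is either a string or a list mixing strings and token ids.
Prompt = Union[str, List[Union[str, int]]]


def first_bos_condition(prompt: Prompt) -> bool:
    """True if the prompt is a string, or a list whose first element is a string."""
    if isinstance(prompt, str):
        return True  # includes the empty string ""
    if isinstance(prompt, list) and len(prompt) > 0:
        return isinstance(prompt[0], str)
    return False  # an empty list has no first element to check


for p in ["", [], ["Hello", 123], [1, 2, 3]]:
    print(repr(p), "->", first_bos_condition(p))
```

Under this reading, `""` passes the check while `[]` does not, which matches the two curl results earlier in the thread: the empty string still gets a BOS token and generates text, whereas the empty array yields no tokens at all.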

ggerganov · 2024-03-09T13:52:07Z

Ah, you mean it's not clear that for an empty set of tokens, we return an empty response. Yes, this can be clarified better.

server : simplify logic for empty prompts

28dae04

ggerganov mentioned this pull request Mar 9, 2024

server: error handling #5776

Closed

ggerganov merged commit 9674aaf into master Mar 9, 2024
58 of 61 checks passed

ggerganov deleted the gg/server-empty-prompt-logic branch March 9, 2024 10:34

z80maniac mentioned this pull request Mar 9, 2024

Server: format error to json #5961

Merged

hazelnutcloud pushed a commit to hazelnutcloud/llama.cpp that referenced this pull request Mar 10, 2024

server : simplify logic for empty prompts (ggerganov#5953)

9db4c93

NeoZhangJianyu pushed a commit to NeoZhangJianyu/llama.cpp that referenced this pull request Mar 12, 2024

server : simplify logic for empty prompts (ggerganov#5953)

97fde80

jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024

server : simplify logic for empty prompts (ggerganov#5953)

56490cf

hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024

server : simplify logic for empty prompts (ggerganov#5953)

f8b9bb1
