[Feature]: handle request parameter best_of=1 #44

Open
1 task done
tstescoTT opened this issue Dec 13, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@tstescoTT

🚀 The feature, motivation and pitch

Currently we do not support the best_of parameter, and setting it in a request fails validation.

The sampling argument best_of appears to be handled as an int in client-side code, e.g.:
https://github.com/tenstorrent/vllm/blob/dev/benchmarks/benchmark_serving.py#L784
so in some cases, such as this one, it is set to 1 when it is turned off.

Ideally, best_of = 1 should be handled the same way as best_of = None by default: https://github.com/tenstorrent/vllm/blob/dev/vllm/sampling_params.py#L291.
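
A minimal sketch of the requested behavior, assuming the normalization happens before validation. The helper name `normalize_best_of` is hypothetical and is not taken from `sampling_params.py`; it only illustrates treating 1 as equivalent to unset while still rejecting other values.

```python
from typing import Optional


def normalize_best_of(best_of: Optional[int]) -> Optional[int]:
    """Hypothetical helper: treat best_of=1 the same as best_of=None.

    Any other value is still rejected, matching the current situation
    in which best_of is not supported.
    """
    if best_of is None or best_of == 1:
        # best_of=1 asks for a single candidate, which is the same as not
        # using best_of at all, so map it to the "unset" value.
        return None
    raise ValueError(f"best_of={best_of} is not supported; use None or 1")
```

With something like this in place, a client that always sends an int would pass validation as long as it sends 1.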

Alternatives

No response

Additional context

Currently I need to patch the benchmarking script as a workaround.
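
For illustration, the client-side workaround amounts to something like the sketch below. The function `build_payload` and its fields are assumptions made for this example and are not copied from `benchmark_serving.py`; the point is only that best_of is left out of the request when it is 1 or None.

```python
from typing import Any, Dict, Optional


def build_payload(prompt: str, max_tokens: int,
                  best_of: Optional[int] = 1) -> Dict[str, Any]:
    """Hypothetical request builder that omits best_of when it is a no-op."""
    payload: Dict[str, Any] = {"prompt": prompt, "max_tokens": max_tokens}
    # Only send best_of when it actually changes sampling behavior.
    if best_of is not None and best_of != 1:
        payload["best_of"] = best_of
    return payload
```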

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
@tstescoTT
Author

Supporting a default of None, or no-op values, for logprobs would similarly be useful and expected.
