@abetlen I noticed that llama.cpp's server also supports concurrent requests and continuous batching: https://github.com/ggerganov/llama.cpp/tree/master/examples/server. Would enabling this in this library be as straightforward as exposing the relevant command-line options, or am I missing something obvious?
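For reference, the upstream server enables this with `-np N` (number of parallel slots) and `-cb` (continuous batching). Below is a minimal sketch of a client-side probe for checking whether a server actually interleaves requests, assuming an OpenAI-compatible completions endpoint on `localhost:8000` (the URL, port, and prompt are placeholders, not anything from this library):

```python
# Concurrency probe: fire several completion requests at once and time them.
# If the server processes requests sequentially, total wall time grows
# roughly linearly with the number of requests; with continuous batching
# it should stay close to the latency of a single request.
import asyncio
import time

import httpx

URL = "http://localhost:8000/v1/completions"  # assumed endpoint; adjust as needed


async def one_request(client: httpx.AsyncClient, i: int) -> None:
    resp = await client.post(
        URL,
        json={"prompt": f"Request {i}: write a haiku about batching.",
              "max_tokens": 64},
        timeout=120.0,
    )
    resp.raise_for_status()


async def main(n: int = 4) -> None:
    async with httpx.AsyncClient() as client:
        start = time.perf_counter()
        # Launch all n requests concurrently and wait for them to finish.
        await asyncio.gather(*(one_request(client, i) for i in range(n)))
        print(f"{n} concurrent requests took {time.perf_counter() - start:.1f}s")


asyncio.run(main())
```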
This has been a pain point for me as well; I hope someone can answer this.