Overhead of creating new threads in Streaming Response #2201
It's just an example to illustrate API usage. You could use a thread pool. There is also an async stream API now.
@hwc0919 What do you mean by async stream API? Can you elaborate?
drogon's internal pool is for non-blocking, fast IO operations, not for CPU-intensive workloads or blocking operations. You will block incoming connections if you block those threads.
@hwc0919 Do you mean I should not run CPU-intensive workloads in this lambda? This is what I meant.
Yes. If you have many CPU-intensive workloads that may block the threads, you should run them in another thread pool. For example:

app().registerHandler(
    "/",
    [](const HttpRequestPtr &request,
       std::function<void(const HttpResponsePtr &)> &&callback) {
        LOG_INFO << "connected:"
                 << (request->connected() ? "true" : "false");
        trantor::EventLoopThreadPool *pool = ...; // get the pool from where it is allocated
        pool->getNextLoop()->queueInLoop(
            [req = request, callback = std::move(callback)]() {
                sleep(10); // CPU-intensive jobs or blocking operations
                auto resp = HttpResponse::newHttpResponse();
                resp->setBody("Hello, World!");
                callback(resp);
            });
    },
    {Get});
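For the "pool = ..." placeholder above, one possible setup (a minimal sketch; workerPool and the pool size of 4 are hypothetical names and values, only trantor::EventLoopThreadPool itself is real API) is to create and start a dedicated pool once at startup:

#include <drogon/drogon.h>
#include <trantor/net/EventLoopThreadPool.h>
#include <memory>

// Hypothetical helper: a process-wide worker pool, created once and started
// lazily. The size (4) is an arbitrary example value.
static trantor::EventLoopThreadPool *workerPool()
{
    static auto pool = [] {
        auto p = std::make_unique<trantor::EventLoopThreadPool>(4, "WorkerPool");
        p->start();
        return p;
    }();
    return pool.get();
}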
This is exactly what I want: the number of drogon threads limits the number of concurrently processed requests in my application. When a thread is freed, I let another connection in. Is this bad usage with regard to drogon's architecture? Or is there another reason why I should not use drogon's connection thread pool?
It's bad in every respect. You should tell the client that the server is full, not just let it hang. What's worse, you are doing it by stalling the whole server. Why not just keep a count of the running jobs and reject excess requests?
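A minimal sketch of that suggestion, assuming the hypothetical workerPool helper above; kMaxJobs, runningJobs, and guardedHandler are illustrative names, while k503ServiceUnavailable is drogon's status-code constant:

#include <drogon/drogon.h>
#include <atomic>

// Hypothetical limit and counter; size kMaxJobs to match the worker pool.
static constexpr int kMaxJobs = 64;
static std::atomic<int> runningJobs{0};

void guardedHandler(const drogon::HttpRequestPtr &,
                    std::function<void(const drogon::HttpResponsePtr &)> &&callback)
{
    // Reject immediately when full instead of suspending the client.
    if (runningJobs.fetch_add(1) >= kMaxJobs)
    {
        runningJobs.fetch_sub(1);
        auto resp = drogon::HttpResponse::newHttpResponse();
        resp->setStatusCode(drogon::k503ServiceUnavailable);
        callback(resp);
        return;
    }
    workerPool()->getNextLoop()->queueInLoop(
        [callback = std::move(callback)]() {
            // ... blocking or CPU-intensive work ...
            auto resp = drogon::HttpResponse::newHttpResponse();
            resp->setBody("done");
            callback(resp);
            runningJobs.fetch_sub(1);
        });
}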
Is your feature request related to a problem? Please describe.
There is overhead from creating new threads when using the streaming response feature.
This drogon example demonstrates it very well: drogon/examples/async_stream/main.cc, lines 15 to 24 at commit 8541e67.
An uncontrolled number of new threads is created, which is a bottleneck for high-throughput workloads.
Another issue is that each thread is detached and never joined, as sketched below.
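For reference, the pattern in question has roughly the following shape (a paraphrase of the referenced example, not an exact copy; the per-second counter payload and the handler name are illustrative):

#include <drogon/drogon.h>
#include <chrono>
#include <memory>
#include <thread>

// One detached thread is spawned per streaming request.
void streamHandler(const drogon::HttpRequestPtr &,
                   std::function<void(const drogon::HttpResponsePtr &)> &&callback)
{
    auto resp = drogon::HttpResponse::newAsyncStreamResponse(
        [](drogon::ResponseStreamPtr stream) {
            std::thread(
                [stream = std::shared_ptr<drogon::ResponseStream>(
                     std::move(stream))]() {
                    for (int i = 0; i < 10; ++i)
                    {
                        stream->send(std::to_string(i) + "\n");
                        std::this_thread::sleep_for(std::chrono::seconds(1));
                    }
                    stream->close();
                })
                .detach(); // detached, never joined
        });
    callback(resp);
}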
Describe the solution you'd like
Other HTTP frameworks use an internal event loop to schedule partial send operations, for example:
https://github.com/tensorflow/serving/blob/8eec641dee627243e13f8d04d8980f326c3e371d/tensorflow_serving/util/net_http/server/internal/evhttp_request.cc#L359-L360
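In drogon terms, the same idea could be sketched by driving the stream from an event loop timer instead of a dedicated thread. This is a sketch, not drogon's actual solution: it assumes the async stream API shown above, and sendChunk is a hypothetical helper built on trantor's runAfter timer facility.

#include <drogon/drogon.h>
#include <memory>
#include <string>

// Hypothetical helper: emit one chunk, then re-arm a timer on the app's
// event loop for the next one. No per-request thread is created.
void sendChunk(const std::shared_ptr<drogon::ResponseStream> &stream, int i)
{
    if (i >= 10)
    {
        stream->close();
        return;
    }
    stream->send(std::to_string(i) + "\n");
    drogon::app().getLoop()->runAfter(
        1.0, [stream, i] { sendChunk(stream, i + 1); });
}

void streamHandler(const drogon::HttpRequestPtr &,
                   std::function<void(const drogon::HttpResponsePtr &)> &&callback)
{
    auto resp = drogon::HttpResponse::newAsyncStreamResponse(
        [](drogon::ResponseStreamPtr stream) {
            sendChunk(std::shared_ptr<drogon::ResponseStream>(std::move(stream)),
                      0);
        });
    callback(resp);
}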
Describe alternatives you've considered
No alternative
Additional context
High-throughput scenarios, for example serving LLM text generation with a streaming output approach.