Improve performance in CPU-bound programs #622
We've also faced this problem. While testing fairness of large data stream processing with httpaf, we observed that one stream hogs all of the processing power and all other streams just time out, probably because data arrives at a high rate and the read loop keeps reading from the socket without ever blocking (and thus without invoking the event loop). We just added […]. We're thinking about yielding once per some fixed number of bytes read, but that looks a bit weird to solve at the application level. This issue has the libuv milestone; does that mean that implementing some heuristic within current Lwt is not considered viable? Does an application-level workaround have any drawbacks compared to a heuristic within Lwt itself?
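For reference, the application-level workaround looks roughly like this (a minimal sketch, assuming a plain Lwt_unix read loop; `threshold` and `on_chunk` are illustrative names, not part of httpaf or Lwt):

```ocaml
(* Sketch of the application-level workaround described above: yield
   with Lwt.pause once per [threshold] bytes read, so one busy socket
   cannot starve every other connection. *)
let read_loop fd buf ~threshold ~on_chunk =
  let open Lwt.Infix in
  let rec loop bytes_since_yield =
    Lwt_unix.read fd buf 0 (Bytes.length buf) >>= fun n ->
    if n = 0 then Lwt.return_unit (* EOF *)
    else begin
      on_chunk (Bytes.sub buf 0 n);
      let total = bytes_since_yield + n in
      if total >= threshold then
        (* Give the scheduler a chance to run other promises. *)
        Lwt.pause () >>= fun () -> loop 0
      else loop total
    end
  in
  loop 0
```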
You may also find lwt/src/unix/lwt_unix.cppo.mli, lines 58 to 65 (at commit 336566d), helpful.
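Those lines appear to cover `Lwt_unix.auto_yield` (an assumption, based on the yield-interval discussion below). A minimal usage sketch, with `consume` and `process` as illustrative names:

```ocaml
(* Assuming the referenced lines document Lwt_unix.auto_yield:
   [Lwt_unix.auto_yield d] returns a function that actually yields at
   most once per [d] seconds and otherwise returns an already-resolved
   promise, so it is cheap to call in a tight loop. *)
let maybe_yield = Lwt_unix.auto_yield 0.05

let rec consume ~process stream =
  let open Lwt.Infix in
  maybe_yield () >>= fun () ->
  Lwt_stream.get stream >>= function
  | None -> Lwt.return_unit
  | Some chunk -> process chunk; consume ~process stream
```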
This is indeed best solved inside the scheduler. The only reason for the libuv milestone is that until now, the only places where I had observed this issue, and where it definitely required an in-library fix, were related to some libuv work I was doing (in repromise_lwt and luv). Do you have time to work on this in Lwt? If not, could you share your test/benchmark so I can use it to measure the effects of various approaches when I work on this (slightly later)?
I'll try using […]. As for the benchmark, basically we just replicate the httpaf streaming example, patched so that the next read is only scheduled once the previous write has been flushed:

```diff
--- a/examples/lib/httpaf_examples.ml
+++ b/examples/lib/httpaf_examples.ml
@@ -39,8 +39,9 @@ module Server = struct
     let request_body = Reqd.request_body reqd in
     let response_body = Reqd.respond_with_streaming reqd response in
     let rec on_read buffer ~off ~len =
-      Body.write_bigstring response_body buffer ~off ~len;
-      Body.schedule_read request_body ~on_eof ~on_read;
+      Body.schedule_bigstring response_body buffer ~off ~len;
+      Body.flush response_body (fun () ->
+        Body.schedule_read request_body ~on_eof ~on_read);
     and on_eof () =
       Body.close_writer response_body
     in
```

Clients are simple curl launches like this: […]. You might need to add a header specifying a chunked-encoding response, but it should work without it as well. I might find some time in the future to try implementing this in Lwt, depending on how well the approach above mitigates the issue.
Looks like […]. But changing the yield interval to every X bytes read/written works better! Yielding every megabyte gives nearly the same performance as no yields at all, while multiple streams share the bandwidth fairly.
Ok, that's good :)
@Lupus, I guess another library solution to your case would be to add a variant of `Lwt.pause` that yields only to other paused ("CPU") promises, without forcing an event-loop iteration.
Yeah, that should probably work as well. When all of your sockets always have data to read, you don't need an event-loop iteration :) On the other hand, when a slow connection is active alongside a fast one, there won't be any fairness in that scenario: the fast one will hog the CPU if it uses the "yield only to the other CPU guys" strategy.
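For concreteness, a hypothetical sketch of such a pause variant. Note that this amortizes, rather than fully skips, the event-loop iteration: one `Lwt.pause` wakes a whole batch of paused CPU tasks, so N tasks cost one scheduler round instead of N. None of these names are Lwt APIs:

```ocaml
(* Hypothetical "yield only to other CPU promises": paused computations
   park in a queue, and a single drainer wakes the whole batch at once. *)
let waiting : unit Lwt.u Queue.t = Queue.create ()

let pause_cpu () =
  let p, wakener = Lwt.wait () in
  Queue.push wakener waiting;
  p

(* Run exactly one instance of this alongside the application. A real
   implementation would park the drainer while the queue is empty. *)
let rec drainer () =
  let open Lwt.Infix in
  Lwt.pause () >>= fun () ->
  let batch = Queue.length waiting in
  for _ = 1 to batch do
    Lwt.wakeup_later (Queue.pop waiting) ()
  done;
  drainer ()
```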
Some CPU-bound Lwt programs call `Lwt.pause ()` a lot, in order to yield to any potential I/O that may be pending. However, doing this results in Lwt calling `select`/`kevent`/`epoll` at the rate that `Lwt.pause ()` is called. This forces the Lwt user to worry about Lwt implementation details, and to think about how often they are calling `Lwt.pause ()`.

We should probably have Lwt adapt automatically by checking whether the last I/O poll actually resolved any promises and, if not, skipping calls to `select`, etc., on future scheduler iterations, with some kind of limited exponential backoff, or a similar scheme.

See https://discuss.ocaml.org/t/2567/5. This will also improve performance of CPU-bound Repromise programs.
cc @kennetpostigo