-
Notifications
You must be signed in to change notification settings - Fork 333
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce CPU usage when idle #775
base: master
Are you sure you want to change the base?
Conversation
Code Metrics Report=============================================================================== Language Files Lines Code Comments Blanks =============================================================================== C Header 2 35 28 0 7 Dockerfile 1 34 25 0 9 Happy 1 442 369 0 73 JSON 12 105 104 0 1 Python 52 2268 1930 69 269 TOML 20 625 559 2 64 YAML 2 21 19 2 0 ------------------------------------------------------------------------------- Jupyter Notebooks 4 0 0 0 0 |- Markdown 2 77 32 31 14 |- Python 2 196 169 1 26 (Total) 273 201 32 40 ------------------------------------------------------------------------------- Markdown 38 2765 0 2098 667 |- BASH 6 103 100 0 3 |- JSON 1 12 12 0 0 |- Python 5 92 82 0 10 |- Rust 9 322 274 0 48 |- TOML 2 75 63 0 12 (Total) 3369 531 2098 740 ------------------------------------------------------------------------------- Rust 260 75712 68239 1548 5925 |- Markdown 123 1217 25 1117 75 (Total) 76929 68264 2665 6000 =============================================================================== Total 393 82007 71273 3719 7015 =============================================================================== |
@EricLBuehler I switched to using |
|
||
self.scheduler.free_finished_sequence_groups(); | ||
if self.scheduler.waiting_len() == 0 { | ||
tokio::task::yield_now().await; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of the tokio::select!
addition! I'm just a bit confused about how this function would implement what we want (that is, to not sit in a loop). If I understand correctly, it yields to tokio's runtime, which means we wait for the other arm of the tokio::select!
(the request recieve arm) to match?
Perhaps you could add a comment here explaining what the logic/flow is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You understand my proposed flow correctly. My understanding of the reason we no longer sit in the loop anymore is mostly because of the yield_now
; it basically marks the task as Pending
for one iteration of Tokio's runtime. On the next iteration, yield_now
will marked Ready
and the loop can continue. Yielding doesn't take much time, but is enough to lower CPU usage dramatically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@EricLBuehler Thoughts on this? Happy to close this and rethink if this change doesn't make sense to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@scottwey I think after this change and if you could do some testing of it, this should be good to merge!
Finally, if you could drop some rough metrics on CPU usage before vs after it would be great too.
Co-authored-by: Eric Buehler <[email protected]>
@EricLBuehler Otherwise, I will get the conflicts sorted and await further review. |
@scottwey thanks you for the testing! Great to see the 100% -> ~1% utilization :). If you could resolve the conflicts, we can absolutely merge. |
Currently, the tight loop in
Engine
causes very high single core CPU usage when idle. This is also not great because this is long-running blocking code running inside of an async task, blocking an async worker entirely. On systems with lower core count, this will probably impact performance fairly negatively.From my testing, this change dramatically drops CPU usage with minimal impact to performance, although I have not had a chance to benchmark properly.