High CPU usage on system time change #89
Comments
Hi @jetomit! Using Instead we should probably change @wingo, WDYT? Thanks!
@jetomit Here's a proposed workaround on the Guix side: https://issues.guix.gnu.org/64966
This would work for aarch64, but I also encounter this issue on armhf and x86_64 systems. This happens whenever system time is pushed forward by a significant amount (a day or more), either by ntpd or manually. As I understand it, guile’s
Another report of
@wingo Hello! Did you have a chance to look into that? I'd be happy to try and implement any suggestions you might have (I'd love to do that before Shepherd 1.0 is out).
Took a look at it but it requires a bit of concentration to not introduce bugs :) Do have a look if you like!
Just posting for the record another example "in the wild" of someone working around this issue. https://issues.guix.gnu.org/70892#3 Thanks for all your hard work! 😄
Apologies if anything is inaccurate in this comment, but I had a look through the timer code to see if I could think of anything. I still don't fully grok the control flow, but from a cursory read it does look like I wonder if a solution to this might be to skip iterating through the inner wheels if:
This feels like it should allow you to pretty rapidly "warp" through a large space of mostly unset timers (by climbing up the hierarchy and skipping slots at the lowest granularity possible), which is what you would expect to see if the clock has suddenly jumped years into the future.
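To make the "warp" idea concrete, here is a minimal sketch in Python rather than Guile. It is a simplification under stated assumptions: a single-level wheel instead of the hierarchical one in guile-fibers, and two skip conditions chosen for illustration (no timers pending, or a jump of at least one full revolution), since the conditions listed in the comment above did not survive. The class `TimerWheel` and all of its members are hypothetical names.

```python
class TimerWheel:
    """Single-level timer wheel; a simplification, not the guile-fibers code."""

    def __init__(self, size=8):
        self.size = size
        self.slots = [[] for _ in range(size)]
        self.cur = 0       # last tick that has been processed
        self.pending = 0   # timers that have not fired yet
        self.steps = 0     # slots visited, to make the cost visible

    def add(self, deadline, name):
        assert deadline > self.cur, "deadline must be in the future"
        self.slots[deadline % self.size].append((deadline, name))
        self.pending += 1

    def _fire_slot(self, index, now, fired):
        # Move every timer in this slot whose deadline has passed into `fired`.
        self.steps += 1
        slot = self.slots[index]
        due = [t for t in slot if t[0] <= now]
        self.slots[index] = [t for t in slot if t[0] > now]
        self.pending -= len(due)
        fired.extend(due)

    def advance(self, target):
        fired = []
        if self.pending == 0:
            pass  # assumed skip condition 1: nothing can fire, so just warp
        elif target - self.cur >= self.size:
            # Assumed skip condition 2: the jump covers a full revolution,
            # so one pass over the slots is enough, however large the jump.
            for index in range(self.size):
                self._fire_slot(index, target, fired)
        else:
            # Small jump: ordinary tick-by-tick stepping.
            for tick in range(self.cur + 1, target + 1):
                self._fire_slot(tick % self.size, tick, fired)
        self.cur = max(self.cur, target)
        return [name for _, name in sorted(fired)]


wheel = TimerWheel(size=8)
wheel.add(3, "a")
wheel.add(20, "b")
fired = wheel.advance(10**12)      # simulated clock jump far into the future
steps_after_jump = wheel.steps
wheel.advance(2 * 10**12)          # second jump with no timers pending
steps_after_empty_jump = wheel.steps
```

In this sketch the work done for a jump is bounded by the number of slots rather than by the number of elapsed ticks, and an empty wheel costs nothing to advance. A hierarchical wheel would need the same kind of bound at each level.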
Since Guix upgraded to guile-fibers 1.3.1, shepherd hangs shortly after boot on systems without an RTC. I believe the problem comes from using `get-internal-real-time` in the guile-fibers timer wheel implementation. After NTP corrects the system time, this function returns a much larger value, and the CPU load (for one core) goes to 100%.

Profiling suggests the process spends the CPU time in `timer-wheel-advance!`, so I imagine it is trying to tick through a five-year time diff. I tried increasing the system time manually by N days, which causes shepherd to be unresponsive (e.g. to `herd status`) for about N×5 seconds. I observed similar behavior with the example from the guile-fibers README.

Replacing all instances of `(get-internal-real-time)` with `(clock-gettime 1)` in guile-fibers, and reconfiguring the system with the patched package, fixes this problem. I think using a monotonic clock makes sense, but there is probably a cleaner / more portable way to do it.

Thanks!
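As a rough illustration of why the clock source matters, the sketch below (in Python, not Guile) counts how many ticks a tick-by-tick wheel would have to step through after a clock jump. The 1 ms tick length and the helper `pending_ticks` are assumptions made for the example, not values taken from guile-fibers. The wall-clock jump is simulated with plain integers; the monotonic reading uses `time.monotonic_ns()`, which Python documents as unaffected by system clock updates.

```python
import time

TICK_NS = 1_000_000  # hypothetical tick length of 1 ms, chosen for illustration


def pending_ticks(last_seen_ns, now_ns, tick_ns=TICK_NS):
    """Whole ticks a tick-by-tick wheel would step through (never negative)."""
    return max(0, (now_ns - last_seen_ns) // tick_ns)


# Simulated wall clock: a board without an RTC boots at time 0, then NTP
# steps the clock forward by five years.
FIVE_YEARS_NS = 5 * 365 * 24 * 3600 * 10**9
wall_ticks = pending_ticks(0, FIVE_YEARS_NS)

# Monotonic clock: only the real elapsed time shows up, regardless of
# any adjustment made to the system time in the meantime.
start_ns = time.monotonic_ns()
time.sleep(0.01)
mono_ticks = pending_ticks(start_ns, time.monotonic_ns())

print(wall_ticks)   # 157680000000 ticks to walk through after the jump
print(mono_ticks)   # a small number, roughly the 10 ms actually slept
```

With these assumed numbers the wall-clock source leaves about 1.6 × 10^11 ticks to process, while the monotonic source sees only the time that actually passed.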