cgroup-v2 considerations #109

Open
wants to merge 1 commit into
base: main


wenningerk

  • prevent possible lockup when the format in /proc changes
  • properly get and handle scheduler policy & prio
  • recognize cgroup-v2 and try to handle it similarly
  • if switching to SCHED_RR fails, push prio to the max within SCHED_OTHER
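
The fallback in the last bullet could look roughly like the sketch below. It is not the code from this PR: the names `boost_result` and `raise_priority` are made up for illustration, and using a nice value of -20 as "the max within SCHED_OTHER" is an assumption.

```c
#include <sched.h>
#include <sys/resource.h>

/* Hypothetical result codes, used only in this sketch. */
enum boost_result { BOOST_RR = 0, BOOST_NICE = 1, BOOST_FAILED = -1 };

/* Try SCHED_RR at its highest priority. If the kernel refuses (for
 * example because no RT budget is available), stay in SCHED_OTHER and
 * ask for the most favourable nice value instead. */
enum boost_result raise_priority(void)
{
    struct sched_param param = { 0 };
    int max_rr = sched_get_priority_max(SCHED_RR);

    if (max_rr >= 0) {
        param.sched_priority = max_rr;
        if (sched_setscheduler(0, SCHED_RR, &param) == 0) {
            return BOOST_RR;
        }
    }

    /* Assumption: -20 is the lowest (best) nice value on Linux. */
    if (setpriority(PRIO_PROCESS, 0, -20) == 0) {
        return BOOST_NICE;
    }
    return BOOST_FAILED;
}
```

Without sufficient privileges both attempts can fail, so a caller would have to cope with `BOOST_FAILED` as well.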

Just as a preview ...
Needs splitting probably.
And the cgroup-v2 stuff is ugly:

  • scanning /proc/sched_debug seems to be the only easy way to find
    out whether CONFIG_RT_GROUP_SCHED is enabled with cgroup-v2
  • currently (as of 5.4.20) there is no hierarchical rt-budget, so
    we move to the root-slice in all cases, with all the consequences
  • when moving to the root-slice, the journal stops working
  • auto and yes for SBD_MOVE_TO_ROOT_CGROUP behave the same
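
A minimal sketch of the /proc/sched_debug scan from the first bullet, under one stated assumption: kernels built with CONFIG_RT_GROUP_SCHED print a task-group path after each `rt_rq[<cpu>]:` header, while other kernels print the bare header. All function names are hypothetical and the check in the PR may work differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True if a single line looks like "rt_rq[<cpu>]:/<group path>".
 * ASSUMPTION: only kernels with CONFIG_RT_GROUP_SCHED append a path. */
bool line_shows_rt_group(const char *line)
{
    assert(line != NULL);
    if (strncmp(line, "rt_rq[", 6) != 0) {
        return false;
    }
    const char *end = strstr(line, "]:");
    return end != NULL && end[2] == '/';
}

/* Scan a stream line by line. fgets() returns NULL on EOF or error,
 * so the loop terminates even if the format is not what we expect. */
bool stream_shows_rt_group(FILE *f)
{
    char line[512];

    assert(f != NULL);
    while (fgets(line, sizeof(line), f) != NULL) {
        if (line_shows_rt_group(line)) {
            return true;
        }
    }
    return false;
}

/* Hypothetical wrapper: reports false if the file cannot be opened. */
bool rt_group_sched_seems_enabled(void)
{
    FILE *f = fopen("/proc/sched_debug", "r");
    if (f == NULL) {
        return false;
    }
    bool found = stream_shows_rt_group(f);
    fclose(f);
    return found;
}
```

Treating an unreadable file as "not enabled" is a design choice of the sketch; a caller that prefers to be cautious could treat it as "unknown" instead.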

@kgaillot

Code-wise it looks reasonable, though I'm not familiar with either cgroup implementation and didn't do any testing. Spelling: "budged" in a couple of places.

It's probably worthwhile to comment, either in the sysconfig file or the code, the conditions under which cgroup v2 will be effective. I.e. what kernel version made it available and what has to be done to switch to it, and how a user could tell what an existing system uses.

@wenningerk
Author

wenningerk commented Feb 28, 2020

> It's probably worthwhile to comment, either in the sysconfig file or the code, the conditions under which cgroup v2 will be effective. I.e. what kernel version made it available and what has to be done to switch to it, and how a user could tell what an existing system uses.

I tried to be a bit more descriptive in the comment before the code that actually does the check.
As cgroup-v2 has been in the kernel for a while and both versions can be configured, I guess going into which kernel versions provide some version of cgroup-v2 doesn't make much sense.
Fedora 31 seems to be the first distribution using cgroup-v2 by default, and although it should be possible, I didn't play with switching back and forth; that is probably asking for trouble. The effort here is more about living with it if it is there.
Even with cgroup-v2 enabled as in Fedora 31, the approaches used up to now shouldn't run into issues as long as CONFIG_RT_GROUP_SCHED isn't enabled, since moving to the root-slice is not needed then.
Both sbd and corosync first check for /sys/fs/cgroup/cpu/cpu.rt_runtime_us, find that it does not exist, and are happy.
To play with this, a Fedora 31 kernel built with CONFIG_RT_GROUP_SCHED enabled can be found under https://koji.fedoraproject.org/koji/taskinfo?taskID=41654832 (I don't know when it will be cleaned up).
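
The existence check on the cgroup-v1 file mentioned above can be sketched as follows. The path is the one named in this comment; the helper names are hypothetical and the real code in sbd or corosync may do more than test for existence.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

/* cgroup-v1 RT-budget knob, as named in the discussion. */
#define SKETCH_V1_RT_RUNTIME "/sys/fs/cgroup/cpu/cpu.rt_runtime_us"

bool file_exists(const char *path)
{
    assert(path != NULL);
    return access(path, F_OK) == 0;
}

/* If the knob is absent, the cgroup-v1 RT-budget handling has nothing
 * to do and the process can stay in its current cgroup. */
bool v1_rt_budget_present(void)
{
    return file_exists(SKETCH_V1_RT_RUNTIME);
}
```

On a pure cgroup-v2 system this returns false whether or not CONFIG_RT_GROUP_SCHED is enabled, which is why this check alone is not enough in the v2 case.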

@jfriesse
Member

jfriesse commented Feb 28, 2020

Looks reasonable (a bit scary, though), but I have a question. What do you mean by "when moving to the root-slice journal stops working"? Is it logging to journald, or some other journal (sbd, fs, ...)?

@wenningerk
Author

> Looks reasonable (a bit scary, though), but I have a question. What do you mean by "when moving to the root-slice journal stops working"? Is it logging to journald or some sbd journal?

Logging stops working, unfortunately. If it were something sbd-internal I would have tried to make it work ;-)
No idea whether it is just that (bad enough, but we would still have logging to a file) or whether there are other issues. Anyway, stopping via the cgroup probably doesn't work with all that root-slice switching, which is why I try to prevent it whenever possible.
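
For context, moving a process into a cgroup-v2 group is done by writing its PID into that group's cgroup.procs file. The sketch below shows only that mechanism; the function names are hypothetical, and treating /sys/fs/cgroup/cgroup.procs as the root group assumes cgroup-v2 is mounted at /sys/fs/cgroup.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Write a PID into a cgroup.procs-style file.
 * Returns 0 on success and -1 on any failure. */
int write_pid_to_procs_file(const char *procs_path, pid_t pid)
{
    assert(procs_path != NULL);

    FILE *f = fopen(procs_path, "w");
    if (f == NULL) {
        return -1;
    }
    int ok = fprintf(f, "%ld\n", (long)pid) > 0;
    /* fclose() flushes the buffer, so a rejected write shows up here
     * at the latest. */
    if (fclose(f) != 0) {
        ok = 0;
    }
    return ok ? 0 : -1;
}

/* ASSUMPTION: cgroup-v2 is mounted at /sys/fs/cgroup, so this file
 * belongs to the root group. */
int move_self_to_root_cgroup2(void)
{
    return write_pid_to_procs_file("/sys/fs/cgroup/cgroup.procs", getpid());
}
```

Once the process has left the cgroup of its service unit, anything that relies on that membership no longer sees it, which fits the concern about stopping via the cgroup.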

@jfriesse
Member

> > Looks reasonable (a bit scary, though), but I have a question. What do you mean by "when moving to the root-slice journal stops working"? Is it logging to journald or some sbd journal?
>
> Logging stops working, unfortunately. If it were something sbd-internal I would have tried to make it work ;-)
> No idea whether it is just that (bad enough, but we would still have logging to a file) or whether there are other issues. Anyway, stopping via the cgroup probably doesn't work with all that root-slice switching, which is why I try to prevent it whenever possible.

Ok, thanks for the info.

@wenningerk force-pushed the cgroup2 branch 2 times, most recently from 2ec37a2 to 763e3a0, on March 2, 2020 13:46
@wenningerk
Author

Cherry-picked the travis-config changes needed for mock 2.0 (updated in Fedora 31), as they are not really related to the topic of this PR.
Split off the scheduler-config stuff that isn't actually cgroup-v2 related.
I guess it should be OK to cherry-pick that into master as well, as it should fix a possible hang when the /proc content changes with some kernel version, and it makes the behavior more similar to what corosync does (fall back to raising prio to the max within SCHED_OTHER if the switch to SCHED_RR fails).

@wenningerk changed the title from "Fix: scheduling: overhaul the whole thing" to "cgroup-v2 considerations" on Mar 6, 2020
3 participants