systemd refuses to boot with cgroups v1 enabled in v256+ (F41+) #1833

Closed
LeoSum8 opened this issue Nov 14, 2024 · 9 comments

@LeoSum8

LeoSum8 commented Nov 14, 2024

Describe the bug

This morning my FCOS VM was down. I think zincati ran an update.

Upon manually rebooting it, I get the following output via VNC:
Pasted image 20241114095224
Pasted image 20241114095452
Pasted image 20241114095522

These are the installed versions:
Pasted image 20241114202732

After reverting to Fedora 40, everything works again as before.

The FCOS VM is a bhyve guest on a FreeBSD (TrueNAS) Host. The guest filesystem resides in a Zvol.

Reproduction steps

Expected behavior

Boot F41 after update.

Actual behavior

Booting F41 fails; I need to revert to F40.

System details

FCOS as Bhyve-VM on a FreeBSD 13.1-RELEASE-p9 (TrueNAS) host.
FCOS version before update: 40.20241019.3.0 (working)
FCOS version after update: 41.20241027.3.0 (not working)

Butane or Ignition config

No response

Additional information

Initial discussion at https://discussion.fedoraproject.org/t/fcos-vm-bhyve-on-truenas-failing-to-boot-after-automatic-update-systemd-1-freezing-execution/136788/3

@dustymabe
Member

dustymabe commented Nov 14, 2024

What kernel arguments do you have set for this system?

This looks vaguely like the issue where systemd won't boot with cgroups v1, but I don't see that message in the output you posted.

When systemd.unified_cgroup_hierarchy=0 is set, the message looks like this:

[    1.853546] Run /init as init process
[    1.882094] fuse: init (API version 7.40)
[    1.885715] virtiofs virtio4: virtio_fs_setup_dax: No cache capability
[    1.896835] virtiofs virtio5: virtio_fs_setup_dax: No cache capability
[!!!!!!] Refusing to run under cgroup v1, SY… specified on kernel command line.
[    1.905285] systemd[1]: Freezing execution.

@stackrainbow

I can confirm this also affects a large number of our nodes, all with auto-updates on and cgroups v1 enabled.
It was a little difficult to figure out what was going on beyond just rolling back: there is no notice or information that cgroups v1 causes this, apart from this issue being linked from the OP's thread.

@LeoSum8
Author

LeoSum8 commented Nov 15, 2024

I am not 100% sure what I am doing here, but judging by the output of cat /proc/cgroups I think I am on V1, right?

me@coreos:~$ cat /proc/cgroups 
#subsys_name	hierarchy	num_cgroups	enabled
cpuset	11	47	1
cpu	3	108	1
cpuacct	3	108	1
blkio	4	108	1
memory	8	169	1
devices	7	110	1
freezer	12	47	1
net_cls	2	47	1
perf_event	10	47	1
net_prio	2	47	1
hugetlb	6	47	1
pids	5	115	1
rdma	9	47	1
misc	13	47	1

Does the F41 update switch to v2, which ends up breaking my system, or is F41 incompatible with v1?
In the latter case, would a manual switch to v2 before updating again solve the issue?

Edit: OK, so according to @dustymabe in the linked issue, I need to switch before upgrading:

We should mention to our users that they must switch to cgroups v2 before the update to F41 occurs.
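
For anyone else unsure which side of this they are on, here is a minimal sketch of a check. It assumes the only trigger is the systemd.unified_cgroup_hierarchy=0 kernel argument mentioned above (other spellings of the value are not handled), and cgroup_karg_mode is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# Sketch: report whether a kernel command line forces cgroups v1.
# cgroup_karg_mode is a hypothetical helper name used for illustration.
cgroup_karg_mode() {
    # $1: a kernel command line string, e.g. the contents of /proc/cmdline.
    # Padding with spaces lets the pattern match only a whole argument.
    case " $1 " in
        *" systemd.unified_cgroup_hierarchy=0 "*) echo "v1-forced" ;;
        *) echo "default" ;;
    esac
}

# On a running Linux system, inspect the live kernel command line.
if [ -r /proc/cmdline ]; then
    cgroup_karg_mode "$(cat /proc/cmdline)"
fi

# Print the filesystem type mounted at /sys/fs/cgroup.
# "cgroup2fs" indicates the unified (v2) hierarchy is in use.
if [ -d /sys/fs/cgroup ]; then
    stat -fc %T /sys/fs/cgroup
fi
```

If the first line prints v1-forced, the node has the karg set and, per this thread, should be switched to v2 before it updates to F41.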

@LeoSum8
Author

LeoSum8 commented Nov 15, 2024

Hint: the docs still show how to stay on v1: https://docs.fedoraproject.org/en-US/fedora-coreos/kernel-args/#_example_staying_on_cgroups_v1

This should be removed, or at least a clear warning that it will break F41 should be added.

@travier
Member

travier commented Nov 15, 2024

We highlighted the cgroups v1 boot failure in the announcement: https://lists.fedoraproject.org/archives/list/[email protected]/thread/ALBBLBHA37S3YJCA4RDLO6UHNR4DY4MH/.

We recommend subscribing to the coreos-status mailing list as the volume should be really low and it includes announcements for those major changes.

@dustymabe
Member

dustymabe commented Nov 15, 2024

There was also a console login helper message warning that you should have seen on your systems for years:

There are a few ways to make sure your critical infrastructure stays unbroken on updates:

  1. Follow the coreos-status mailing list
  2. Run next and testing stream nodes to preview what is coming
  3. Participate in our Fedora major release Test days (announced on our devel mailing list)

Here is an example where an issue was found in next by a community member and it was fixed before F41 was promoted to stable.

@dustymabe changed the title on Nov 15, 2024
from: FCOS VM (Bhyve on TrueNAS) failing to boot after automatic update to Fedora 41 (“systemd[1]: Freezing execution”)
to: systemd refuses to boot with cgroups v1 enabled in v256+ (F41+)

@LeoSum8
Author

LeoSum8 commented Nov 15, 2024

Thanks for the info!
I am not complaining at all here and don't feel underinformed. Also, my system is not at all critical.
On the contrary, I am amazed at how quickly you guys have helped me here.
The hint towards the "staying on cgroups v1 example" was not meant as criticism.

@LeoSum8
Author

LeoSum8 commented Nov 15, 2024

I have now successfully switched to v2 with sudo rpm-ostree kargs --delete=systemd.unified_cgroup_hierarchy --reboot, and after re-enabling zincati, the system updated to F41 successfully.
Thank you guys!
And indeed I have been ignoring that warning for a few months. I only really read it for the first time just now.
So I do know who to blame ;)
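
For anyone doing this on more than one node, here is a minimal sketch that wraps the same command in a check, so it only runs where the v1 karg is actually set. The function names needs_v2_switch and switch_to_cgroups_v2 are hypothetical, and the sketch assumes that rpm-ostree kargs with no options prints the current kernel arguments on one line.

```shell
#!/bin/sh
# Sketch: remove the cgroups v1 karg only when it is present.
# Both function names are hypothetical helpers for illustration.

# Succeeds (exit status 0) if the given kernel argument string forces cgroups v1.
needs_v2_switch() {
    case " $1 " in
        *" systemd.unified_cgroup_hierarchy=0 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Not called automatically: running it reboots the node when the karg is found.
switch_to_cgroups_v2() {
    if ! command -v rpm-ostree >/dev/null 2>&1; then
        echo "rpm-ostree not found; nothing to do"
        return 0
    fi
    if needs_v2_switch "$(rpm-ostree kargs)"; then
        echo "cgroups v1 karg found; removing it and rebooting"
        # Same command as used in this comment.
        sudo rpm-ostree kargs --delete=systemd.unified_cgroup_hierarchy --reboot
    else
        echo "no cgroups v1 karg set"
    fi
}
```

Calling switch_to_cgroups_v2 on a node performs the switch. On nodes that never set the karg it only prints a message and changes nothing.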

@LeoSum8 LeoSum8 closed this as completed Nov 15, 2024