Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use DynamicUser=yes for all cockpit components #16811

Merged

Conversation

allisonkarlitskaya
Copy link
Member

@allisonkarlitskaya allisonkarlitskaya commented Jan 11, 2022

Switch over to systemd allocating user IDs for us dynamically, dropping our static users.

We create these users dynamically:

  • cockpit-session-socket is the group owner of /run/cockpit/session
  • cockpit-wsinstance-socket is the group owner of all sockets in /run/cockpit/wsinstance/
  • cockpit-tls gets its own user
  • cockpit-wsinstance-http{,s@}.service each get a user. In this case, I think the separate https instances even each get a separate user.

This is a bit complicated due to systemd/systemd#23067

@allisonkarlitskaya allisonkarlitskaya added the blocked Don't land until something else happens first (see task list) label Jan 11, 2022
@allisonkarlitskaya allisonkarlitskaya temporarily deployed to cockpit-dist January 11, 2022 08:06 Inactive
@martinpitt martinpitt removed their request for review April 7, 2022 09:02
@martinpitt
Copy link
Member

I filed systemd/systemd#23067 to start the discussion.

@martinpitt
Copy link
Member

martinpitt commented May 10, 2024

Rebased after the rebasing of #16808 and the recent heavy changes to sysuser management. Locally this at least works for a simple local session, let's see if there's any fallout beyond what's already in #16808.

cockpit-wsinstance-http{,s@}.service each get a user.

I didn't port this bit yet, let's do that step separately. We have much more freedom to change this once everything is a DynamicUser.

@martinpitt martinpitt added the no-test For doc/workflow changes, or experiments which don't need a full CI run, label May 10, 2024
@martinpitt
Copy link
Member

Cool -- the test failures are by and large the same as in #16808. Just the unit lifecycle test fails in addition, but that'll be straightforward to fix -- and indeed we should adjust this some more to check proper cleanup after stopping.

@cgwalters
Copy link
Contributor

xref containers/bootc#870 too here

@martinpitt

This comment was marked as outdated.

@martinpitt martinpitt force-pushed the dynamic-users branch 2 times, most recently from cd54812 to 2ad8273 Compare November 17, 2024 08:54
@martinpitt

This comment was marked as outdated.

@martinpitt martinpitt force-pushed the dynamic-users branch 2 times, most recently from 70f0d85 to 165d0b6 Compare November 21, 2024 13:35
Copy link
Member

@martinpitt martinpitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@allisonkarlitskaya This looks good in general (albeit annoyingly complicated, but that systemd request didn't go anywhere). There are two failures still which perfectly reproduce locally.

Now, do you want me to work on this (like #16808 reverse ownership/review 😁 ) or want to do this yourself?

@@ -0,0 +1,11 @@
[Unit]
Description=Dynamic user for /run/cockpit/session socket
BindsTo=cockpit.service
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conceptually it may make more sense to bind to cockpit-session.socket here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That actually explains this failure -- The test only wants to start cockpit-session.socket and run its own cockpit-ws process. Like this, starting the session socket also starts cockpit.socket, which then fails due to port 9090 being busy already.

@martinpitt martinpitt removed the blocked Don't land until something else happens first (see task list) label Nov 21, 2024
@martinpitt martinpitt removed the no-test For doc/workflow changes, or experiments which don't need a full CI run, label Nov 21, 2024
martinpitt and others added 5 commits November 21, 2024 20:13
postinst scripts can also be called in "abort-upgrade" and
"abort-remove" mode. Don't clean up the statoverride in these cases.

Fixes commit 98f3d6f.
Unfortunately, socket units can't set DynamicUser=yes [1], so add a
dependency on a separate .service created just for this purpose.

/run/cockpit/session is now owned by group cockpit-session-socket which
also gets added as a supplementary group to the cockpit-ws units.

Adjust the proxy tests accordingly.

[1] systemd/systemd#23067
Similar to the last commit, we create a dynamic group for the sockets in
/run/cockpit/wsinstance and add a supplementary group to cockpit-tls.
We only dynamically generate a single .socket file now.  The rest of
them are in version control and installed verbatim.
Clean it up on package upgrades. Unlike the old `cockpit-ws` user, this
user never owned any files on disk (other than cockpit-system), so this
is safe.

Co-Authored-By: Martin Pitt <[email protected]>
@martinpitt
Copy link
Member

Rebased, fixed the above systemd unit issue, adjusted the proxy tests, and added the cleanup of the obsolete static system user. That was pleasantly straightforward!

@martinpitt martinpitt marked this pull request as ready for review November 21, 2024 19:24
Copy link
Member

@martinpitt martinpitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this is quite a complex zoo of systemd units, it is rather straightforward and adequately easy to understand (at least for me). The benefits are definitively worth it, so let's do this!

@allisonkarlitskaya I dabbled with this (minor unit changes and test adjustments, and the package postinst cleanups), so please cross-check.

@allisonkarlitskaya allisonkarlitskaya added the release-blocker Targetted for next release label Nov 22, 2024
Copy link
Member Author

@allisonkarlitskaya allisonkarlitskaya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this again, I think this is quite fine.

I see some cleanup potential here (comments below), but since there's no longer any static state on the system, we can easily change those in the future. And arguably, they're unrelated changes.

@@ -4,8 +4,8 @@ Documentation=man:cockpit-ws(8)
Requires=cockpit.socket
Requires=cockpit-wsinstance-http.socket cockpit-wsinstance-https-factory.socket
# ensure our DynamicUser exists
Requires=cockpit-ws-user.service
After=cockpit-ws-user.service
Requires=cockpit-ws-user.service cockpit-wsinstance-socket-user.service
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading this, even as someone who understands how this pile of software works, is confusing.

We've used cockpit-ws for cockpit-tls and cockpit-wsinstance for cockpit-ws for a while now for reasons that are no longer relevant. I think we should take the time to change these usernames: cockpit-tls for cockpit-tls and cockpit-ws for cockpit-ws.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to be careful about "cockpit-ws", as we used to have that static user, and /etc/cockpit/ws-certs.d/* may still be owned by it on old systems. In general, +1 for cleaning up the user names, though!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For that reason I added postinst scripts which remove the cockpit-wsinstance user on upgrade, but not for cockpit-ws. I don't want to get into the business of randomly chowning files in /etc. You can just too easily screw up shared certificates, backups, break Ansible scripts, etc.

@@ -17,6 +17,7 @@ ExecStartPre=+@libexecdir@/cockpit-certificate-ensure --for-cockpit-tls
ExecStart=@libexecdir@/cockpit-tls
User=cockpit-ws
Group=cockpit-ws
SupplementaryGroups=cockpit-wsinstance-socket
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And ditto. We should have cockpit-ws-socket and cockpit-session-socket.

@@ -332,7 +332,7 @@ class TestConnection(testlib.MachineCase):
# number of https instances is bounded (DoS prevention)
# with MaxTasks=200 und 2 threads per ws instance we should have a
# rough limit of 100 instances, so at some point curl should start failing
m.execute("runuser -u cockpit-ws -- sh -ec 'RC=1; for i in `seq 120`; do "
m.execute("runuser -u cockpit-wsinstance-socket -- sh -ec 'RC=1; for i in `seq 120`; do "
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It occurs to me that if we ran this as cockpit-ws (ie: the account running cockpit-tls) it wouldn't work because the supplementary group isn't on that account, but rather comes in via an additional option in the unit. Can we move the supplementary group to the unit that dynamically creates the user? That might make things a bit easier to understand, and would allow us to drop this patch fragment...

@allisonkarlitskaya allisonkarlitskaya merged commit 2ff19fa into cockpit-project:main Nov 22, 2024
87 checks passed
@allisonkarlitskaya allisonkarlitskaya deleted the dynamic-users branch November 22, 2024 08:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-blocker Targetted for next release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants