Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bib: switch to use bootc install to-filesystem (HMS-3453) #304

Closed
wants to merge 8 commits into from

Conversation

mvo5
Copy link
Collaborator

@mvo5 mvo5 commented Mar 22, 2024

This PR switches bib to the version of the images library from PR osbuild/images#462 that defaults to use bootc install to-filesystem. Because some things will work differently in this model this PR touches a lot of things:

  1. It removes customizations and the -config option as we currently have only very limited support for them in the new world (for the iso the customizations are still supported via --iso-config)
  2. It updates the README to have a new default example so that users can still play around and login into a generated image (should probably add an example for --iso-config there now though)
  3. switches the testing to use the bootc -root-authorized-keys support instead of the previous passwd based testing

This is ready now but I expect test failures until containers/bootc#407 is available in our tested containers.

@cgwalters
Copy link
Contributor

cgwalters commented Mar 22, 2024

It removes customizations and the -config option as we currently have only very limited support for them in the new world

Hmmm...I know I keep saying this but it seems good to try to keep at least some of this. I haven't dug into the details of the relevant pipelines but as far as I am aware it should be logically equivalent to basically do:

current pipeline

  • create filesystems
  • run ostree stages
  • run blueprint/kickstart type stuff in a container targeting new deployment

this pipeline

  • create filesystems
  • run bootc install to-filesystem
  • run same blueprint/kickstart type stuff

Is the problem here mainly that we've lost an ergonomic way to do "find default deployment root" (subject of some debate in osbuild/osbuild). That seems pretty easy to fix if so...something like:

status=$(mktemp)
bootc install to-filesystem --output-status-json=$status ...

And our serialized status would have the relative path to deployment.

Or hmmm...maybe actually bootc install itself could help spawn a container in the target root itself? That probably makes a lot of sense actually, as it would help enforce things like only writing to /etc and /var or so?

@cgwalters
Copy link
Contributor

cgwalters commented Mar 22, 2024

Or maybe a bootc install mount? i.e.:

  • <create filesystems using whatever>
  • bootc install to-filesystem /path/to/fs
  • tmproot=$(mktemp -d)
  • bootc install mount-initial-root /path/to/fs ${tmproot}

Then one could just directly write to ${tmproot}/etc and ${tmproot}/var for example?

(Side note to self...we would want to detect the etc-transient flag when dealing with etc in this flow)

@mvo5 mvo5 changed the title bib: switch to use bootc install to-filesystem bib: switch to use bootc install to-filesystem (HMS-3453) Mar 25, 2024
@mvo5
Copy link
Collaborator Author

mvo5 commented Mar 26, 2024

It removes customizations and the -config option as we currently have only very limited support for them in the new world
[..]
* run bootc install to-filesystem

* run same blueprint/kickstart type stuff

Is the problem here mainly that we've lost an ergonomic way to do "find default deployment root" (subject of some debate in osbuild/osbuild). That seems pretty easy to fix if so...something like:
[..]

Thank you for your thoughts on this! The issue so far was that I had the misconception that we should not write to the deployment root directly. We have ways in osbuild to find the default deployment since osbuild/osbuild#1553 so that should be straightforward. So it seems we could re-add customizations via:

  1. write our customization to //ostree/deploy/default/deploy/.0/etc/*
  2. ensure correct selinux labeling for all the files we touch

If that sounds correct we could look into this in a followup. Unfortunately the main thing we need to re-add would be users so we need to carefully explore this - I would love to look into json user records for this, but it needs at least a configuration change to sshd_config [0]. There is the extra complication that it does not solve the issue of clashing UIDs, i.e. if there is a user created by bib with uid=1000 and the bootc-image (for whatever reason) has files belonging to UID=1000 (for a different username). But that is something we can/could detect and just error (not on day-2 though, that would need support from bootc).

Alternatively we could just keep using the org.osbuild.ostree.passwd stage [1] (which is what we do today).

Thoughts very welcome - but that all sounds like followup material?

[0] https://www.freedesktop.org/software/systemd/man/latest/userdbctl.html#Integration%20with%20SSH
[1] https://github.com/osbuild/osbuild/blob/main/stages/org.osbuild.ostree.passwd#L29

P.S. There is of course the issue with passwordless ssh in systemd-home systemd/systemd#17300 that may be a problem here too

@mvo5 mvo5 force-pushed the switch-to-bootc-install-to-fs branch from 84fe5a4 to 13b7143 Compare March 26, 2024 17:57
@mvo5 mvo5 marked this pull request as ready for review March 26, 2024 17:59
@cgwalters
Copy link
Contributor

Yeah, it is possible to create conflicts by configuring users statically both in the disk image and container image today. (btw I did https://containers.github.io/bootc/building/users-and-groups.html which...doesn't explicitly touch on this but should)

I think we have a design question here: When creating users in bib, should it be static (i.e. we create /home/someuser literally in the disk image, or should it actually be done more systemd-sysusers or JSON user record style.

I lean more towards the latter it avoids the physical /etc/passwd conflicts - by having both operate dynamically on boot.

This relates to whether SSH keys are maintained via tmpfiles.d in /etc or injected statically.

This all said, using sysusers for interactive users and JSON user records (or altfiles, or something else) is treading a bit of new ground and it may be simplest to follow the "static" path.

@achilleas-k achilleas-k self-requested a review March 26, 2024 18:17
@mvo5 mvo5 force-pushed the switch-to-bootc-install-to-fs branch 2 times, most recently from 5f1e24b to 1648936 Compare March 26, 2024 20:36
@cgwalters
Copy link
Contributor

I would love to look into json user records for this, but it needs at least a configuration change to sshd_config [0].

It's more than that, systemd-homed isn't enabled in c9s note. (It's...a giant thing with a lot of system implications, so)

@achilleas-k
Copy link
Member

So for this PR, do we hold off until we figure out the customization story, or do we merge it and work on post-deploy customizations separately (in the meantime telling users to do everything in the base container)?

@achilleas-k
Copy link
Member

I'd like it if we could merge it now so we get iterating on any issues that might come up with this change.

mvo5 added 7 commits April 2, 2024 16:35
With the latest version of images we now use the
`bootc install to-filesystem` feature instead of our own ostree
deploy pipeline. This means that most of our customizations need
to be disabled and we need to decide if we will port them in some
form or update our README to encourage the use of derrived containters
for them.
On casual reading `nexp` reads almost like a typo of `next` and
it's generally a bit short. Let's make it very explicit instead :)
With the move to bootc install to-filesystem we cannot currently
customize users except for roots ssh authorized_keys. So make
this the only allowed option. Note that it would be nicer if that
would be symetric to bootc --root-ssh-authorized-keys that takes
a file but our image customizations currently only support a single
key. So start here by marking it experimental to unblock tests and
fix images to eventually be symetric.
This commit removes the -config options from bib. Blueprint customizations
are no longer supported with the move to bootc install to-filesystem.
This is fine but we need to reflect it in our docs.

It also updates the example with an example how to create an image
with an user. This might be slightly controversial but I think it's
rather important to give users a quick path to success (and the
updated README also links to the upstream docs for a more indepth
discussion).
Rename the --config option to --iso-config and error if it is
used for anything other than the ISO image.
@mvo5
Copy link
Collaborator Author

mvo5 commented Apr 2, 2024

Hm, tests fail in GH with:

...
 Installed: "centos/grub.cfg"
Trimming boot
fstrim: /run/osbuild/mounts/boot: FITRIM ioctl failed: Function not implemented
ERROR Installing to filesystem: Task Trimming boot failed: ExitStatus(unix_wait_status(256))
Traceback (most recent call last):
  File "/run/osbuild/bin/org.osbuild.bootc.install-to-filesystem", line 50, in <module>
    r = main(args["options"], args["inputs"], args["paths"])
  File "/run/osbuild/bin/org.osbuild.bootc.install-to-filesystem", line 45, in main
    subprocess.run(pargs, env=env, check=True)
  File "/usr/lib64/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['bootc', 'install', 'to-filesystem', '--source-imgref', 'containers-storage:[overlay@/run/osbuild/containers/storage+/run/containers/storage]65d4b6a5081e0e82e014ea4083e7002cfbc18ee014084063e434f5bc360af85e', '--skip-fetch-check', '--generic-image', '--root-ssh-authorized-keys', '/tmp/bootc-ssh-auth-keys-4aj7p2kw', '--karg', 'rw', '--karg', 'console=tty0', '--karg', 'console=ttyS0', '/run/osbuild/mounts']' returned non-zero exit status 1.
mount/part1 (org.osbuild.fat): umount: /store/tmp/buildroot-tmp-mclvonqd/mounts/boot/efi unmounted
mount/part2 (org.osbuild.ext4): umount: /store/tmp/buildroot-tmp-mclvonqd/mounts/boot unmounted
mount/part3 (org.osbuild.ext4): umount: /store/tmp/buildroot-tmp-mclvonqd/mounts/ unmounted

which I cannot reproduce locally so probably related to the env in GH that this runs in :/

@mvo5 mvo5 force-pushed the switch-to-bootc-install-to-fs branch 2 times, most recently from 69c563a to 0b616b7 Compare April 2, 2024 14:48
@achilleas-k
Copy link
Member

Seems containers/bootc@b8f2b6d#diff-fe45e8cd8b566039edc4c6a4e80f355754d32078b45e465a407e22e27c39741aR649 should fix that. What version are we using right now in tests?

With the switch to bootc we need to adjust the testing. We inject
a root ssh key now and just use that for login.
@mvo5 mvo5 force-pushed the switch-to-bootc-install-to-fs branch from 0b616b7 to 1ebdf76 Compare April 2, 2024 16:39
@mvo5
Copy link
Collaborator Author

mvo5 commented Apr 2, 2024

Seems containers/bootc@b8f2b6d#diff-fe45e8cd8b566039edc4c6a4e80f355754d32078b45e465a407e22e27c39741aR649 should fix that. What version are we using right now in tests?

Thanks! Nice one I'm not sure what we have, maybe we should print this as part of the metadata for the tests what version of bootc is used. I will look into this, will make understanding issues like this a lot easier :)

[edit: looks like this happens with 2024-04-02T17:09:50.1335407Z _ ERROR at setup of test_image_boots[quay.io/centos-bootc/centos-bootc:stream9,raw,arm64] so maybe just another case of a missing qemu-user syscall? I will investigate]

@achilleas-k
Copy link
Member

what version of bootc is used

Since we're pulling containers, the container ID should be enough, but that might be adding extra steps to the debugging process so maybe you're right. Perhaps running the container and doing a rpm -qa would be a good bit of metadata to have.

@mvo5
Copy link
Collaborator Author

mvo5 commented Apr 2, 2024

Hm, this will not work as is I think - centos-bootc is at booc 0.1.7 afaict but we need containers/bootc#407 from 0.1.9 to make the ssh login work. We could explore alternatives of course to the root ssh keys …

@@ -233,12 +235,13 @@ func manifestFromCobra(cmd *cobra.Command, args []string) ([]byte, error) {
}

imgref := args[0]
configFile, _ := cmd.Flags().GetString("config")
isoConfigFile, _ := cmd.Flags().GetString("iso-config")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the config actually didn't apply to the ISO at all? Or am I missing something?

@@ -262,79 +261,6 @@ The following volumes can be mounted inside the container:
| `/store` | Used for the [osbuild store](https://www.osbuild.org/) | No |
| `/rpmmd` | Used for the DNF cache | No |

## 📝 Build config

A build config is a JSON file with customizations for the resulting image. A path to the file is passed via the `--config` argument. The customizations are specified under a `blueprint.customizations` object.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be helpful to have a realtime sync on this? I sent an invite.

@@ -144,7 +144,6 @@ Flags:
| Argument | Description | Default Value |
|------------------|------------------------------------------------------------------|:-------------:|
| **--chown** | chown the ouput directory to match the specified UID:GID | ❌ |
| **--config** | Path to a [build config](#-build-config) | ❌ |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing is this is used elsewhere so I think we may be partially committed short term.

(There is some things related to this, like the fact I don't quite understand why it's JSON and not TOML as seems would be expected for blueprints, but that's an aside)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand why it's JSON and not TOML

No real (good) reason other than the initial version BIB was heavily based on https://github.com/osbuild/images/blob/main/cmd/build/main.go, which is used for test image builds and it was simpler to use the json formatting for config files.

Since we're moving towards making it more explicit that this config is an instance of an Image Builder blueprint, I agree that we should make it a toml to make the connection clearer.

manifestCmd.Flags().String("rpmmd", "/rpmmd", "rpm metadata cache directory")
manifestCmd.Flags().String("target-arch", "", "build for the given target architecture (experimental)")
manifestCmd.Flags().StringArray("type", []string{"qcow2"}, fmt.Sprintf("image types to build [%s]", allImageTypesString()))
manifestCmd.Flags().Bool("local", false, "use a local container rather than a container from a registry")
// XXX: hide from help?
manifestCmd.Flags().String("experimental-root-ssh-authorized-key", "", "authorized ssh key for root as string")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not opposed to this, but the more I've been thinking about this (you probably saw) the more I feel we need to build up the story of using blueprints/kickstarts in container builds and align that with disk image generation and not emphasize one-offs like this.

And yes I know I added it to bootc install, but the use case is different there in a way because the bootc install is really low level and I was thinking of it more like a generic low level escape hatch.

But here we can be more opinionated I think.

@cgwalters
Copy link
Contributor

Hm, this will not work as is I think - centos-bootc is at booc 0.1.7 afaict but we need containers/bootc#407 from 0.1.9 to make the ssh login work. We could explore alternatives of course to the root ssh keys …

I am working to unstick this...as you might imagine, Random Process Stuff is holding it up...

@mvo5
Copy link
Collaborator Author

mvo5 commented Apr 3, 2024

Seems containers/bootc@b8f2b6d#diff-fe45e8cd8b566039edc4c6a4e80f355754d32078b45e465a407e22e27c39741aR649 should fix that. What version are we using right now in tests?

Thanks! Nice one I'm not sure what we have, maybe we should print this as part of the metadata for the tests what version of bootc is used. I will look into this, will make understanding issues like this a lot easier :)

[edit: looks like this happens with 2024-04-02T17:09:50.1335407Z _ ERROR at setup of test_image_boots[quay.io/centos-bootc/centos-bootc:stream9,raw,arm64] so maybe just another case of a missing qemu-user syscall? I will investigate]

This is indeed another case of a missing syscall in qemu-user, I attached the fix here and will test/sent it to upstream
0001-linux-user-Add-FITRIM-ioctl.patch.txt

[submitted upstream as https://www.mail-archive.com/[email protected]/msg1034508.html]

@@ -265,8 +261,6 @@ func manifestForISO(c *ManifestConfig, rng *rand.Rand) (*manifest.Manifest, erro
},
}

img.ISOLabelTmpl = "Container-Installer-%s"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want something like img.ISOLabel = fmt.Sprintf("Container-Installer-%s", c.Architecture.String()) here instead. I'm not deeply familiar with ISOs, but I think we should keep the label.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I fixed this in #342

@mvo5
Copy link
Collaborator Author

mvo5 commented Apr 5, 2024

Fwiw, this is being reworked right now so that the --config option stays and that user customization works as before (by mounting the deployment root and running the org.osbuild.users stage in this environment), see osbuild/images@main...mvo5:images:bootc-disk-customize-again for the in progress PR that will be split out into smaller ones like osbuild/images#563.

@cgwalters
Copy link
Contributor

This is indeed another case of a missing syscall in qemu-user, I attached the fix here and will test/sent it to upstream
0001-linux-user-Add-FITRIM-ioctl.patch.txt

That makes sense and looks generally useful, thoguh containers/bootc#462 is arguably what we should be doing here, right?

@cgwalters
Copy link
Contributor

Fwiw, this is being reworked right now so that the --config option stays

(We covered some of this in relatime chat, but I want to record here too)
I want to apologize as I am sure it seems like I have been giving conflicting messaging here. It's...complicated. I really do want to make it easy and nice to bind user state into the container image. At the same time though, a ton of people are going to enter through Anaconda and set up users and ssh keys there...and it all works by doing the same thing that the bib config is doing in writing to /etc and /var in the deployment root.

Basically I think there are potential use cases for multiple paths:

  • Baking into container image
  • Baking into disk image
  • Fetching dynamically (cloud-init, FreeIPA, etc.)

In particular an advantage of the disk image flow is that in some cases, it's easier to "secure" the disk image than it is the container image, and having "bootstrap secrets" only in the disk images makes a lot of sense.

@mvo5
Copy link
Collaborator Author

mvo5 commented Apr 10, 2024

Given that we agreed we cannot drop customizations I close this one, the latest approach (that keeps customizations) is in #342

@mvo5 mvo5 closed this Apr 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants