
containers: Remove the need for yum update in Dockerfile #1043

Merged
merged 4 commits into ceph:master from containers-rm-yum-update on Dec 16, 2024

Conversation

@anoopcs9 (Collaborator) commented Nov 7, 2024

In general it is not desirable to blindly update packages as a whole while building another from a base container image.

Historically this step was required due to the introduction of version-specific installation (#331) of packages, i.e., we extract the package version that comes with the base container image and try to install the matching development libraries, which might be unavailable close to a new upstream release. To bridge this gap we came up with the idea of yum update (#510): fetch whatever is latest, then extract the version and install the matching development libraries.

This seemed to work until we discovered a different issue, where updated versions of particular dependencies pushed to the standard repositories caused problems (#1038) during yum update.

Ceph repositories (and packages) are now more robust, and DNF is capable of handling such situations, figuring out the new/updated versions of packages even when no match is found for the already installed package versions. Ideally it can never be the case that matching packages for a given version are missing from a particular repository directory (only the links to the directories are supposed to change).

Thus it is in our best interest to avoid running yum update.
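
To make the change concrete, here is a minimal sketch of the before/after Dockerfile logic described above (illustrative only; the actual go-ceph Dockerfile differs, and the version-extraction pattern and exact package list are assumptions based on this description):

# Before: blanket update, then pin devel packages to the installed version
RUN yum update -y && \
    CEPH_VER="$(rpm -q --queryformat '%{VERSION}-%{RELEASE}' libcephfs2)" && \
    yum install -y "libcephfs-devel-${CEPH_VER}" "librados-devel-${CEPH_VER}"

# After: no blanket update; DNF resolves the devel packages itself, upgrading
# the inter-dependent ceph packages only as the configured repositories require
RUN dnf install -y libcephfs-devel librados-devel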

@anoopcs9 marked this pull request as ready for review on November 7, 2024
@anoopcs9 (Collaborator, Author) commented Nov 7, 2024

Verifying the hypothetical situation where matching packages are absent from the configured repositories.

# rpm -q libcephfs2
libcephfs2-18.2.2-0.el9.x86_64

# grep baseurl /etc/yum.repos.d/ceph.repo 
baseurl=http://download.ceph.com/rpm-reef/el9/$basearch
baseurl=http://download.ceph.com/rpm-reef/el9/noarch
baseurl=http://download.ceph.com/rpm-reef/el9/SRPMS

Let's say 18.2.4 is released with updated package repositories but container images are not yet built with the updated packages:

# dnf list libcephfs2 --showduplicates
Failed to set locale, defaulting to C
Last metadata expiration check: 0:05:18 ago on Thu Nov  7 07:14:22 2024.
Installed Packages
libcephfs2.x86_64                                                                            2:18.2.2-0.el9                                                                             @Ceph  
Available Packages
libcephfs2.x86_64                                                                            1:17.2.5-1.el9s                                                                            ganesha
libcephfs2.x86_64                                                                            2:18.2.4-0.el9                                                                             Ceph   

# dnf list libcephfs-devel --showduplicates
Failed to set locale, defaulting to C
Last metadata expiration check: 0:04:34 ago on Thu Nov  7 07:14:22 2024.
Available Packages
libcephfs-devel.x86_64                                                                         1:17.2.5-1.el9s                                                                          ganesha
libcephfs-devel.x86_64                                                                         2:18.2.4-0.el9                                                                           Ceph   

Now we attempt to install development packages:

# dnf install libcephfs-devel
Failed to set locale, defaulting to C
Last metadata expiration check: 0:00:06 ago on Thu Nov  7 07:14:22 2024.
Dependencies resolved.
===============================================================================================================================================================================================
 Package                                                     Architecture                         Version                                      Repository                                 Size
===============================================================================================================================================================================================
Installing:
 libcephfs-devel                                             x86_64                               2:18.2.4-0.el9                               Ceph                                       31 k
Upgrading:
 ceph-base                                                   x86_64                               2:18.2.4-0.el9                               Ceph                                      5.1 M
 ceph-common                                                 x86_64                               2:18.2.4-0.el9                               Ceph                                       18 M
 ceph-exporter                                               x86_64                               2:18.2.4-0.el9                               Ceph                                      362 k
 ceph-grafana-dashboards                                     noarch                               2:18.2.4-0.el9                               Ceph-noarch                                24 k
 ceph-immutable-object-cache                                 x86_64                               2:18.2.4-0.el9                               Ceph                                      142 k
 ceph-mds                                                    x86_64                               2:18.2.4-0.el9                               Ceph                                      2.1 M
 ceph-mgr                                                    x86_64                               2:18.2.4-0.el9                               Ceph                                      1.5 M
 ceph-mgr-cephadm                                            noarch                               2:18.2.4-0.el9                               Ceph-noarch                               138 k
 ceph-mgr-dashboard                                          noarch                               2:18.2.4-0.el9                               Ceph-noarch                               3.5 M
 ceph-mgr-diskprediction-local                               noarch                               2:18.2.4-0.el9                               Ceph-noarch                               7.4 M
 ceph-mgr-k8sevents                                          noarch                               2:18.2.4-0.el9                               Ceph-noarch                                22 k
 ceph-mgr-modules-core                                       noarch                               2:18.2.4-0.el9                               Ceph-noarch                               246 k
 ceph-mgr-rook                                               noarch                               2:18.2.4-0.el9                               Ceph-noarch                                49 k
 ceph-mon                                                    x86_64                               2:18.2.4-0.el9                               Ceph                                      4.7 M
 ceph-osd                                                    x86_64                               2:18.2.4-0.el9                               Ceph                                       17 M
 ceph-prometheus-alerts                                      noarch                               2:18.2.4-0.el9                               Ceph-noarch                                15 k
 ceph-radosgw                                                x86_64                               2:18.2.4-0.el9                               Ceph                                      7.7 M
 ceph-selinux                                                x86_64                               2:18.2.4-0.el9                               Ceph                                       25 k
 ceph-volume                                                 noarch                               2:18.2.4-0.el9                               Ceph-noarch                               264 k
 cephadm                                                     noarch                               2:18.2.4-0.el9                               Ceph-noarch                               224 k
 cephfs-mirror                                               x86_64                               2:18.2.4-0.el9                               Ceph                                      221 k
 libcephfs2                                                  x86_64                               2:18.2.4-0.el9                               Ceph                                      709 k
 libcephsqlite                                               x86_64                               2:18.2.4-0.el9                               Ceph                                      166 k
 librados2                                                   x86_64                               2:18.2.4-0.el9                               Ceph                                      3.3 M
 libradosstriper1                                            x86_64                               2:18.2.4-0.el9                               Ceph                                      474 k
 librbd1                                                     x86_64                               2:18.2.4-0.el9                               Ceph                                      3.0 M
 librgw2                                                     x86_64                               2:18.2.4-0.el9                               Ceph                                      4.5 M
 python3-ceph-argparse                                       x86_64                               2:18.2.4-0.el9                               Ceph                                       45 k
 python3-ceph-common                                         x86_64                               2:18.2.4-0.el9                               Ceph                                      129 k
 python3-cephfs                                              x86_64                               2:18.2.4-0.el9                               Ceph                                      162 k
 python3-rados                                               x86_64                               2:18.2.4-0.el9                               Ceph                                      321 k
 python3-rbd                                                 x86_64                               2:18.2.4-0.el9                               Ceph                                      297 k
 python3-rgw                                                 x86_64                               2:18.2.4-0.el9                               Ceph                                       99 k
 rbd-mirror                                                  x86_64                               2:18.2.4-0.el9                               Ceph                                      3.0 M
 rbd-nbd                                                     x86_64                               2:18.2.4-0.el9                               Ceph                                      172 k
Installing dependencies:
 librados-devel                                              x86_64                               2:18.2.4-0.el9                               Ceph                                      127 k

Transaction Summary
===============================================================================================================================================================================================
Install   2 Packages
Upgrade  35 Packages

Total download size: 85 M
Is this ok [y/N]: n

It works.

@anoopcs9 added the extended-review and no-API labels on Nov 7, 2024
@phlogistonjohn (Collaborator)

Thanks for the detailed explanation. However, I think it's based on a hypothetical practice that the ceph project doesn't follow. Let me try to explain:

the dnf list command with --showduplicates works because one or more repositories that dnf "knows about" have packages with the same name but different versions. For example, repo ceph-foo contains:
libcephfs-devel-18.2.4
libcephfs-devel-18.2.2

That would be fine if ceph rpm repos were "additive". However, that's not how ceph organizes the repos. Let's take reef, for example. On download.ceph.com there are dirs named for ceph releases, e.g. "rpm-18.2.4", "rpm-18.2.2", and so on. If you look in each of these you will see that they only contain packages with a matching version number: rpm-18.2.4 doesn't contain both 18.2.4 and 18.2.2 versions. Then there's a link to the latest release, "rpm-reef" (it's not obvious from the webserver that it's a symlink, but I've been told that's what it is).
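
To illustrate, the layout described above looks roughly like this (a sketch, not an exhaustive listing):

download.ceph.com/
├── rpm-18.2.2/              <- only 18.2.2 packages
├── rpm-18.2.4/              <- only 18.2.4 packages
└── rpm-reef -> rpm-18.2.4   <- "latest reef" link, retargeted at each point release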

If you look at a release container image you'll see that dnf config points at rpm-reef:

[root@1fe2d7b1656b /]# cat /etc/yum.repos.d/ceph.repo 
[Ceph]
name=Ceph packages for $basearch
baseurl=http://download.ceph.com/rpm-reef/el9/$basearch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc

[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://download.ceph.com/rpm-reef/el9/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc

[ceph-source]
name=Ceph source packages
baseurl=http://download.ceph.com/rpm-reef/el9/SRPMS
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc

So, for example, when the image is built with 18.2.2 but in the meantime rpm-reef becomes rpm-18.2.4, dnf won't even see any 18.2.2 rpms in the repo. I don't think dnf can handle the case the way you described it... but please correct me if I'm wrong. We could consider asking the ceph team to build release container images with dnf configured with rpm-X.Y.Z instead of rpm-<codename>; I know Dan is in the process of changing things w.r.t. the container build and may be open to suggestions.
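
For illustration, that pinning would amount to a repo definition like the following (hypothetical; it mirrors the ceph.repo quoted above with the codename link replaced by a fixed release directory):

[Ceph]
name=Ceph packages for $basearch
baseurl=http://download.ceph.com/rpm-18.2.4/el9/$basearch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc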

Let me know if you think my concerns are valid or you have any other thoughts.

@phlogistonjohn (Collaborator)

link: https://download.ceph.com/

@anoopcs9 (Collaborator, Author) commented Nov 8, 2024

the dnf list command with --showduplicates works because one or more repositories that dnf "knows about" have packages with the same name but different versions. For example, repo ceph-foo contains: libcephfs-devel-18.2.4 libcephfs-devel-18.2.2

The situation I created above only had libcephfs-devel-18.2.4 in the repositories (barring the older version from the ganesha repo). libcephfs-devel is initially not present in the base container image. --showduplicates is just an extension to the list subcommand.

That would be fine if ceph rpm repos were "additive". However, that's not how ceph organizes the repos. Let's take reef, for example. On download.ceph.com there are dirs named for ceph releases, e.g. "rpm-18.2.4", "rpm-18.2.2", and so on. If you look in each of these you will see that they only contain packages with a matching version number: rpm-18.2.4 doesn't contain both 18.2.4 and 18.2.2 versions. Then there's a link to the latest release, "rpm-reef" (it's not obvious from the webserver that it's a symlink, but I've been told that's what it is).

If I'm not wrong, that's how RPM repositories are structured in general.

If you look at a release container image you'll see that dnf config points at rpm-reef:

[root@1fe2d7b1656b /]# cat /etc/yum.repos.d/ceph.repo 
[Ceph]
name=Ceph packages for $basearch
baseurl=http://download.ceph.com/rpm-reef/el9/$basearch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc

[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://download.ceph.com/rpm-reef/el9/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc

[ceph-source]
name=Ceph source packages
baseurl=http://download.ceph.com/rpm-reef/el9/SRPMS
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc

Yes, my artificial situation above explicitly displays these baseurl values.

So, for example, when the image is built with 18.2.2 but in the meantime rpm-reef becomes rpm-18.2.4, dnf won't even see any 18.2.2 rpms in the repo. I don't think dnf can handle the case the way you described it... but please correct me if I'm wrong.

DNF is intelligent enough to find an updated version of a package as soon as it refreshes the repositories, invalidating outdated repodata. Thus, in the situation I created above, it finds libcephfs-devel-18.2.4 from the updated rpm-reef (now pointing at rpm-18.2.4) and subsequently decides to update all the other inter-dependent packages available from the repository currently configured as rpm-reef.
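
For instance, rather than waiting for the cached repodata to expire, the refresh can be forced with standard dnf options (shown purely as an illustration):

# dnf clean metadata
# dnf --refresh list libcephfs-devel --showduplicates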

We could consider asking the ceph team to build release container images with dnf configured with rpm-X.Y.Z instead of rpm-<codename>; I know Dan is in the process of changing things w.r.t. the container build and may be open to suggestions.

That would be ideal but definitely not a blocker for us in this situation.

Let me know if you think my concerns are valid or you have any other thoughts.

I hope I have filled in the gaps from my previous explanation. Accordingly, I think the changes (or, in other words, DNF 😉) can handle the anticipated problem at release time.

@phlogistonjohn (Collaborator)

OK, I'm not entirely convinced but I'm convinced enough to run the experiment. :-)

@phlogistonjohn (Collaborator)

@Mergifyio rebase

mergify bot commented Nov 12, 2024

rebase

✅ Branch has been successfully rebased

@phlogistonjohn force-pushed the containers-rm-yum-update branch from 65f00b8 to 664d1b0 on November 12, 2024
@nixpanic (Member)

In general it is not a best practice to blindly update packages as a whole while building another from a base container image.

Can you point out where this best practice recommendation comes from?

@anoopcs9 (Collaborator, Author)

In general it is not a best practice to blindly update packages as a whole while building another from a base container image.

Can you point out where this best practice recommendation comes from?

Maybe I went a step ahead and bent the wording from "desirable" to "best practice" based on an offline discussion involving people associated with building containers (ceph in this context). But I tend to agree that it becomes unpredictable with a moving target like CentOS Stream, where updates to any one of the dependencies of the already built or installed packages can affect the creation of new container images. Even if we don't generalize, I would still suggest removing the blind update step, provided we can achieve the goal (of overcoming the anticipated problem at release time) in its absence.

What do you think?

@nixpanic (Member)

Maybe I went a step ahead and bent the wording from "desirable" to "best practice" based on an offline discussion involving people associated with building containers (ceph in this context). But I tend to agree that it becomes unpredictable with a moving target like CentOS Stream, where updates to any one of the dependencies of the already built or installed packages can affect the creation of new container images. Even if we don't generalize, I would still suggest removing the blind update step, provided we can achieve the goal (of overcoming the anticipated problem at release time) in its absence.

I don't really have a strong opinion about it. However, not all container images get rebuilt regularly, and they are therefore missing security/bugfix updates. For Ceph-CSI we receive occasional requests to rebuild images with updates installed. I understand the position of image producers wanting fully tested images, but I also understand the users, as they want fewer known bugs/CVEs in the images 🤷‍♂️

@anoopcs9 (Collaborator, Author)

Maybe I went a step ahead and bent the wording from "desirable" to "best practice" based on an offline discussion involving people associated with building containers (ceph in this context). But I tend to agree that it becomes unpredictable with a moving target like CentOS Stream, where updates to any one of the dependencies of the already built or installed packages can affect the creation of new container images. Even if we don't generalize, I would still suggest removing the blind update step, provided we can achieve the goal (of overcoming the anticipated problem at release time) in its absence.

I don't really have a strong opinion about it. However, not all container images get rebuilt regularly, and they are therefore missing security/bugfix updates. For Ceph-CSI we receive occasional requests to rebuild images with updates installed. I understand the position of image producers wanting fully tested images, but I also understand the users, as they want fewer known bugs/CVEs in the images 🤷‍♂️

In this case, we don't have any official container images meant for distribution with each release. This is just a matter of creating short-lived CI containers, where we can be less bothered about security or bugfix updates to packages.

@anoopcs9 (Collaborator, Author)

@Mergifyio refresh

mergify bot commented Nov 26, 2024

refresh

✅ Pull request refreshed

@anoopcs9 (Collaborator, Author)

@Mergifyio rebase

mergify bot commented Nov 26, 2024

rebase

✅ Nothing to do for rebase action

@anoopcs9 (Collaborator, Author)

@Mergifyio rebase

mergify bot commented Nov 26, 2024

rebase

✅ Branch has been successfully rebased

@anoopcs9 force-pushed the containers-rm-yum-update branch 2 times, most recently from 763f661 to 36ab7c5 on November 27, 2024
@anoopcs9 (Collaborator, Author)

Another demonstration of how dnf overcomes the anticipated problem, this time with the recently released v17.2.8. Please note that v17.2.8 doesn't have an el8 container image, whereas the previous v17.2.7 didn't have an el9 variant. Therefore we fabricate a v17.2.7 el9 container image by downgrading the packages within the v17.2.8 image.

$ podman pull quay.io/ceph/ceph:v17
. . .
$ podman run --rm -it --entrypoint /bin/bash quay.io/ceph/ceph:v17

# echo $CEPH_VERSION 
quincy

# rpm -q libcephfs2
libcephfs2-17.2.8-0.el9.x86_64

# sed -i 's/rpm-quincy/rpm-17.2.7/g' /etc/yum.repos.d/ceph.repo

# dnf downgrade ceph-common
. . .

# rpm -q libcephfs2
libcephfs2-17.2.7-0.el9.x86_64

# rpm -q libcephfs-devel
package libcephfs-devel is not installed

In effect we have a v17.2.7 el9 container. Now:

# sed -i 's/rpm-17.2.7/rpm-quincy/g' /etc/yum.repos.d/ceph.repo

# dnf list libcephfs2
Failed to set locale, defaulting to C
Last metadata expiration check: 0:00:21 ago on Tue Nov 26 11:12:49 2024.
Installed Packages
libcephfs2.x86_64                                                                             2:17.2.7-0.el9                                                                              @Ceph
Available Packages
libcephfs2.x86_64                                                                             2:17.2.8-0.el9                                                                              Ceph 

# dnf list libcephfs-devel   
Failed to set locale, defaulting to C
Last metadata expiration check: 0:03:04 ago on Tue Nov 26 11:12:49 2024.
Available Packages
libcephfs-devel.x86_64                                                                           2:17.2.8-0.el9                                                                            Ceph

# dnf install libcephfs-devel

This ends up updating every ceph-related package to the latest v17.2.8 because of their inter-dependencies.

@nixpanic (Member)

I am not sure what you are trying to do, or why...

If you want to use Ceph v17.2.7, would you not use the quay.io/ceph/ceph:v17.2.7 container image, and not install any updates? It will stay on CentOS 8 that way too.

As CentOS 8 is not maintained anymore (and neither is Ceph v17.2.7), Ceph moved its base OS to CentOS Stream 9.

@anoopcs9 (Collaborator, Author)

I am not sure what you are trying to do, or why...

In the absence of yum update there is an anticipated problem with installing devel packages for the various ceph libraries around the time of a ceph release (the PR description explains it well, I guess; if not, let me know). This PR removes the need for yum update, and I wanted to demonstrate that it works without it.

If you want to use Ceph v17.2.7, would you not use the quay.io/ceph/ceph:v17.2.7 container image, and not install any updates? It will stay on CentOS 8 that way too.

Our CI jobs don't depend on specific minor updates of a particular ceph release; rather, they use the more generic major release tags (v17, v18, v19, etc.). The situation is a special case where container images are not yet available for a minor update but the RPM repositories have already been updated to point at the latest minor update. #1043 (comment) has some pointers on why this was considered a problem.
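
For instance, the CI pulls the moving major tag, as in the demonstration above, rather than a pinned minor one:

$ podman pull quay.io/ceph/ceph:v17   # resolves to the latest available v17.2.x image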

As CentOS 8 is not maintained anymore (and neither is Ceph v17.2.7), Ceph moved its base OS to CentOS Stream 9.

Yes, and that's why I highlighted the explanation with a note in #1043 (comment).

@phlogistonjohn (Collaborator)

@Mergifyio rebase

In general it is not desirable to blindly update packages as a whole
while building another from a base container image.

Historically this step was required due to the introduction of version
specific installation[1] of packages i.e, we extract the package version
that comes with the base container image and try to install the matching
development libraries which might be unavailable close to a new release
happening in upstream. In order to overcome this short gap we came up
with the idea of `yum update`[2] to fetch whatever is the latest and
then extract the version for further installation of development
libraries.

This seemed to work until we discovered a different issue where updated
versions for particular dependencies are pushed to standard repositories
causing problems[3] during `yum update`.

Ceph repositories(and packages) are now more robust and DNF is capable
of handling such situations to figure out the new/updated versions for
packages even if a match is not found with the already installed package
versions. Ideally it can never be the case that matching packages for
each version are missing from a particular repository directory(only the
links to the directories is supposed to change).

Thus in our best interest we avoid running `yum update`.

[1] ceph#331
[2] ceph#510
[3] ceph#1038

Signed-off-by: Anoop C S <[email protected]>

Now that we have removed the `yum update` step, it doesn't make sense to
install the matching development packages based on the version already
present with the base container image. This is due to the fact that
their availability is not always guaranteed. Instead leave it up to DNF
to figure out if higher versions are available with the repositories.

Signed-off-by: Anoop C S <[email protected]>
mergify bot commented Dec 11, 2024

rebase

✅ Branch has been successfully rebased

@phlogistonjohn force-pushed the containers-rm-yum-update branch from 36ab7c5 to adba452 on December 11, 2024
@anoopcs9 removed the extended-review label on Dec 16, 2024
mergify bot merged commit daad7cc into ceph:master on Dec 16, 2024
12 of 16 checks passed
@anoopcs9 deleted the containers-rm-yum-update branch on December 16, 2024