Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix canonical image ref resolution #3434

Merged
merged 1 commit into from
Nov 11, 2024

Conversation

m1kola
Copy link
Member

@m1kola m1kola commented Nov 5, 2024

Description of the change:

Previously in release scripts we were using docker inspect which seems to sort .RepoDigests and it makes selection of the correct digest difficult because platform-specific digest can appear at a different index depending on the digests in the list.

Motivation for the change:

olm.yaml is currently being generated with platform-specific image ref. This PR fixes resolution of canonical image refs.

Related to #3419

Architectural changes:

N/A

Testing remarks:

Create a new tag to trigger release automation.

Reviewer Checklist

  • Implementation matches the proposed design, or proposal is updated to match implementation
  • Sufficient unit test coverage
  • Sufficient end-to-end test coverage
  • Bug fixes are accompanied by regression test(s)
  • e2e tests and flake fixes are accompanied evidence of flake testing, e.g. executing the test 100(0) times
  • tech debt/todo is accompanied by issue link(s) in comments in the surrounding code
  • Tests are comprehensible, e.g. Ginkgo DSL is being used appropriately
  • Docs updated or added to /doc
  • Commit messages sensible and descriptive
  • Tests marked as [FLAKE] are truly flaky and have an issue
  • Code is properly formatted

@acornett21
Copy link

I'm not sure this PR is needed, also I don't really like the idea of maintaining custom code for this. I think the issue lies with using podman locally and docker in the CI. So the results you get locally do not match the CI, since this tools hand back different information for the query in the CI. ie podman hands back 2 digests (the system arc digest and manifest-list digest), where docker only hands back the manifest-list.

@acornett21
Copy link

I verified that this PR isn't needed, and that podman and docker function differently, so what developers see locally (when using podman) will be different then what they see in docker (locally or in the pipeline)

sample gha

name: docker-sha-test

on:
  push:
    branches:
      - main

jobs:
  test-docker:
    name: test docker sha
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Inspect Sha
        run: |
          docker pull quay.io/operator-framework/olm:v0.30.0
          docker inspect --format='{{index .RepoDigests 0}}' quay.io/operator-framework/olm:v0.30.0

the output

08cf91027712: Pull complete
Digest: sha256:79b31e9a42f5c1a4f4eb85521aa4675fb82869ce8009a4476f4f2c5b7ae945a8
Status: Downloaded newer image for quay.io/operator-framework/olm:v0.30.0
quay.io/operator-framework/olm:v0.30.0
quay.io/operator-framework/olm@sha256:79b31e9a42f5c1a4f4eb85521aa4675fb82869ce8009a4476f4f2c5b7ae945a8

@m1kola
Copy link
Member Author

m1kola commented Nov 5, 2024

@acornett21 I don't think it is podman vs docker. You can see the sha for in release job for v0.30.0 tag here: it prints 36b43038867f685a32ef862e619da04e27b6b0dfa21e3201655d17aa45b26db5 which is v0.30.0-amd64. And we use docker in GH action.

In the logs:

docker pull quay.io/operator-framework/olm:v0.30.0
v0.30.0: Pulling from operator-framework/olm
Digest: sha256:79b31e9a42f5c1a4f4eb85521aa4675fb82869ce8009a4476f4f2c5b7ae945a8

And then it goes:

/home/runner/go/bin/yq-v3.0.0-20201202084205-8846255d1c37 w -i deploy/upstream/values.yaml olm.image.ref quay.io/operator-framework/olm@sha256:36b43038867f685a32ef862e619da04e27b6b0dfa21e3201655d17aa45b26db5
/home/runner/go/bin/yq-v3.0.0-20201202084205-8846255d1c37 w -i deploy/upstream/values.yaml catalog.image.ref quay.io/operator-framework/olm@sha256:36b43038867f685a32ef862e619da04e27b6b0dfa21e3201655d17aa45b26db5
/home/runner/go/bin/yq-v3.0.0-20201202084205-8846255d1c37 w -i deploy/upstream/values.yaml package.image.ref quay.io/operator-framework/olm@sha256:36b43038867f685a32ef862e619da04e27b6b0dfa21e3201655d17aa45b26db5
/home/runner/go/bin/yq-v3.0.0-20201202084205-8846255d1c37 w -i deploy/upstream/values.yaml -- catalog.opmImageArgs "--opmImage=quay.io/operator-framework/opm@sha256:077e654da39d055ea56edc91faa2c36d5e8f4dec58db2017fed24a80246b6ca7

The last bit is where we replace image ref to a canonical one with yq. Resulting olm.yaml had this sha.

It seems like docker inspect results are non-deterministic.

I also tried other tools: I tried skapeo : it requires platform override on mac. There is also docker buildx imagetools which seems to be doing something similar, but it is docker-specific (e.g. not compatible with podman).

I think this is a small and simple enough to not cause a lot of maintenance burden. And as we learnt with docker inspect - using pre-existing tools doesn't save from head ache 🙂

Makefile Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
@m1kola m1kola force-pushed the fix_release_automation branch 3 times, most recently from bf538ed to 8d91d0d Compare November 7, 2024 13:23
Makefile Outdated Show resolved Hide resolved
@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 7, 2024
@m1kola m1kola force-pushed the fix_release_automation branch from 8d91d0d to 2c5f4c9 Compare November 7, 2024 13:32
@m1kola
Copy link
Member Author

m1kola commented Nov 7, 2024

I switched from docker inspect to getting the digest docker images which should hopefully give consistent results between podman and docker.

@acornett21 @perdasilva please take another look.

@m1kola m1kola requested a review from perdasilva November 7, 2024 13:35
@m1kola
Copy link
Member Author

m1kola commented Nov 8, 2024

We found out that there is difference between docker and podman in images result so it is not compatible.

/hold

Previously in release scripts we were using `docker inspect`
which seems to sort .RepoDigests and it makes selection of the
correct digest difficult because platform-specific digest can
appear at a different index depending on the digests in the list.

Signed-off-by: Mikalai Radchuk <[email protected]>
@m1kola m1kola force-pushed the fix_release_automation branch from 2c5f4c9 to 02fad30 Compare November 8, 2024 13:29
@m1kola
Copy link
Member Author

m1kola commented Nov 8, 2024

Moving back to the original PR with a custom go utility to resolve a canonical reference. This should be portable and we are re-using an existing dependency.

I however moved utility and YQ calls into scripts/package_release.sh for a few reasons:

  1. So we exit on an error without a lot of complicated checks in Makefile. Related comment: Fix canonical image ref resolution #3434 (comment)
  2. So we do not run resolution multiple times (in master we run docker inspect multiple times for the same values)

@m1kola
Copy link
Member Author

m1kola commented Nov 8, 2024

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 8, 2024
Copy link
Collaborator

@perdasilva perdasilva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you!!! I know it's been a journey XD

@perdasilva perdasilva added this pull request to the merge queue Nov 11, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 11, 2024
@m1kola m1kola added this pull request to the merge queue Nov 11, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 11, 2024
@m1kola m1kola added this pull request to the merge queue Nov 11, 2024
Merged via the queue into operator-framework:master with commit 9ec1b02 Nov 11, 2024
12 checks passed
@m1kola m1kola deleted the fix_release_automation branch November 11, 2024 14:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants