Remove section about stopping certmonger #473

Merged
jistr merged 1 commit into openstack-k8s-operators:main from the certmonger_removal branch
Jun 17, 2024

Conversation

xek
Contributor

@xek xek commented May 29, 2024

Disabling of certmonger is now implemented in EDPM and
enables rollback before the point of no return
(before EDPM adoption).

https://issues.redhat.com/browse/OSPRH-7022

Depends-On: openstack-k8s-operators/edpm-ansible#680

@xek xek force-pushed the certmonger_removal branch from 2779f98 to 3ddd261 Compare May 29, 2024 12:34

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/1186313213d94fc9a5317a595e50df32

✔️ data-plane-adoption-osp-17-to-extracted-crc SUCCESS in 2h 33m 29s
data-plane-adoption-osp-17-to-extracted-crc-minimal-no-ceph FAILURE in 2h 37m 38s
✔️ adoption-docs-preview SUCCESS in 1m 41s

@xek xek force-pushed the certmonger_removal branch from 3ddd261 to af7104b Compare June 5, 2024 13:12

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/2553ea5e341c40ceb6d7af43306fcae3

data-plane-adoption-osp-17-to-extracted-crc FAILURE in 1h 51m 20s
data-plane-adoption-osp-17-to-extracted-crc-minimal-no-ceph FAILURE in 1h 51m 32s
✔️ adoption-docs-preview SUCCESS in 1m 43s

@xek xek force-pushed the certmonger_removal branch from af7104b to 67e2e52 Compare June 12, 2024 13:12
@xek xek marked this pull request as ready for review June 12, 2024 13:12

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/e67ffc7228a34c12a8911e189076a50c

data-plane-adoption-osp-17-to-extracted-crc FAILURE in 2h 12m 17s
data-plane-adoption-osp-17-to-extracted-crc-minimal-no-ceph FAILURE in 2h 17m 13s
✔️ adoption-docs-preview SUCCESS in 1m 19s

@xek xek force-pushed the certmonger_removal branch from 67e2e52 to 8b7eb22 Compare June 13, 2024 19:48

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/8abc431bd1c64e9c8753c8b6507ec713

✔️ data-plane-adoption-osp-17-to-extracted-crc SUCCESS in 2h 34m 16s
data-plane-adoption-osp-17-to-extracted-crc-minimal-no-ceph FAILURE in 2h 11m 56s
✔️ adoption-docs-preview SUCCESS in 1m 25s

@xek xek requested a review from klgill June 14, 2024 08:43
@xek xek force-pushed the certmonger_removal branch from 8b7eb22 to 15d6ccf Compare June 14, 2024 08:53
Contributor

@marios marios left a comment

Looks OK per the title, but the immediate question is 'why?'
Please consider including more information in the commit message to give more context about the change and why you proposed it.


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/9d1c8ab1c62a4ad49ad33302615cec07

✔️ data-plane-adoption-osp-17-to-extracted-crc SUCCESS in 2h 32m 03s
data-plane-adoption-osp-17-to-extracted-crc-minimal-no-ceph FAILURE in 2h 13m 09s
✔️ adoption-docs-preview SUCCESS in 1m 20s

@xek xek force-pushed the certmonger_removal branch from 15d6ccf to 49ac26d Compare June 14, 2024 11:51
@klgill
Contributor

klgill commented Jun 14, 2024

This LGTM. Ideally, I'd like to check the doc output to make sure there are no formatting issues, but the check is taking a while.


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/7e2f377b62a042cfbd1cb43c430da1e3

✔️ data-plane-adoption-osp-17-to-extracted-crc SUCCESS in 2h 35m 48s
data-plane-adoption-osp-17-to-extracted-crc-minimal-no-ceph RETRY_LIMIT in 31m 19s
✔️ adoption-docs-preview SUCCESS in 1m 57s

@xek
Contributor Author

xek commented Jun 14, 2024

recheck


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/a055b52a30944d80b585d34d4acb83dd

data-plane-adoption-osp-17-to-extracted-crc RETRY_LIMIT in 11m 23s
data-plane-adoption-osp-17-to-extracted-crc-minimal-no-ceph FAILURE in 2h 10m 20s
adoption-docs-preview POST_FAILURE in 1m 17s

@xek
Contributor Author

xek commented Jun 14, 2024

The last run failed on the tripleo-cleanup-tripleo-cleanup-openstack AnsibleEE job, but I can't find the pod logs for that job.

EDIT: The pod logs location is changing. I was able to get them from the sos report (sosreport-crc-pjmnl-master-0-UntarWithArg-i.tar.xz).

Disabling of certmonger is now implemented in EDPM and
enables rollback before the point of no return
(before EDPM adoption).
Depends-On: openstack-k8s-operators/edpm-ansible#663
@xek xek force-pushed the certmonger_removal branch from 49ac26d to 775abd6 Compare June 14, 2024 20:31

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/b5a745fbc2a14978a6606a21131ca671

data-plane-adoption-osp-17-to-extracted-crc RETRY_LIMIT in 11m 05s
data-plane-adoption-osp-17-to-extracted-crc-minimal-no-ceph FAILURE in 2h 08m 14s
✔️ adoption-docs-preview SUCCESS in 1m 20s

@xek
Contributor Author

xek commented Jun 15, 2024

recheck


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/68f8599a010345cdbfacf6a3c9f65e29

✔️ data-plane-adoption-osp-17-to-extracted-crc SUCCESS in 2h 30m 13s
data-plane-adoption-osp-17-to-extracted-crc-minimal-no-ceph FAILURE in 1h 35m 42s
adoption-docs-preview FAILURE in 1m 16s

@xek
Contributor Author

xek commented Jun 15, 2024

recheck


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/9a65117d937d4d1e941f976ba1eb9107

✔️ data-plane-adoption-osp-17-to-extracted-crc SUCCESS in 2h 30m 30s
data-plane-adoption-osp-17-to-extracted-crc-minimal-no-ceph FAILURE in 2h 06m 34s
✔️ adoption-docs-preview SUCCESS in 1m 23s

@xek
Contributor Author

xek commented Jun 16, 2024

recheck


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/cda4e7dd6b844d86b9627cb822fe5242

✔️ data-plane-adoption-osp-17-to-extracted-crc SUCCESS in 2h 28m 58s
data-plane-adoption-osp-17-to-extracted-crc-minimal-no-ceph FAILURE in 2h 07m 16s
✔️ adoption-docs-preview SUCCESS in 1m 20s

@marios
Contributor

marios commented Jun 17, 2024

the no-ceph job is failing like https://logserver.rdoproject.org/73/473/775abd6d7d6247adf77fff2f8bcf2aad377d697a/github-check/data-plane-adoption-osp-17-to-extracted-crc-minimal-no-ceph/8f68e86/controller/data-plane-adoption-tests-repo/data-plane-adoption/tests/logs/test_minimal_out_2024-06-16T18:33:25EDT.log

TASK [dataplane_adoption : Wait for the deployment to finish] ******************

...

echo 'There are failed AnsibleEE jobs: tripleo-cleanup-tripleo-cleanup-openstack'", "+ exit 1"], "stdout": "There are failed AnsibleEE jobs: tripleo-cleanup-tripleo-cleanup-openstack", "stdout_lines": ["There are failed AnsibleEE jobs: tripleo-cleanup-tripleo-cleanup-openstack


Looking a bit closer at the pod logs, available via https://logserver.rdoproject.org/73/473/775abd6d7d6247adf77fff2f8bcf2aad377d697a/github-check/data-plane-adoption-osp-17-to-extracted-crc-minimal-no-ceph/8f68e86/controller/ci-framework-data/logs/openstack-k8s-operators-openstack-must-gather/sos-reports/_all_nodes/sosreport-crc-pjmnl-master-0-UntarWithArg-i.tar.xz

2024-06-16T23:08:29.465763082+00:00 stdout F TASK [osp.edpm.edpm_install_certs : Backup certificate requests] ***************

2024-06-16T23:08:29.637362504+00:00 stdout F     "msg": "Could not find or access '/var/lib/certmonger/requests' on the Ansible Controller.\nIf you are using a module and expect the file to exist on the remote, see the remote_src

This seems like a new issue, but not sure if related to this patch (don't think so, but unsure).

Contributor

@marios marios left a comment

Can we please hold the merge until we can confirm the latest 'certmonger' issue? So far it only seems to hit this patch: https://review.rdoproject.org/zuul/builds?job_name=data-plane-adoption-osp-17-to-extracted-crc-minimal-no-ceph&skip=0

@xek
Contributor Author

xek commented Jun 17, 2024

the no-ceph job is failing like https://logserver.rdoproject.org/73/473/775abd6d7d6247adf77fff2f8bcf2aad377d697a/github-check/data-plane-adoption-osp-17-to-extracted-crc-minimal-no-ceph/8f68e86/controller/data-plane-adoption-tests-repo/data-plane-adoption/tests/logs/test_minimal_out_2024-06-16T18:33:25EDT.log

TASK [dataplane_adoption : Wait for the deployment to finish] ******************

...

echo 'There are failed AnsibleEE jobs: tripleo-cleanup-tripleo-cleanup-openstack'", "+ exit 1"], "stdout": "There are failed AnsibleEE jobs: tripleo-cleanup-tripleo-cleanup-openstack", "stdout_lines": ["There are failed AnsibleEE jobs: tripleo-cleanup-tripleo-cleanup-openstack

Looking a bit closer at the pod logs, available via https://logserver.rdoproject.org/73/473/775abd6d7d6247adf77fff2f8bcf2aad377d697a/github-check/data-plane-adoption-osp-17-to-extracted-crc-minimal-no-ceph/8f68e86/controller/ci-framework-data/logs/openstack-k8s-operators-openstack-must-gather/sos-reports/_all_nodes/sosreport-crc-pjmnl-master-0-UntarWithArg-i.tar.xz

2024-06-16T23:08:29.465763082+00:00 stdout F TASK [osp.edpm.edpm_install_certs : Backup certificate requests] ***************

2024-06-16T23:08:29.637362504+00:00 stdout F     "msg": "Could not find or access '/var/lib/certmonger/requests' on the Ansible Controller.\nIf you are using a module and expect the file to exist on the remote, see the remote_src

This seems like a new issue, but not sure if related to this patch (don't think so, but unsure).

This is fixed by openstack-k8s-operators/edpm-ansible#680 (it's in the Depends-On), so it looks like that fix is not being deployed with this change.

@cescgina
Contributor

cescgina commented Jun 17, 2024

@xek @marios that's weird, zuul is really not picking up the depends-on:

      items:
      - branch: main
        change: '473'
        change_url: https://github.com/openstack-k8s-operators/data-plane-adoption/pull/473
        patchset: 775abd6d7d6247adf77fff2f8bcf2aad377d697a
        project:
          canonical_hostname: github.com
          canonical_name: github.com/openstack-k8s-operators/data-plane-adoption
          name: openstack-k8s-operators/data-plane-adoption
          short_name: data-plane-adoption
          src_dir: src/github.com/openstack-k8s-operators/data-plane-adoption
      job: data-plane-adoption-osp-17-to-extracted-crc-minimal-no-ceph
      jobtags: []
      max_attempts: 1

The edpm-ansible patch should be in the items list, but it's not: https://logserver.rdoproject.org/73/473/775abd6d7d6247adf77fff2f8bcf2aad377d697a/github-check/data-plane-adoption-osp-17-to-extracted-crc-minimal-no-ceph/8f68e86/zuul-info/inventory.yaml
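
The same check can be sketched in a few lines of Python. This is only an illustration, not part of the CI tooling: the function names are hypothetical, the regex assumes the shorthand `owner/repo#number` footer form used in this PR's description, and the `items` list is hand-copied from the inventory.yaml excerpt above and trimmed to the two fields the check reads.

```python
import re

# Matches footer lines such as "Depends-On: owner/repo#123".
# Assumption: only the shorthand form used in this PR is handled.
DEPENDS_ON_RE = re.compile(
    r"^Depends-On:\s*(?P<project>[\w.-]+/[\w.-]+)#(?P<change>\d+)\s*$",
    re.MULTILINE,
)


def parse_depends_on(text):
    """Return (project, change) pairs for every Depends-On footer in text."""
    return [(m.group("project"), m.group("change"))
            for m in DEPENDS_ON_RE.finditer(text)]


def missing_dependencies(description, items):
    """Return the Depends-On entries that are absent from the items list."""
    present = {(item["project"]["name"], str(item["change"])) for item in items}
    return [dep for dep in parse_depends_on(description) if dep not in present]


description = """Disabling of certmonger is now implemented in EDPM.

Depends-On: openstack-k8s-operators/edpm-ansible#680
"""

# Trimmed copy of the items list from the inventory.yaml excerpt.
items = [
    {
        "change": "473",
        "project": {"name": "openstack-k8s-operators/data-plane-adoption"},
    },
]

missing = missing_dependencies(description, items)
print(missing)
```

With the data from this run, `missing` contains the edpm-ansible change, which matches the observation that the dependency was not part of the build.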

@xek
Contributor Author

xek commented Jun 17, 2024

BTW, not all runs that reported failed AnsibleEE jobs (tripleo-cleanup-tripleo-cleanup-openstack) failed with the same error. Some of those runs fail in TASK [osp.edpm.edpm_tripleo_cleanup : Stop and disable tripleo services] with "Could not find the requested service tripleo_neutron_l3_agent.service: host" and "Could not find the requested service tripleo_neutron_ovs_agent.service: host".
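
To tell the two failure modes apart when going through several pod logs, a rough triage helper along these lines may help. It is a sketch under assumptions: the function name and category labels are made up for illustration, and the matched strings are only the error messages quoted in this thread.

```python
import re

# Error text quoted from the "Backup certificate requests" task failure.
CERTMONGER_MSG = "Could not find or access '/var/lib/certmonger/requests'"

# Error text quoted from the "Stop and disable tripleo services" task failure.
MISSING_SERVICE_RE = re.compile(
    r"Could not find the requested service (?P<service>[\w.\-]+): host"
)


def classify_failure(log_text):
    """Return a (category, services) tuple; the category labels are hypothetical."""
    if CERTMONGER_MSG in log_text:
        return "certmonger-requests-missing", []
    services = MISSING_SERVICE_RE.findall(log_text)
    if services:
        return "tripleo-service-missing", services
    return "unknown", []


sample = (
    "Could not find the requested service tripleo_neutron_l3_agent.service: host\n"
    "Could not find the requested service tripleo_neutron_ovs_agent.service: host\n"
)
category, services = classify_failure(sample)
print(category, services)
```

Only the certmonger category corresponds to the error that edpm-ansible#680 is said to fix; the missing-service case would need separate investigation.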

@xek
Contributor Author

xek commented Jun 17, 2024

recheck openstack-k8s-operators/edpm-ansible#680 has merged

@jistr
Contributor

jistr commented Jun 17, 2024

Discussed at the CI call, we agreed this should be ok to merge now.

@jistr jistr merged commit 64039a7 into openstack-k8s-operators:main Jun 17, 2024
3 checks passed
5 participants