Calico Helm Chart upgrade fails after upgrade from rke2 v1.28.8+rke2r1 to v1.28.12+rke2r1 / v1.29.6+rke2r1 #6633
Comments
It looks like the Helm job was interrupted while upgrading the chart. The helm controller responded by trying to uninstall and reinstall the chart, but the uninstall job was also interrupted, so the chart is now stuck in the "uninstalling" status. You might try deleting the Helm secrets for the rke2-calico-crd release, and rke2-calico as well if necessary; this should allow the chart to be reinstalled successfully. What process did you use to upgrade your cluster? We do not generally see the Helm jobs being interrupted during upgrades, unless the cluster upgrade itself was interrupted partway through, leaving nodes deploying conflicting component versions.
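A minimal sketch of what that cleanup might look like, assuming the releases are stored as standard Helm v3 release secrets in kube-system (the default for RKE2 packaged charts); the exact secret names and revision suffixes in your cluster may differ:

# list the Helm release secrets for the calico charts
kubectl get secrets -n kube-system -l owner=helm | grep calico
# delete the stuck release secrets so the helm controller can reinstall from scratch
# (example names with a .v1 revision suffix; use the names shown by the list command)
kubectl delete secret -n kube-system sh.helm.release.v1.rke2-calico-crd.v1
kubectl delete secret -n kube-system sh.helm.release.v1.rke2-calico.v1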
Was there any recovery from this? We ran into this issue yesterday and had to restore the controller VM and etcd from snapshots. The symptoms and logs match exactly what was posted above. We initially attempted to install the CRDs and recreate the required resources, but the calico controller continued to crashloop. Ultimately the restore from snapshots worked, but we actually had to do it twice: after adding additional controllers, the helm upgrade was re-triggered and we had to restart the process. We are currently running with just the one controller, which is not an ideal state.
This is probably projectcalico/calico#9068, which was fixed upstream, but it will likely take quite some time for the fix to become available in RKE2 and Rancher. @brandond is there any way in rke2 to override the calico version being deployed?
Calico 3.28.2 should go into next month's releases: rancher/rke2-charts#524. The issue is in the chart itself, so no, you can't just bump the version of Calico that the chart deploys. You'll need to wait for us to update the chart in RKE2.
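If you want to confirm which chart version the helm controller is currently trying to deploy, here is a sketch of how to check, assuming the default RKE2 setup where the packaged charts appear as HelmChart resources in kube-system and the helm CLI is available:

# show the HelmChart objects the RKE2 helm controller manages
kubectl get helmchart -n kube-system
kubectl get helmchart rke2-calico -n kube-system -o yaml
# or list all releases, including failed/uninstalling ones, to see the deployed chart version
helm list -n kube-system -a | grep calico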
This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 45 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.
Environmental Info:
RKE2 Version: v1.28.8+rke2r1
:~ # rke2 -v
rke2 version v1.28.8+rke2r1 (42cab2f)
go version go1.21.8 X:boringcrypto
Node(s) CPU architecture, OS, and Version:
Linux hostname 5.3.18-150300.59.161-default #1 SMP Thu May 9 06:59:05 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Cluster Configuration:
3 master nodes and 3 worker nodes
Describe the bug:
We are trying to upgrade rke2 from v1.28.8+rke2r1 (fresh install) to v1.28.12+rke2r1 / v1.29.6+rke2r1.
After the upgrade the rke2 service comes up, but all the helm jobs for the calico system components fail. The Helm jobs are retriggered in a continuous loop (presumably trying to upgrade the above components).
For some reason, instead of upgrading the calico chart, it tries to uninstall the tigera operator CRDs and calico CRDs, and it hangs in the process because resources are still present. Please see the log output for the calico CRD job below.
kubectl get crds | grep -i calico --> No result
kubectl logs job/helm-install-rke2-calico-crd -n kube-system -f
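For reference, a sketch of additional commands that can show the state the helm controller sees, assuming the RKE2 packaged charts live in kube-system:

kubectl get jobs -n kube-system | grep helm-install
kubectl get helmchart -n kube-system
kubectl describe helmchart rke2-calico-crd -n kube-system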