Moving addons to higher priority layers doesn't prune from the lower layer #207
Comments
@avacaru Changing the name of the HelmRelease custom resource will prevent orphan/adoption processing. However, I'd expect the app-logging resources to be deleted when the app-logging HelmRelease is deleted by prune processing, so that base-logging can be deployed. Initially, though, the app-layer pruning may still be in progress while the base layer (which has nothing to prune) proceeds to the apply phase. The base-logging HelmRelease will fail at that point but should be retried. Kraan just deploys the HelmReleases; it relies on the HelmRelease being retried. Can you check the retry settings? |
@paulcarlton-ww This is what I have in that HR's manifest:
spec:
  install:
    remediation:
      retries: -1
  upgrade:
    remediation:
      retries: -1
I was expecting the same thing. Even after the initial failure, I was hoping that it would be fixed in the next reconciliation cycle. Also, I can see this error in the logs of kraan-controller:
Does that help? |
The kraan log error is normal: it is the controller runtime encountering a resource version mismatch due to concurrent reconciliation of a layer. The standard practice in Kubernetes is to redo the reconcile using the latest version of the layer, so Kraan schedules an immediate re-reconcile and it should be OK. I'd only be concerned if copious numbers of this error are being generated. The retry spec is correct. I'm wondering if this is a helm/helm-controller issue. If you restart the helm-controller or delete the HelmRelease, does that fix it? |
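The conflict-and-retry behaviour described above can be illustrated with a small simulation. This is only a sketch of optimistic concurrency in general, not Kraan or controller-runtime code: `FakeStore`, `ConflictError`, and `reconcile` are hypothetical names, and the in-memory store merely stands in for the API server.

```python
from dataclasses import dataclass


class ConflictError(Exception):
    """Raised when an update carries a stale resource version."""


@dataclass
class FakeStore:
    """Hypothetical in-memory stand-in for the API server holding one layer."""
    status: str = "Pending"
    resource_version: int = 1

    def get(self):
        return {"status": self.status, "resourceVersion": self.resource_version}

    def update(self, obj):
        # Reject writes that were based on an older copy of the object.
        if obj["resourceVersion"] != self.resource_version:
            raise ConflictError("the object has been modified")
        self.status = obj["status"]
        self.resource_version += 1


def reconcile(store, new_status, max_attempts=5, before_update=None):
    """Re-read the object and retry on conflict; return the attempts used."""
    for attempt in range(1, max_attempts + 1):
        obj = store.get()  # always start from the latest version
        obj["status"] = new_status
        if before_update is not None:
            before_update(attempt)  # test hook simulating a concurrent writer
        try:
            store.update(obj)
            return attempt
        except ConflictError:
            continue  # stale copy: redo the reconcile immediately
    raise RuntimeError("gave up after repeated conflicts")


store = FakeStore()


def concurrent_writer(attempt):
    # Another reconcile (e.g. the periodic sync) writes first, once.
    if attempt == 1:
        store.update({**store.get(), "status": "Applying"})


attempts = reconcile(store, "Deployed", before_update=concurrent_writer)
```

In this sketch a single conflict costs one extra attempt and the write then succeeds, which is why an occasional error of this kind is harmless while a steady stream of them is worth investigating.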
There is indeed a high number of these errors, approximately one every minute (the AddonsLayer interval setting). I've tried deleting the HR manually, and then all the layers end up being successfully deployed. |
One a minute suggests the periodic sync of all layers, i.e. the controller reconciles all layers every minute. I expect this clashes with the repeated attempts to process the layer containing the base-logging HelmRelease that is in error. I think this is a helm-controller/helm issue: all Kraan is responsible for is applying changes to HelmRelease objects, and it relies on the helm-controller applying that change. The fact that deleting the HR fixes the issue suggests that something is not working right in the helm-controller, possibly due to some nuance of the way helm works. |
I've checked the helm-controller logs and I can't see any errors about the initial app-logging HelmRelease failing to be deleted. All I see is that the reconciliation has succeeded. I am wondering, is the kraan-controller ever deleting the app-logging HR? Any other logs I can look into in order to track this issue down? |
Check the kraan log; depending on the verbosity setting you should see the app-logging HR being deleted. It is easier to just check using kubectl, though: the HR should be gone, and all the resources it created should have been deleted too. Check the deployment.apps object, which should have been deleted. If it is still there with an owner/annotation indicating that it belongs to the app-logging HR, then that is the cause of the issue, but I assumed from your original description that this deletion had occurred. |
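The ownership check suggested above can be sketched as a filter over resource manifests, for example the parsed JSON output of a kubectl query. The sketch assumes the chart was installed by Helm 3, which records the owning release in the `meta.helm.sh/release-name` and `meta.helm.sh/release-namespace` annotations. The deployment names, the `logging` namespace, and the helper names are hypothetical.

```python
def owned_by_release(resource, release_name, release_namespace):
    """True if the manifest carries Helm's ownership annotations for the release."""
    annotations = resource.get("metadata", {}).get("annotations") or {}
    return (
        annotations.get("meta.helm.sh/release-name") == release_name
        and annotations.get("meta.helm.sh/release-namespace") == release_namespace
    )


def leftovers(resources, release_name, release_namespace):
    """Names of resources still claimed by a release that should be gone."""
    return [
        r["metadata"]["name"]
        for r in resources
        if owned_by_release(r, release_name, release_namespace)
    ]


# Hypothetical sample data standing in for parsed deployment manifests.
deployments = [
    {
        "metadata": {
            "name": "logging-agent",
            "annotations": {
                "meta.helm.sh/release-name": "app-logging",
                "meta.helm.sh/release-namespace": "logging",
            },
        }
    },
    {"metadata": {"name": "unrelated", "annotations": None}},
]

stale = leftovers(deployments, "app-logging", "logging")
```

A non-empty result for the old release name would indicate that the resources were never pruned, which would explain why the renamed release cannot take them over.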
From a Helm-Controller maintainer... @avacaru Please raise a Helm-Controller issue at https://github.com/fluxcd/helm-controller/issues |
That is the actual problem, the old HR does not get removed. I can still see it when I run
Also, if I remove the old HR manually, then the reconciliation succeeds. All layers will have status Deployed, whereas they are stuck like this:
Can I get someone to reproduce this issue before I open an issue in the helm-controller, please? |
@avacaru The app-logging HR should have been pruned by the Kraan controller; please post the kraan logs here. If you need to re-run to recreate the issue, set the log level to 4 for copious logging. Thanks |
@paulcarlton-ww I've set log level to 4 and reproduced the bug. I can't see any error in the logs, the only reference to pruning is in these logs:
No other reference that would indicate the layer has been pruned of addons not present in the git source anymore. |
Can you post the complete log? |
Unfortunately not, but if you don't have an environment to reproduce this, what should I look for in the logs? |
@avacaru I will recreate this myself when I get time but that might not be for a few days |
@avacaru I've tested this using head of master code, seems to work fine in both directions |
Describe the bug
When moving an addon from one layer to another, the reconciliation doesn't complete successfully because the old resources are not pruned.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
When I remove an addon from a layer, it will be pruned regardless of the dependency relationship to other layers.
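The expectation above amounts to computing each layer's prune set purely from what is deployed versus what the source now declares, without consulting layer dependencies. A minimal sketch follows; it is not Kraan's implementation, and `plan_prune` and the layer names `base` and `apps` are hypothetical.

```python
def plan_prune(desired_by_layer, deployed_by_layer):
    """For each layer, list deployed HelmReleases no longer declared in the source."""
    return {
        layer: sorted(deployed - desired_by_layer.get(layer, set()))
        for layer, deployed in deployed_by_layer.items()
    }


# The addon moves from the higher layer to the base layer under a new name.
deployed = {"base": set(), "apps": {"app-logging"}}
desired = {"base": {"base-logging"}, "apps": set()}

prune_plan = plan_prune(desired, deployed)
```

Under this model `app-logging` is always scheduled for pruning from `apps`, whatever the ordering between the two layers; the open question in the thread is whether that prune completes before the base layer's apply phase runs.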
Kraan Helm Chart Version = v0.2.8
Kubernetes Version = v1.19.9