Environment variables to configure (shorten) Informer ResyncPeriods #13690
That would be a workaround, not a solution. Cache rebuilds are expensive, especially if you have a large number of Workflows. We leave it at the k8s default, so if it's not tuned in Argo itself, making it user-configurable is a bit confusing, to say the least. There's also one of these for every informer. Also, please fill out the issue templates in full, especially if you want to be a good role model to others.
@agilgur5 can you clarify "expensive" in what terms? (k8s API calls? controller CPU/memory? something else?) That might be preferable to missing SLAs for me. From reading kubernetes/kubernetes#127964 and kubernetes/client-go#571, informers seem unreliable compared to listing the current state, so the choice seems to be: rely on events/cache for which workflows should be operated on (non-zero chance of some being missed) vs. simply list all workflows (guaranteed to have all of them).
All of the above. It can do a full relist, which is expensive in k8s API calls and network I/O, and it iterates through the entire cache, which uses CPU and memory. Depending on your usage, you might be able to see the rebuild as a clear spike in your metrics, as in #12206 (comment). In #12125 (comment) (I forgot that issue existed; it's very similar) and #13466 (comment), I linked to some upstream reading: kubernetes-client/java#725 (comment), this k8s SIG API Machinery Google Group thread, and argoproj/gitops-engine#617 (comment). According to those, Informers are supposed to be quite stable now and no longer relist, although it's unclear whether that applies outside of "core controllers". I would say it's more an upstream question whether this even makes sense to expose to users, since it seems like k8s maintainers don't recommend changing the default for other tooling either.
That's a bit of a different question that is potentially worth exposing in its own right, although the argument against it would be that if Informers are acting up, your entire cluster is going to be having problems, not just Argo.
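For background on what the resync period actually controls, here's a minimal client-go sketch (illustrative only, not the Argo controller's actual wiring): the period is fixed when the shared informer factory is constructed, and every resync tick re-delivers the entire local cache to registered handlers as update events, which is where the CPU/memory cost described above comes from.

```go
package main

import (
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load kubeconfig from the default location (~/.kube/config).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset := kubernetes.NewForConfigOrDie(config)

	// The resync period is baked in at factory construction time. On each
	// tick, every object in the informer's local cache is re-delivered to
	// handlers as an Update (old and new are the same cached object); the
	// API server is not necessarily re-listed, but handlers re-process the
	// whole cache.
	factory := informers.NewSharedInformerFactory(clientset, 20*time.Minute)

	podInformer := factory.Core().V1().Pods().Informer()
	podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		UpdateFunc: func(oldObj, newObj interface{}) {
			pod := newObj.(*corev1.Pod)
			fmt.Println("update (real change or resync):", pod.Name)
		},
	})

	stop := make(chan struct{})
	defer close(stop)
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
	select {} // run until killed
}
```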
This issue has been automatically marked as stale because it has not had recent activity and needs more information. It will be closed if no further activity occurs.
/unrotten
This is still missing information...
According to kubernetes/kubernetes#128183 (comment), this is not an upstream issue.
Relevant code permalinks (all at commit 54621cc):
- argo-workflows/workflow/controller/controller.go, line 165
- argo-workflows/workflow/controller/controller.go, line 167
- argo-workflows/workflow/controller/controller.go, line 170
- argo-workflows/workflow/controller/taskresult.go, line 29
Shortening the resync period might solve #13671, #10947 (which is linked to a k8s client bug), and #12352.
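As a sketch of what the requested environment variable could look like at the permalinked call sites above (the variable name INFORMER_RESYNC_PERIOD and the default value are hypothetical, not an existing Argo Workflows setting):

```go
package controller

import (
	"os"
	"time"
)

// resyncPeriodFromEnv returns the informer resync period, overridable via an
// environment variable. INFORMER_RESYNC_PERIOD is a hypothetical name used
// for illustration only.
func resyncPeriodFromEnv(def time.Duration) time.Duration {
	if v, ok := os.LookupEnv("INFORMER_RESYNC_PERIOD"); ok {
		if d, err := time.ParseDuration(v); err == nil {
			return d
		}
	}
	return def
}

// Usage at informer construction time, e.g.:
//   factory := informers.NewSharedInformerFactory(clientset, resyncPeriodFromEnv(20*time.Minute))
```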
See also: #1038 (comment), #1416 (comment), #568 (comment), #532 (comment), #3952.
argo-workflows/workflow/controller/taskresult.go, lines 91 to 93 at 54621cc
#4423