fix(operator): allow retries to consider exit code from init container and don't consider node as pending if init failed. Fixes #11354/#10717/#10045 #13858
base: main
This will only be encountered when using |
no, see the logs/comments in the linked issues |
We also regularly observe this issue at work on 3.5.12 |
We hit this issue regularly; adding retries would help a lot. |
@tooptoop4 I think your PR does not truly resolve this issue, because the nodes in the containerSet are initialized to
|
@jswxstw this has nothing to do with containerset, i don't run that workflow type |
But the line of code you posted here is only relevant to the issue with argo-workflows/workflow/controller/operator.go Line 1404 in 4742e9d
Also, if you are not using |
that block is not limited to containerSet |
Please provide the workflow you tested to prove it. |
@toralf @epifanov6 Could you please clarify whether the issue you encountered is related to the missing exit code of the init container, or is it that the node is |
see linked issues for all the details |
@jswxstw The init container is in state "Terminated" with reason "Error". |
@epifanov6 The node should be |
Fixes #11354 and #10717 and #10045
Before this fix, the node would always go into pending because the main container was in the waiting state (see argo-workflows/workflow/controller/operator.go, line 1404 in 4742e9d).
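To illustrate the intended behavior, here is a minimal sketch in Go. It is not the code in `operator.go`: the types `containerState`, `podStatus`, and `phase`, and the function `inferPhase`, are hypothetical stand-ins rather than Kubernetes or Argo Workflows API types. The idea is to inspect init containers first, so that a failed init container yields a failed node with its exit code available to retry logic, instead of a pending node caused by the waiting main container.

```go
package main

// NOTE: every type and function here is hypothetical and exists only for
// illustration; none of them are the real Kubernetes or Argo Workflows types.

// containerState is a simplified view of a container's status.
type containerState struct {
	Waiting    bool
	Terminated bool
	ExitCode   int32
}

// podStatus holds the init containers (in order) and the main container.
type podStatus struct {
	Init []containerState
	Main containerState
}

type phase string

const (
	phasePending   phase = "Pending"
	phaseRunning   phase = "Running"
	phaseSucceeded phase = "Succeeded"
	phaseFailed    phase = "Failed"
)

// inferPhase checks init containers before the main container. If any init
// container terminated with a non-zero exit code, the node is Failed and that
// exit code is returned, even though the main container is still waiting.
func inferPhase(p podStatus) (phase, int32) {
	for _, c := range p.Init {
		if c.Terminated && c.ExitCode != 0 {
			return phaseFailed, c.ExitCode
		}
	}
	switch {
	case p.Main.Waiting:
		return phasePending, 0
	case p.Main.Terminated && p.Main.ExitCode != 0:
		return phaseFailed, p.Main.ExitCode
	case p.Main.Terminated:
		return phaseSucceeded, 0
	default:
		return phaseRunning, 0
	}
}
```

In this sketch, a pod whose init container exited with code 64 while the main container is still waiting is reported as failed with exit code 64, whereas a pod with no failed init container and a waiting main container stays pending.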
This supersedes #13852
cc @terrytangyuan