fix: parse exit code when Outputs is not populated #13228
Not sure why you filed a duplicate of #13180... It is once again missing DCO as well.
so after running this for a while it seems to never extract an exit code, but the extra debug logs are handy. whenever exit is
templateDefaults:
  retryStrategy:
    limit: 1
    retryPolicy: "Always"
    expression: 'lastRetry.status == "Error" or (lastRetry.status == "Failed" and (asInt(lastRetry.exitCode) in [255,137,143] or (asInt(lastRetry.exitCode) in [-1] and int(float(lastRetry.duration)) < 301)))'
    backoff:
      duration: "75"
      factor: 1
      maxDuration: "300"
here is one that failed first time with exit 1 and retried (which is bad), this wf uses steps
here is one that failed first time with exit 1 and did not try to retry (which is good), this wf does not use steps
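To make the retry expression above easier to reason about, here is a minimal Go sketch of the same predicate. The `lastRetry` struct and `shouldRetry` function are hypothetical names for illustration, not Argo types. It also assumes that a node whose exit code was never captured is reported as `-1`, which the `-1` branch of the expression suggests but which is not confirmed in this thread.

```go
package main

import "fmt"

// lastRetry holds only the three values the expression reads.
// Hypothetical type for illustration; not part of Argo Workflows.
type lastRetry struct {
	Status      string // "Error", "Failed", ...
	ExitCode    int    // assumed to be -1 when no exit code was captured
	DurationSec int    // duration of the last attempt, in seconds
}

// shouldRetry mirrors the boolean structure of the retry expression.
func shouldRetry(r lastRetry) bool {
	if r.Status == "Error" {
		return true
	}
	if r.Status != "Failed" {
		return false
	}
	switch r.ExitCode {
	case 255, 137, 143:
		return true
	case -1:
		// Unknown exit code: retry only if the attempt was short.
		return r.DurationSec < 301
	}
	return false
}

func main() {
	// A genuine exit 1 is not in any retry list.
	fmt.Println(shouldRetry(lastRetry{"Failed", 1, 120}))
	// The same failure with the exit code missing (seen as -1) is retried.
	fmt.Println(shouldRetry(lastRetry{"Failed", -1, 120}))
}
```

Under those assumptions, a pod that really exited with 1 but whose exit code was not set would fall into the `-1` branch and be retried, which is consistent with the "retried (which is bad)" run described above.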
ok, now i'm certain the bug is related to how it handles self referenced templates in steps. ie
templates:
- name: flow
  steps:
  - - name: inlinebelow
      template: inlinebelow
- name: inlinebelow
  ....................stuff below
when i removed those steps it started getting exit code i expect
I don't understand why we need to extract the exit code from the message.
from the above tests it is useless, but these debug logs did help uncover that steps referencing a template defined in the same yaml are a reliable reproduction of the exit code not being set. maybe i should make an issue for that
this PR did not solve any issue but it helped to investigate exactly what the root cause is, which is now being tracked in #13297
Relates to #12572 (comment)
and https://github.com/argoproj/argo-workflows/blame/465c7b6d6abd06a36165955d7fd01d9db2b6a2d4/workflow/controller/operator.go#L1621
Seems more of a workaround than addressing real root cause
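For readers wondering what "extract the exit code from the message" could involve, the following is a rough Go sketch. The PR diff is not shown in this thread, so this is not the PR's code. It assumes the node message contains a fragment of the form `(exit code N)`; that format, the `exitCodeRe` pattern, and the `exitCodeFromMessage` helper are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// exitCodeRe matches an assumed "(exit code N)" fragment in a node message.
var exitCodeRe = regexp.MustCompile(`\(exit code (-?\d+)\)`)

// exitCodeFromMessage returns the exit code embedded in msg.
// The second result is false when no exit code could be found.
// Hypothetical helper; not an Argo Workflows function.
func exitCodeFromMessage(msg string) (int, bool) {
	m := exitCodeRe.FindStringSubmatch(msg)
	if m == nil {
		return 0, false
	}
	code, err := strconv.Atoi(m[1])
	if err != nil {
		return 0, false
	}
	return code, true
}

func main() {
	code, ok := exitCodeFromMessage("Error (exit code 1)")
	fmt.Println(code, ok)

	_, ok = exitCodeFromMessage("pod deleted")
	fmt.Println(ok)
}
```

Parsing a human-readable message like this is fragile, since any change to the wording breaks it, which fits the review comment that it is more of a workaround than a fix for the root cause.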