fix: cronOperator/serverResubmitWf retry create workflow on transient error. Fixes #13970 #13971
Fixes #13970
Motivation
To address an issue where a cron-initiated workflow creation could fail due to a transient network error.
Modifications
Add retry to util.SubmitWorkflow
The SubmitWorkflow function is used by the argo server (resubmit workflow) and by the controller (cron operator).
The original issue was on cron operator, but having retry in the util function allows server to retry resubmit when there's an intermittent issue communicating with the controller, enabling a better user experience.
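The retry-on-transient-error idea described above can be sketched roughly as follows. This is a minimal illustration using only the Go standard library, not the code from this PR: errTransient, errInvalid, isTransient, and retryOnTransient are hypothetical names invented here, and the real change relies on the project's existing transient-error handling instead.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient is a hypothetical marker for errors that are worth retrying,
// such as a dropped connection to the API server.
var errTransient = errors.New("transient error")

// errInvalid is a hypothetical example of an error that should NOT be retried.
var errInvalid = errors.New("invalid workflow spec")

// isTransient is a hypothetical stand-in for a project-specific check.
func isTransient(err error) bool {
	return errors.Is(err, errTransient)
}

// retryOnTransient calls create until it succeeds, returns a non-transient
// error, or has been attempted `attempts` times. The delay doubles between
// attempts.
func retryOnTransient(attempts int, delay time.Duration, create func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 0; i < attempts; i++ {
		err = create()
		if err == nil || !isTransient(err) {
			// Success, or an error that retrying will not fix.
			return err
		}
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}
```

Placing the loop in a shared helper means every caller of that helper gets the same retry behaviour, which is the reasoning given above for putting the retry in the util function rather than in the cron operator alone.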
Verification
The code style is identical to how other transient patterns are written.
We currently don't have a way to force a k8s network error, so it is hard to write an additional test.
We do have a test for the transient utility, and a successful e2e test run indicates that no regression has been introduced.