v3.5.8: workflow shutdown with strategy: Terminate, but stuck in Running
#13726
@jswxstw A small change can update the status of the wf:
if label == "false" && (old.IsPodDeleted() || old.FailedOrError()) {
	if recentlyDeleted(old) {
		woc.log.WithField("nodeID", nodeID).Debug("Wait for marking task result as completed because pod is recently deleted.")
		// If the pod was deleted, then it is possible that the controller never gets another informer message about it.
		// In this case, the workflow will only be requeued after the resync period (20m). This means the
		// workflow will not update for 20m. Requeuing here prevents that from happening.
		woc.requeue()
		continue
	} else {
		woc.log.WithField("nodeID", nodeID).Info("Marking task result as completed because pod has been deleted for a while.")
		woc.wf.Status.MarkTaskResultComplete(nodeID)
	}
}
This should be fixed in 3.5.11.
@Joibel Could you paste the PR links?
@jswxstw I cherry-picked the PR onto v3.5.8, but it does not fix the problem.
the status of the pod is not
I'll check it out later.
@zhucan Please check whether you have an RBAC problem (see #13537 (comment)); the controller will rely on
The root cause may be as below:
You checked this off, but did not test with
You also did not provide a reproduction or logs, which makes this difficult if not impossible to investigate. Please fill out the issue template accurately and in full; it is there for a reason. It is not optional.
I've told you this before: that means you're running a fork, and we don't support forks (that's not possible by definition). You can file an issue in that fork.
I checked the controller logs; there are no RBAC warnings. @jswxstw
We can't always upgrade to the latest version whenever there is a bug in the version we run. We need to know which PR fixes it rather than upgrading, because we don't know whether the new version has other bugs. If you can't help with that, there is no need to answer the question.
It failed directly, but the error message is not "pod deleted"; it is "workflow shutdown with strategy: Terminate". The status is the same, but the error message is different. @jswxstw
The issue template asks that you, at minimum, check whether
You could say this of literally any software. Virtually all software has bugs. If you were to follow this and fork every dependency of yours, you wouldn't be doing anything other than dependency management (that is a big part of software development these days, but usually not the only thing). You're using Argo as a dependency, so if you update other dependencies to fix bugs, you would do the same with Argo.
That's not how OSS works -- you filed a bug report for a fork to the origin. Your bug report is therefore invalid as this is not that fork. If you want support for a fork, you can pay a vendor for that. You should not expect community support from the origin for your own fork; that is neither possible (by definition) nor sustainable.
@zhucan This is a fix for #12993 and #13533, in which the wait container exited abnormally because of pod deletion. There are two related PRs: #13454 and #13537. Workflow shutdown will not cause the wait container to exit abnormally, so this issue should not exist in v3.5.8. I can't help more, since you provided very little information.
Pre-requisites
:latest image tag (i.e. quay.io/argoproj/workflow-controller:latest) and can confirm the issue still exists on :latest. If not, I have explained why, in detail, in my description below.
What happened? What did you expect to happen?
The workflow was shut down with strategy: Terminate, but the status of the workflow is stuck in the Running state.
I expect the taskresults to be completed and the status of the workflow not to be stuck in the Running state.
Version(s)
v3.5.8
Paste a minimal workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.
I don't remember how to reproduce it.
Logs from the workflow controller
Logs from in your workflow's wait container