Cannot delete failed Workflow due to ArtifactGC finalizer #13499
Comments
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubectl.kubernetes.io/default-container: main
    workflows.argoproj.io/node-id: artifact-gc-qzcrv
    workflows.argoproj.io/node-name: artifact-gc-qzcrv
  creationTimestamp: "2024-08-24T10:08:30Z"
  labels:
    workflows.argoproj.io/completed: "true"
    workflows.argoproj.io/workflow: artifact-gc-qzcrv
  name: artifact-gc-qzcrv
  namespace: default
argo-workflows/workflow/controller/controller.go Lines 1232 to 1238 in ddbb3c7
This causes anyPodSuccess to always be false:
argo-workflows/workflow/controller/artifact_gc.go Lines 506 to 512 in ddbb3c7
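A minimal sketch of the failure mode described above, written as a simplified model and not as the controller's actual code. It assumes that the referenced controller.go lines make pods labeled workflows.argoproj.io/completed: "true" invisible to the controller. The Pod type and the visiblePods helper are hypothetical names, and anyPodSuccess here only borrows the variable name from the comment.

```go
package main

import "fmt"

// Pod is a hypothetical, stripped-down stand-in for a Kubernetes Pod.
type Pod struct {
	Name      string
	Labels    map[string]string
	Succeeded bool
}

// completedLabel is the label shown on the Pod manifest in this issue.
const completedLabel = "workflows.argoproj.io/completed"

// visiblePods models the ASSUMED behaviour: the controller only sees
// pods that are not labeled completed="true".
func visiblePods(all []Pod) []Pod {
	var out []Pod
	for _, p := range all {
		if p.Labels[completedLabel] != "true" {
			out = append(out, p)
		}
	}
	return out
}

// anyPodSuccess reports whether at least one of the given pods succeeded.
func anyPodSuccess(pods []Pod) bool {
	for _, p := range pods {
		if p.Succeeded {
			return true
		}
	}
	return false
}

func main() {
	// A hypothetical pod that succeeded and was then labeled completed.
	all := []Pod{{
		Name:      "example-pod",
		Labels:    map[string]string{completedLabel: "true"},
		Succeeded: true,
	}}

	fmt.Println(anyPodSuccess(all))              // true: the pod did succeed
	fmt.Println(anyPodSuccess(visiblePods(all))) // false: the filter hides it
}
```

Under this assumption, any check that runs only over the filtered list can never observe a successful pod, which would match the behaviour reported here.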
See my review comment. This seems like the exact use-case for the |
Looks like this might be caused by a failure of TaskResult reconciliation (so ArtifactGC didn't run yet), in which case it would share a root cause with #12993 and be fixed by #13454. See my new comment on the PR.
Try running your Controller image with |
I have tried it. In fact, I have debugged it locally with the latest code and it cannot be deleted successfully unless
To clarify, in this scenario no ArtifactGC Pods were launched, since no artifacts were created?
Ah, I see what you mean, thanks for elaborating! Yes, if all artifacts were already deleted, then I think this check is just to make sure that there are no failed ArtifactGC Pods lying around? 🤔 cc @juliev0
Hey all. This is really interesting. You've definitely happened upon at least one bug. But you know, while the (in which case, we can decide if it should even have that label) |
Looking at the |
Now I'm seeing in your PR that you mention that the Workflow failed due to not being able to run the image. Okay, let me respond over there... So, it seems like root cause is really that you never had any ArtifactGC Pods to begin with, right? And the |
#13499 (#13500) Signed-off-by: joey <[email protected]>
Pre-requisites
:latest image tag (i.e. quay.io/argoproj/workflow-controller:latest) and can confirm the issue still exists on :latest. If not, I have explained why, in detail, in my description below.
What happened? What did you expect to happen?
Run the example examples/artifact-gc-workflow.yaml:
kubectl create -f examples/artifact-gc-workflow.yaml
Get the workflow:
I found that the finalizers still existed; they should have been removed.
Version(s)
v3.5.10
Paste a minimal workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.
Run examples/artifact-gc-workflow.yaml on an ARM Mac, and then delete the workflow.
Logs from the workflow controller
Logs from your workflow's wait container