HTTP Template with Retry Strategy temporarily hangs after request #13444

Closed
3 of 4 tasks
wesleyscholl opened this issue Aug 8, 2024 · 5 comments
Labels
area/controller Controller issues, panics area/templates/http solution/duplicate This issue or PR is a duplicate of an existing one type/bug type/regression Regression from previous behavior (a specific type of bug)

Comments

@wesleyscholl

Pre-requisites

  • I have double-checked my configuration
  • I have tested with the :latest image tag (i.e. quay.io/argoproj/workflow-controller:latest) and can confirm the issue still exists on :latest. If not, I have explained why, in detail, in my description below.
  • I have searched existing issues and could not find a match for this bug
  • I'd like to contribute the fix myself (see contributing guide)

What happened? What did you expect to happen?

HTTP Template With Retry Strategy Hangs After Successful Or Failed Request

After upgrading to v3.5.5 (the solution to issue #11889), our workflows that use HTTP templates hang for random amounts of time, anywhere from 5 to 30+ minutes, before continuing. We have tested this in multiple environments. I think it has something to do with the task set, but I can't be sure. Any ideas?

[Image: Screenshot 2024-08-08 at 6 47 20 PM]

#11889

Version(s)

v3.5.5

Paste a minimal workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

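# apiVersion, kind, and entrypoint were missing from the original paste;
# entrypoint is assumed to be the only steps template, start-test-fail.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  name: nss-test-failure-test
spec:
  entrypoint: start-test-fail
  templates: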
    - name: start-test-fail
      inputs: {}
      outputs: {}
      metadata: {}
      steps:
        - - name: first-step
            template: http-retry
            arguments:
              parameters:
                - name: url
                  value: http://httpstat.us/Random/200
        - - name: get-status
            template: http-retry
            arguments:
              parameters:
                - name: url
                  value: http://httpstat.us/Random/400-404,500-504
    - name: http-retry
      inputs:
        parameters:
          - name: url
      outputs: {}
      metadata: {}
      http:
        method: GET
        url: '{{inputs.parameters.url}}'
        timeoutSeconds: 20
        successCondition: response.statusCode == 200
      retryStrategy:
        limit: 1
        retryPolicy: Always

Logs from the workflow controller

kubectl logs -n argo deploy/workflow-controller | grep ${workflow}

 kubectl logs -n argo argo-argo-workflows-workflow-controller | grep nss-test-failure-test-qgr6n
time="2024-08-08T22:13:14.881Z" level=info msg="Processing workflow" Phase= ResourceVersion=1046286529 namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:14.918Z" level=info msg="Task-result reconciliation" namespace=new-store-setup numObjs=0 workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:14.918Z" level=info msg="Updated phase  -> Running" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:14.918Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:14.919Z" level=info msg="was unable to obtain node for , letting display name to be nodeName" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:14.919Z" level=info msg="Steps node nss-test-failure-test-qgr6n initialized Running" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:14.919Z" level=info msg="StepGroup node nss-test-failure-test-qgr6n-614207271 initialized Running" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:14.919Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:14.919Z" level=info msg="Retry node nss-test-failure-test-qgr6n-3810216484 initialized Running" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:14.919Z" level=info msg="HTTP node nss-test-failure-test-qgr6n-2238001551 initialized Pending" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:14.919Z" level=info msg="Workflow step group node nss-test-failure-test-qgr6n-614207271 not yet completed" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:14.919Z" level=info msg="TaskSet Reconciliation" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:14.919Z" level=info msg="Creating TaskSet" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:14.935Z" level=info msg=reconcileAgentPod namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:15.002Z" level=info msg="Created Agent pod" namespace=new-store-setup podName=nss-test-failure-test-qgr6n-1340600742-agent workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:15.002Z" level=info msg=updateAgentPodStatus namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:15.002Z" level=info msg=assessAgentPodStatus namespace=new-store-setup podName=nss-test-failure-test-qgr6n-1340600742-agent
time="2024-08-08T22:13:15.024Z" level=info msg="Workflow update successful" namespace=new-store-setup phase=Running resourceVersion=1046286542 workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:15.092Z" level=warning msg="error updating taskset" error="failed patching taskset: workflowtasksets.argoproj.io \"nss-test-failure-test-qgr6n\" is forbidden: User \"system:serviceaccount:argo:argo-argo-workflows-workflow-controller\" cannot patch resource \"workflowtasksets/status\" in API group \"argoproj.io\" in the namespace \"new-store-setup\": Azure does not have opinion for this user." namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:25.004Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=1046286542 namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:25.004Z" level=info msg="Task-result reconciliation" namespace=new-store-setup numObjs=0 workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:25.004Z" level=info msg=updateAgentPodStatus namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:25.004Z" level=info msg=assessAgentPodStatus namespace=new-store-setup podName=nss-test-failure-test-qgr6n-1340600742-agent
time="2024-08-08T22:13:25.004Z" level=error msg="was unable to obtain node for nss-test-failure-test-qgr6n-2166136261" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:25.004Z" level=info msg="Workflow step group node nss-test-failure-test-qgr6n-614207271 not yet completed" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:25.005Z" level=info msg="TaskSet Reconciliation" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:25.005Z" level=info msg="Creating TaskSet" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:25.033Z" level=info msg=reconcileAgentPod namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:25.033Z" level=info msg=updateAgentPodStatus namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:25.033Z" level=info msg=assessAgentPodStatus namespace=new-store-setup podName=nss-test-failure-test-qgr6n-1340600742-agent
time="2024-08-08T22:13:35.033Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=1046286542 namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:35.033Z" level=info msg="Task-result reconciliation" namespace=new-store-setup numObjs=0 workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:35.033Z" level=info msg=updateAgentPodStatus namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:35.033Z" level=info msg=assessAgentPodStatus namespace=new-store-setup podName=nss-test-failure-test-qgr6n-1340600742-agent
time="2024-08-08T22:13:35.033Z" level=error msg="was unable to obtain node for nss-test-failure-test-qgr6n-2166136261" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:35.034Z" level=info msg="Workflow step group node nss-test-failure-test-qgr6n-614207271 not yet completed" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:35.034Z" level=info msg="TaskSet Reconciliation" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:35.034Z" level=info msg="Creating TaskSet" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:35.065Z" level=info msg=reconcileAgentPod namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:35.065Z" level=info msg=updateAgentPodStatus namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:35.065Z" level=info msg=assessAgentPodStatus namespace=new-store-setup podName=nss-test-failure-test-qgr6n-1340600742-agent
time="2024-08-08T22:13:35.091Z" level=info msg="Workflow update successful" namespace=new-store-setup phase=Running resourceVersion=1046287063 workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:13:35.095Z" level=warning msg="error updating taskset" error="failed patching taskset: workflowtasksets.argoproj.io \"nss-test-failure-test-qgr6n\" is forbidden: User \"system:serviceaccount:argo:argo-argo-workflows-workflow-controller\" cannot patch resource \"workflowtasksets/status\" in API group \"argoproj.io\" in the namespace \"new-store-setup\": Azure does not have opinion for this user." namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:42.487Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=1046287063 namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:42.487Z" level=info msg="Task-result reconciliation" namespace=new-store-setup numObjs=0 workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:42.487Z" level=info msg=updateAgentPodStatus namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:42.487Z" level=info msg=assessAgentPodStatus namespace=new-store-setup podName=nss-test-failure-test-qgr6n-1340600742-agent
time="2024-08-08T22:19:42.487Z" level=error msg="was unable to obtain node for nss-test-failure-test-qgr6n-2166136261" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:42.487Z" level=info msg="node nss-test-failure-test-qgr6n-3810216484 phase Running -> Succeeded" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:42.487Z" level=info msg="node nss-test-failure-test-qgr6n-3810216484 finished: 2024-08-08 22:19:42.487834837 +0000 UTC" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:42.487Z" level=info msg="Step group node nss-test-failure-test-qgr6n-614207271 successful" namespace=new-store-setup workflow=nss-test-failure-test-qgr6
time="2024-08-08T22:19:42.487Z" level=info msg="node nss-test-failure-test-qgr6n-614207271 phase Running -> Succeeded" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:42.487Z" level=info msg="node nss-test-failure-test-qgr6n-614207271 finished: 2024-08-08 22:19:42.487884971 +0000 UTC" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:42.487Z" level=info msg="StepGroup node nss-test-failure-test-qgr6n-681464842 initialized Running" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:42.487Z" level=info msg="SG Outbound nodes of nss-test-failure-test-qgr6n-3810216484 are [nss-test-failure-test-qgr6n-2238001551]" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:42.488Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:42.488Z" level=info msg="Retry node nss-test-failure-test-qgr6n-3226656409 initialized Running" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:42.488Z" level=info msg="HTTP node nss-test-failure-test-qgr6n-2983926208 initialized Pending" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:42.488Z" level=info msg="Workflow step group node nss-test-failure-test-qgr6n-681464842 not yet completed" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:42.488Z" level=info msg="TaskSet Reconciliation" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:42.488Z" level=info msg="Creating TaskSet" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:42.525Z" level=info msg=reconcileAgentPod namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:42.525Z" level=info msg=updateAgentPodStatus namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:42.525Z" level=info msg=assessAgentPodStatus namespace=new-store-setup podName=nss-test-failure-test-qgr6n-1340600742-agent
time="2024-08-08T22:19:42.552Z" level=info msg="Workflow update successful" namespace=new-store-setup phase=Running resourceVersion=1046297180 workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:42.557Z" level=warning msg="error updating taskset" error="failed patching taskset: workflowtasksets.argoproj.io \"nss-test-failure-test-qgr6n\" is forbidden: User \"system:serviceaccount:argo:argo-argo-workflows-workflow-controller\" cannot patch resource \"workflowtasksets/status\" in API group \"argoproj.io\" in the namespace \"new-store-setup\": Azure does not have opinion for this user." namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:52.526Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=1046297180 namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:52.526Z" level=info msg="Task-result reconciliation" namespace=new-store-setup numObjs=0 workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:52.526Z" level=info msg=updateAgentPodStatus namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:52.526Z" level=info msg=assessAgentPodStatus namespace=new-store-setup podName=nss-test-failure-test-qgr6n-1340600742-agent
time="2024-08-08T22:19:52.526Z" level=error msg="was unable to obtain node for nss-test-failure-test-qgr6n-2166136261" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:52.526Z" level=info msg="SG Outbound nodes of nss-test-failure-test-qgr6n-3810216484 are [nss-test-failure-test-qgr6n-2238001551]" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:52.526Z" level=info msg="Workflow step group node nss-test-failure-test-qgr6n-681464842 not yet completed" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:52.526Z" level=info msg="TaskSet Reconciliation" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:52.526Z" level=info msg="Creating TaskSet" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:52.564Z" level=info msg=reconcileAgentPod namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:52.564Z" level=info msg=updateAgentPodStatus namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:52.564Z" level=info msg=assessAgentPodStatus namespace=new-store-setup podName=nss-test-failure-test-qgr6n-1340600742-agent
time="2024-08-08T22:19:52.611Z" level=info msg="Workflow update successful" namespace=new-store-setup phase=Running resourceVersion=1046297421 workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:19:52.619Z" level=warning msg="error updating taskset" error="failed patching taskset: workflowtasksets.argoproj.io \"nss-test-failure-test-qgr6n\" is forbidden: User \"system:serviceaccount:argo:argo-argo-workflows-workflow-controller\" cannot patch resource \"workflowtasksets/status\" in API group \"argoproj.io\" in the namespace \"new-store-setup\": Azure does not have opinion for this user." namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:42.489Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=1046297421 namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:42.489Z" level=info msg="Task-result reconciliation" namespace=new-store-setup numObjs=0 workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:42.489Z" level=info msg=updateAgentPodStatus namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:42.489Z" level=info msg=assessAgentPodStatus namespace=new-store-setup podName=nss-test-failure-test-qgr6n-1340600742-agent
time="2024-08-08T22:39:42.489Z" level=error msg="was unable to obtain node for nss-test-failure-test-qgr6n-2166136261" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:42.489Z" level=info msg="SG Outbound nodes of nss-test-failure-test-qgr6n-3810216484 are [nss-test-failure-test-qgr6n-2238001551]" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:42.489Z" level=info msg="Retry Policy: Always (onFailed: true, onError true)" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:42.489Z" level=info msg="1 child nodes of nss-test-failure-test-qgr6n[1].get-status failed. Trying again..." namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:42.490Z" level=info msg="HTTP node nss-test-failure-test-qgr6n-3721994349 initialized Pending" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:42.490Z" level=info msg="Workflow step group node nss-test-failure-test-qgr6n-681464842 not yet completed" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:42.490Z" level=info msg="TaskSet Reconciliation" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:42.490Z" level=info msg="Creating TaskSet" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:42.533Z" level=info msg=reconcileAgentPod namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:42.533Z" level=info msg=updateAgentPodStatus namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:42.533Z" level=info msg=assessAgentPodStatus namespace=new-store-setup podName=nss-test-failure-test-qgr6n-1340600742-agent
time="2024-08-08T22:39:42.559Z" level=info msg="Workflow update successful" namespace=new-store-setup phase=Running resourceVersion=1046330093 workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:42.563Z" level=warning msg="error updating taskset" error="failed patching taskset: workflowtasksets.argoproj.io \"nss-test-failure-test-qgr6n\" is forbidden: User \"system:serviceaccount:argo:argo-argo-workflows-workflow-controller\" cannot patch resource \"workflowtasksets/status\" in API group \"argoproj.io\" in the namespace \"new-store-setup\": Azure does not have opinion for this user." namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:52.534Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=1046330093 namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:52.534Z" level=info msg="Task-result reconciliation" namespace=new-store-setup numObjs=0 workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:52.534Z" level=info msg=updateAgentPodStatus namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:52.535Z" level=info msg=assessAgentPodStatus namespace=new-store-setup podName=nss-test-failure-test-qgr6n-1340600742-agent
time="2024-08-08T22:39:52.535Z" level=error msg="was unable to obtain node for nss-test-failure-test-qgr6n-2166136261" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:52.535Z" level=info msg="SG Outbound nodes of nss-test-failure-test-qgr6n-3810216484 are [nss-test-failure-test-qgr6n-2238001551]" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:52.535Z" level=info msg="Workflow step group node nss-test-failure-test-qgr6n-681464842 not yet completed" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:52.535Z" level=info msg="TaskSet Reconciliation" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:52.535Z" level=info msg="Creating TaskSet" namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:52.563Z" level=info msg=reconcileAgentPod namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:52.563Z" level=info msg=updateAgentPodStatus namespace=new-store-setup workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:52.563Z" level=info msg=assessAgentPodStatus namespace=new-store-setup podName=nss-test-failure-test-qgr6n-1340600742-agent
time="2024-08-08T22:39:52.592Z" level=info msg="Workflow update successful" namespace=new-store-setup phase=Running resourceVersion=1046330363 workflow=nss-test-failure-test-qgr6n
time="2024-08-08T22:39:52.597Z" level=warning msg="error updating taskset" error="failed patching taskset: workflowtasksets.argoproj.io \"nss-test-failure-test-qgr6n\" is forbidden: User \"system:serviceaccount:argo:argo-argo-workflows-workflow-controller\" cannot patch resource \"workflowtasksets/status\" in API group \"argoproj.io\" in the namespace \"new-store-setup\": Azure does not have opinion for this user." namespace=new-store-setup workflow=nss-test-failure-test-qgr6n

Logs from your workflow's wait container

kubectl logs -n argo -c wait -l workflows.argoproj.io/workflow=${workflow},workflow.argoproj.io/phase!=Succeeded


kubectl logs -n new-store-setup -c wait nss-test-failure-test-qgr6n-1340600742-agent 
error: container wait is not valid for pod nss-test-failure-test-qgr6n-1340600742-agent
@agilgur5 agilgur5 added the solution/outdated This is not up-to-date with the current version label Aug 8, 2024
@agilgur5 agilgur5 changed the title BUG: HTTP Template With Retry Strategy Hangs After Successful/Failed Request HTTP Template With Retry Strategy Hangs After Successful/Failed Request Aug 8, 2024
@agilgur5 agilgur5 changed the title HTTP Template With Retry Strategy Hangs After Successful/Failed Request HTTP Template With Retry Strategy Hangs Request Aug 8, 2024
@agilgur5 agilgur5 changed the title HTTP Template With Retry Strategy Hangs Request HTTP Template with Retry Strategy hangs after request Aug 8, 2024
@agilgur5 agilgur5 changed the title HTTP Template with Retry Strategy hangs after request HTTP Template with Retry Strategy temporarily hangs after request Aug 8, 2024
@agilgur5 agilgur5 added area/templates/http area/retryStrategy Template-level retryStrategy labels Aug 8, 2024
@agilgur5

agilgur5 commented Aug 8, 2024

> v3.5.5

Please try with at least v3.5.10 or :latest, as the issue template requires

> time="2024-08-08T22:39:52.597Z" level=warning msg="error updating taskset" error="failed patching taskset: workflowtasksets.argoproj.io \"nss-test-failure-test-qgr6n\" is forbidden: User \"system:serviceaccount:argo:argo-argo-workflows-workflow-controller\" cannot patch resource \"workflowtasksets/status\" in API group \"argoproj.io\" in the namespace \"new-store-setup\": Azure does not have opinion for this user." namespace=new-store-setup workflow=nss-test-failure-test-qgr6n

This warning sounds pretty relevant. It also duplicates #13341. Were you not getting this warning in v3.4?

@agilgur5 agilgur5 added area/controller Controller issues, panics solution/duplicate This issue or PR is a duplicate of an existing one and removed area/retryStrategy Template-level retryStrategy labels Aug 8, 2024
@wesleyscholl
Author

wesleyscholl commented Aug 9, 2024

Thanks for the response. We have fixed this issue. The service account did not have permission on workflowtasksets/status. After adding workflowtasksets/status to its role, the HTTP templates functioned as expected.
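
For anyone hitting the same warning, the missing permission can be sketched as an RBAC rule. This is only a minimal sketch, assuming a cluster-scoped install: the ClusterRole and ClusterRoleBinding names are hypothetical, while the API group, resource, verb, and service account are taken from the "error updating taskset" warning in the controller log above. If RBAC is managed through a Helm chart or the upstream manifests, adding the rule to the existing controller role is probably the better route.

```yaml
# Sketch only: grants the one verb the controller log reports as forbidden.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: workflow-controller-taskset-status   # hypothetical name
rules:
  - apiGroups: ["argoproj.io"]
    resources: ["workflowtasksets/status"]   # status subresource from the warning
    verbs: ["patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: workflow-controller-taskset-status   # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: workflow-controller-taskset-status
subjects:
  - kind: ServiceAccount
    name: argo-argo-workflows-workflow-controller   # service account named in the warning
    namespace: argo
```

A ClusterRole is used here because the controller in the log runs in the argo namespace but patches a task set in new-store-setup; a namespaced Role and RoleBinding in each workflow namespace would be the narrower alternative.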

@agilgur5

Which SA in particular?

@wesleyscholl
Author

wesleyscholl commented Aug 10, 2024

system:serviceaccount:argo:argo-argo-workflows-workflow-controller

@wesleyscholl
Author

> Were you not getting this warning in v3.4?

And we were not getting this warning or issue in v3.5.4.

@agilgur5 agilgur5 added type/regression Regression from previous behavior (a specific type of bug) and removed solution/outdated This is not up-to-date with the current version labels Aug 10, 2024
@agilgur5 agilgur5 added this to the v3.5.x patches milestone Aug 10, 2024