
docker dind sidecar iptables issue #3159

Open
7 tasks done
iamcaleberic opened this issue Dec 15, 2023 · 22 comments
Labels
bug Something isn't working community Community contribution needs triage Requires review from the maintainers

Comments


iamcaleberic commented Dec 15, 2023

Checks

Controller Version

v0.27.6

Helm Chart Version

0.23.6

CertManager Version

1.13.2

Deployment Method

Helm

cert-manager installation

Are you sure you've installed cert-manager from an official source?
Yes, using the official Jetstack Helm repo.

Checks

  • This isn't a question or user support case (for Q&A and community support, go to Discussions; it might also be a good idea to contact contributors and maintainers directly if your business is critical and you need priority support)
  • I've read the release notes before submitting this issue and I'm sure it's not due to any recently introduced backward-incompatible changes
  • My actions-runner-controller version (v0.x.y) does support the feature
  • I've already upgraded ARC (including the CRDs, see charts/actions-runner-controller/docs/UPGRADING.md for details) to the latest and it didn't fix the issue
  • I've migrated to the workflow job webhook event (if you're using webhook-driven scaling)

Resource Definitions

apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: xxx-runnerdeploy
  namespace: actions-runner-system
spec:
  replicas: 3
  template:
    spec:
      repository: xxx/redacted

To Reproduce

Create a new RunnerDeployment

Describe the bug

The docker dind sidecar errors out and never starts, so the runner pod ends up restarting every 120 seconds (docker's startup timeout).

Might be related to

docker-library/docker@4c2674d
docker-library/docker#437
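
Based on the linked docker-library changes, the dind entrypoint now prefers iptables-nft, which can fail on hosts that only support legacy iptables. A possible workaround (untested here, and assuming the `dockerEnv`/`dockerImage` fields of the RunnerDeployment spec and the `DOCKER_IPTABLES_LEGACY` entrypoint switch apply to the versions in use) is to force legacy iptables in the sidecar, or pin the dind image to a tag from before the change:

```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: xxx-runnerdeploy
  namespace: actions-runner-system
spec:
  replicas: 3
  template:
    spec:
      repository: xxx/redacted
      # Option 1: ask the dind entrypoint to use iptables-legacy
      # (DOCKER_IPTABLES_LEGACY is honored by recent docker-library/docker entrypoints)
      dockerEnv:
        - name: DOCKER_IPTABLES_LEGACY
          value: "1"
      # Option 2: pin the sidecar to an older dind tag (example tag, adjust as needed)
      # dockerImage: docker:24.0.6-dind
```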

Describe the expected behavior

The dind sidecar should start successfully.

Whole Controller Logs

manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    received request    {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "UID": "14410137-32a5-433f-aeeb-ba1ea3b75b02", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    wrote response    {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "14410137-32a5-433f-aeeb-ba1ea3b75b02", "allowed": true}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    received request    {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "UID": "639578c7-6c69-4e40-ab71-c8dc21eeb9ab", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
manager 2023-12-15T14:24:50Z    INFO    runner-resource    validate resource to be updated    {"name": "repo-provisioner-runnerdeploy-m9v2d-jhlp9"}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    wrote response    {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "639578c7-6c69-4e40-ab71-c8dc21eeb9ab", "allowed": true}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    received request    {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "UID": "00c0ff92-0984-4a56-aeba-8ab9e4b948b4", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    wrote response    {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "00c0ff92-0984-4a56-aeba-8ab9e4b948b4", "allowed": true}
manager 2023-12-15T14:24:50Z    INFO    runner    Removed finalizer    {"runner": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d-jhlp9"}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    received request    {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "UID": "195dc35b-a194-4ba9-9865-dc4fef86cce1", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
manager 2023-12-15T14:24:50Z    INFO    runner-resource    validate resource to be created    {"name": "repo-provisioner-runnerdeploy-m9v2d-lmgwf"}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    wrote response    {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "195dc35b-a194-4ba9-9865-dc4fef86cce1", "allowed": true}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    received request    {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "UID": "57ddbdb8-11de-4777-8ad3-26a32f12a01c", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    wrote response    {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "57ddbdb8-11de-4777-8ad3-26a32f12a01c", "allowed": true}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    received request    {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "UID": "861cf2af-c4f4-4853-ac52-c435d65cecd3", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
manager 2023-12-15T14:24:50Z    INFO    runner-resource    validate resource to be updated    {"name": "repo-provisioner-runnerdeploy-m9v2d-7xfgj"}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    wrote response    {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "861cf2af-c4f4-4853-ac52-c435d65cecd3", "allowed": true}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    received request    {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "UID": "a407fc97-9e13-47ae-a360-53dde8d4aaa6", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    wrote response    {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "a407fc97-9e13-47ae-a360-53dde8d4aaa6", "allowed": true}
manager 2023-12-15T14:24:50Z    INFO    runner    Removed finalizer    {"runner": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d-7xfgj"}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    received request    {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "UID": "55a1019b-292e-4f11-8468-d92bf925a562", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
manager 2023-12-15T14:24:50Z    INFO    runner-resource    validate resource to be created    {"name": "repo-provisioner-runnerdeploy-m9v2d-g4j94"}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    wrote response    {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "55a1019b-292e-4f11-8468-d92bf925a562", "allowed": true}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    received request    {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "UID": "a0507a08-f8a9-4fbe-8fee-af0557d5c51c", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    wrote response    {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "a0507a08-f8a9-4fbe-8fee-af0557d5c51c", "allowed": true}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    received request    {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "UID": "e9065570-8cad-41c0-bb55-229de37621af", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
manager 2023-12-15T14:24:50Z    INFO    runner-resource    validate resource to be updated    {"name": "repo-provisioner-runnerdeploy-m9v2d-lmgwf"}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    wrote response    {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "e9065570-8cad-41c0-bb55-229de37621af", "allowed": true}
manager 2023-12-15T14:24:50Z    DEBUG    runnerreplicaset    Created replica(s)    {"runnerreplicaset": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d", "lastSyncTime": "2023-12-15T14:24:07Z", "effectiveTime": "<nil>", "templateHashDesired": "5646cf87b7", "replicasDesired": 3, "replicasPending": 0, "replicasRunning": 1, "replicasMaybeRunning": 1, "templateHashObserved": ["5646cf87b7"], "created": 2}
manager 2023-12-15T14:24:50Z    DEBUG    runnerreplicaset    Skipped reconcilation because owner is not synced yet    {"runnerreplicaset": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d", "owner": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d-lmgwf", "pods": null}
manager 2023-12-15T14:24:50Z    DEBUG    runnerreplicaset    Skipped reconcilation because owner is not synced yet    {"runnerreplicaset": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d", "owner": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d-lmgwf", "pods": null}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    received request    {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "UID": "4b5759c5-7465-4e73-9ab6-6486b2e9a09c", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    wrote response    {"webhook": "/mutate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "4b5759c5-7465-4e73-9ab6-6486b2e9a09c", "allowed": true}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    received request    {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "UID": "0aebc261-1922-491d-a79a-1cfe9658ca50", "kind": "actions.summerwind.dev/v1alpha1, Kind=Runner", "resource": {"group":"actions.summerwind.dev","version":"v1alpha1","resource":"runners"}}
manager 2023-12-15T14:24:50Z    INFO    runner-resource    validate resource to be updated    {"name": "repo-provisioner-runnerdeploy-m9v2d-g4j94"}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    wrote response    {"webhook": "/validate-actions-summerwind-dev-v1alpha1-runner", "code": 200, "reason": "", "UID": "0aebc261-1922-491d-a79a-1cfe9658ca50", "allowed": true}
manager 2023-12-15T14:24:50Z    DEBUG    runnerreplicaset    Skipped reconcilation because owner is not synced yet    {"runnerreplicaset": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d", "owner": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d-lmgwf", "pods": null}
manager 2023-12-15T14:24:50Z    INFO    runner    Updated registration token    {"runner": "repo-provisioner-runnerdeploy-m9v2d-lmgwf", "repository": "org/repo-provisioner"}
manager 2023-12-15T14:24:50Z    INFO    runnerpod    Runner pod has been stopped with a successful status.    {"runnerpod": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d-7xfgj"}
manager 2023-12-15T14:24:50Z    DEBUG    runnerreplicaset    Skipped reconcilation because owner is not synced yet    {"runnerreplicaset": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d", "owner": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d-lmgwf", "pods": null}
manager 2023-12-15T14:24:50Z    DEBUG    events    Successfully update registration token    {"type": "Normal", "object": {"kind":"Runner","namespace":"actions-runner-system","name":"repo-provisioner-runnerdeploy-m9v2d-lmgwf","uid":"352bedde-f17b-451e-affd-83fabd954a0c","apiVersion":"actions.summerwind.dev/v1alpha1","resourceVersion":"294112600"}, "reason": "RegistrationTokenUpdated"}
manager 2023-12-15T14:24:50Z    DEBUG    runnerreplicaset    Skipped reconcilation because owner is not synced yet    {"runnerreplicaset": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d", "owner": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d-lmgwf", "pods": null}
manager 2023-12-15T14:24:50Z    INFO    runner    Updated registration token    {"runner": "repo-provisioner-runnerdeploy-m9v2d-g4j94", "repository": "org/repo-provisioner"}
manager 2023-12-15T14:24:50Z    DEBUG    events    Successfully update registration token    {"type": "Normal", "object": {"kind":"Runner","namespace":"actions-runner-system","name":"repo-provisioner-runnerdeploy-m9v2d-g4j94","uid":"f4cdf000-243d-436a-b978-3dfa15751787","apiVersion":"actions.summerwind.dev/v1alpha1","resourceVersion":"294112603"}, "reason": "RegistrationTokenUpdated"}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    received request    {"webhook": "/mutate-runner-set-pod", "UID": "36fb173f-3faf-4ab1-af9f-1295c0aac5a1", "kind": "/v1, Kind=Pod", "resource": {"group":"","version":"v1","resource":"pods"}}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    wrote response    {"webhook": "/mutate-runner-set-pod", "code": 200, "reason": "", "UID": "36fb173f-3faf-4ab1-af9f-1295c0aac5a1", "allowed": true}
manager 2023-12-15T14:24:50Z    INFO    runner    Created runner pod    {"runner": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d-lmgwf", "repository": "org/repo-provisioner"}
manager 2023-12-15T14:24:50Z    DEBUG    events    Created pod 'repo-provisioner-runnerdeploy-m9v2d-lmgwf'    {"type": "Normal", "object": {"kind":"Runner","namespace":"actions-runner-system","name":"repo-provisioner-runnerdeploy-m9v2d-lmgwf","uid":"352bedde-f17b-451e-affd-83fabd954a0c","apiVersion":"actions.summerwind.dev/v1alpha1","resourceVersion":"294112604"}, "reason": "PodCreated"}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    received request    {"webhook": "/mutate-runner-set-pod", "UID": "896f3381-b9e7-41dd-90dd-54eb3c7ed2dc", "kind": "/v1, Kind=Pod", "resource": {"group":"","version":"v1","resource":"pods"}}
manager 2023-12-15T14:24:50Z    DEBUG    controller-runtime.webhook.webhooks    wrote response    {"webhook": "/mutate-runner-set-pod", "code": 200, "reason": "", "UID": "896f3381-b9e7-41dd-90dd-54eb3c7ed2dc", "allowed": true}
manager 2023-12-15T14:24:50Z    INFO    runner    Created runner pod    {"runner": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d-g4j94", "repository": "org/repo-provisioner"}
manager 2023-12-15T14:24:50Z    INFO    runnerpod    Runner pod has been stopped with a successful status.    {"runnerpod": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d-jhlp9"}
manager 2023-12-15T14:24:50Z    DEBUG    events    Created pod 'repo-provisioner-runnerdeploy-m9v2d-g4j94'    {"type": "Normal", "object": {"kind":"Runner","namespace":"actions-runner-system","name":"repo-provisioner-runnerdeploy-m9v2d-g4j94","uid":"f4cdf000-243d-436a-b978-3dfa15751787","apiVersion":"actions.summerwind.dev/v1alpha1","resourceVersion":"294112606"}, "reason": "PodCreated"}
manager 2023-12-15T14:24:50Z    DEBUG    runnerreplicaset    Skipped reconcilation because owner is not synced yet    {"runnerreplicaset": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d", "owner": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d-g4j94", "pods": [{"kind":"Pod","apiVersion":"v1","metadata":{"name":"repo-provisioner-runnerdeploy-m9v2d-g4j94","namespace":"actions-runner-system","uid":"c90f53fa-9df9-4b2d-aac4-ed50ad13c6be","resourceVersion":"294112615","creationTimestamp":"2023-12-15T14:24:50Z","labels":{"actions-runner":"","actions-runner-controller/inject-registration-token":"true","pod-template-hash":"765978d77b","runner-deployment-name":"repo-provisioner-runnerdeploy","runner-template-hash":"5646cf87b7"},"annotations":{"actions-runner-controller/token-expires-at":"2023-12-15T16:21:57+01:00","sync-time":"2023-12-15T14:24:50Z"},"ownerReferences":[{"apiVersion":"actions.summerwind.dev/v1alpha1","kind":"Runner","name":"repo-provisioner-runnerdeploy-m9v2d-g4j94","uid":"f4cdf000-243d-436a-b978-3dfa15751787","controller":true,"blockOwnerDeletion":true}],"managedFields":[{"manager":"manager","operation":"Update","apiVersion":"v1","time":"2023-12-15T14:24:50Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:annotations":{".":{},"f:sync-time":{}},"f:labels":{".":{},"f:actions-runner":{},"f:actions-runner-controller/inject-registration-token":{},"f:pod-template-hash":{},"f:runner-deployment-name":{},"f:runner-template-hash":{}},"f:ownerReferences":{".":{},"k:{\"uid\":\"f4cdf000-243d-436a-b978-3dfa15751787\"}":{}}},"f:spec":{"f:containers":{"k:{\"name\":\"docker\"}":{".":{},"f:args":{},"f:env":{".":{},"k:{\"name\":\"DOCKER_GROUP_GID\"}":{".":{},"f:name":{},"f:value":{}}},"f:image":{},"f:imagePullPolicy":{},"f:lifecycle":{".":{},"f:preStop":{".":{},"f:exec":{".":{},"f:command":{}}}},"f:name":{},"f:resources":{},"f:securityContext":{".":{},"f:privileged":{}},"f:terminationMessagePath":{},"f:terminationMessagePolicy":{},"f:volumeMounts":{
".":{},"k:{\"mountPath\":\"/run\"}":{".":{},"f:mountPath":{},"f:name":{}},"k:{\"mountPath\":\"/runner\"}":{".":{},"f:mountPath":{},"f:name":{}},"k:{\"mountPath\":\"/runner/_work\"}":{".":{},"f:mountPath":{},"f:name":{}}}},"k:{\"name\":\"runner\"}":{".":{},"f:env":{".":{},"k:{\"name\":\"DOCKERD_IN_RUNNER\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"DOCKER_ENABLED\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"DOCKER_HOST\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"GITHUB_ACTIONS_RUNNER_EXTRA_USER_AGENT\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"GITHUB_URL\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"RUNNER_ENTERPRISE\"}":{".":{},"f:name":{}},"k:{\"name\":\"RUNNER_EPHEMERAL\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"RUNNER_GROUP\"}":{".":{},"f:name":{}},"k:{\"name\":\"RUNNER_LABELS\"}":{".":{},"f:name":{}},"k:{\"name\":\"RUNNER_NAME\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"RUNNER_ORG\"}":{".":{},"f:name":{}},"k:{\"name\":\"RUNNER_REPO\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"RUNNER_STATUS_UPDATE_HOOK\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"RUNNER_TOKEN\"}":{".":{},"f:name":{},"f:value":{}},"k:{\"name\":\"RUNNER_WORKDIR\"}":{".":{},"f:name":{},"f:value":{}}},"f:image":{},"f:imagePullPolicy":{},"f:name":{},"f:resources":{},"f:securityContext":{},"f:terminationMessagePath":{},"f:terminationMessagePolicy":{},"f:volumeMounts":{".":{},"k:{\"mountPath\":\"/run\"}":{".":{},"f:mountPath":{},"f:name":{}},"k:{\"mountPath\":\"/runner\"}":{".":{},"f:mountPath":{},"f:name":{}},"k:{\"mountPath\":\"/runner/_work\"}":{".":{},"f:mountPath":{},"f:name":{}}}}},"f:dnsPolicy":{},"f:enableServiceLinks":{},"f:restartPolicy":{},"f:schedulerName":{},"f:securityContext":{},"f:terminationGracePeriodSeconds":{},"f:volumes":{".":{},"k:{\"name\":\"runner\"}":{".":{},"f:emptyDir":{},"f:name":{}},"k:{\"name\":\"var-run\"}":{".":{},"f:emptyDir":{".":{},"f:medium":{},"f:sizeLimit":{}},"f:name":{}},"k:{\"name\":\"wor
k\"}":{".":{},"f:emptyDir":{},"f:name":{}}}}}}]},"spec":{"volumes":[{"name":"runner","emptyDir":{}},{"name":"work","emptyDir":{}},{"name":"var-run","emptyDir":{"medium":"Memory","sizeLimit":"1M"}},{"name":"kube-api-access-fzg9v","projected":{"sources":[{"serviceAccountToken":{"expirationSeconds":3607,"path":"token"}},{"configMap":{"name":"kube-root-ca.crt","items":[{"key":"ca.crt","path":"ca.crt"}]}},{"downwardAPI":{"items":[{"path":"namespace","fieldRef":{"apiVersion":"v1","fieldPath":"metadata.namespace"}}]}}],"defaultMode":420}}],"containers":[{"name":"runner","image":"summerwind/actions-runner:latest","env":[{"name":"RUNNER_ORG"},{"name":"RUNNER_REPO","value":"org/repo-provisioner"},{"name":"RUNNER_ENTERPRISE"},{"name":"RUNNER_LABELS"},{"name":"RUNNER_GROUP"},{"name":"DOCKER_ENABLED","value":"true"},{"name":"DOCKERD_IN_RUNNER","value":"false"},{"name":"GITHUB_URL","value":"https://github.com/"},{"name":"RUNNER_WORKDIR","value":"/runner/_work"},{"name":"RUNNER_EPHEMERAL","value":"true"},{"name":"RUNNER_STATUS_UPDATE_HOOK","value":"false"},{"name":"GITHUB_ACTIONS_RUNNER_EXTRA_USER_AGENT","value":"actions-runner-controller/v0.27.6"},{"name":"DOCKER_HOST","value":"unix:///run/docker.sock"},{"name":"RUNNER_NAME","value":"repo-provisioner-runnerdeploy-m9v2d-g4j94"},{"name":"RUNNER_TOKEN","value":"A4REXNEC4CXND7YL7XSNSXLFPRXRK"}],"resources":{},"volumeMounts":[{"name":"runner","mountPath":"/runner"},{"name":"work","mountPath":"/runner/_work"},{"name":"var-run","mountPath":"/run"},{"name":"kube-api-access-fzg9v","readOnly":true,"mountPath":"/var/run/secrets/kubernetes.io/serviceaccount"}],"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"Always","securityContext":{}},{"name":"docker","image":"docker:dind","args":["dockerd","--host=unix:///run/docker.sock","--group=$(DOCKER_GROUP_GID)"],"env":[{"name":"DOCKER_GROUP_GID","value":"1001"}],"resources":{},"volumeMounts":[{"name":"runner","mountPath":"/runner"},{"name":"var-r
un","mountPath":"/run"},{"name":"work","mountPath":"/runner/_work"},{"name":"kube-api-access-fzg9v","readOnly":true,"mountPath":"/var/run/secrets/kubernetes.io/serviceaccount"}],"lifecycle":{"preStop":{"exec":{"command":["/bin/sh","-c","timeout \"${RUNNER_GRACEFUL_STOP_TIMEOUT:-15}\" /bin/sh -c \"echo 'Prestop hook started'; while [ -f /runner/.runner ]; do sleep 1; done; echo 'Waiting for dockerd to start'; while ! pgrep -x dockerd; do sleep 1; done; echo 'Prestop hook stopped'\" >/proc/1/fd/1 2>&1"]}}},"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"IfNotPresent","securityContext":{"privileged":true}}],"restartPolicy":"Never","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","serviceAccountName":"default","serviceAccount":"default","nodeName":"gke-xxx-platform-cor-xxx-platform-cor-b38cb942-1avk","securityContext":{},"schedulerName":"default-scheduler","tolerations":[{"key":"node.kubernetes.io/not-ready","operator":"Exists","effect":"NoExecute","tolerationSeconds":300},{"key":"node.kubernetes.io/unreachable","operator":"Exists","effect":"NoExecute","tolerationSeconds":300}],"priority":0,"enableServiceLinks":true,"preemptionPolicy":"PreemptLowerPriority"},"status":{"phase":"Pending","conditions":[{"type":"PodScheduled","status":"True","lastProbeTime":null,"lastTransitionTime":"2023-12-15T14:24:50Z"}],"qosClass":"BestEffort"}}]}
manager 2023-12-15T14:24:52Z    DEBUG    runner    Runner appears to have been registered and running.    {"runner": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d-g4j94", "podCreationTimestamp": "2023-12-15 14:24:50 +0000 UTC"}
manager 2023-12-15T14:24:52Z    DEBUG    runner    Runner appears to have been registered and running.    {"runner": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d-lmgwf", "podCreationTimestamp": "2023-12-15 14:24:50 +0000 UTC"}
manager 2023-12-15T14:25:00Z    DEBUG    runner    Runner appears to have been registered and running.    {"runner": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d-g4j94", "podCreationTimestamp": "2023-12-15 14:24:50 +0000 UTC"}
manager 2023-12-15T14:25:00Z    DEBUG    runner    Runner appears to have been registered and running.    {"runner": "actions-runner-system/repo-provisioner-runnerdeploy-m9v2d-lmgwf", "podCreationTimestamp": "2023-12-15 14:24:50 +0000 UTC"}


### Whole Runner Pod Logs

```shell
runner 2023-12-15 14:27:01.747  NOTICE --- Runner init started with pid 7
runner 2023-12-15 14:27:01.754  DEBUG --- Github endpoint URL https://github.com/
runner 2023-12-15 14:27:03.905  DEBUG --- Passing --ephemeral to config.sh to enable the ephemeral runner.
runner 2023-12-15 14:27:03.911  DEBUG --- Configuring the runner.
runner 
runner --------------------------------------------------------------------------------
runner |        ____ _ _   _   _       _          _        _   _                      |
runner |       / ___(_) |_| | | |_   _| |__      / \   ___| |_(_) ___  _ __  ___      |
runner |      | |  _| | __| |_| | | | | '_ \    / _ \ / __| __| |/ _ \| '_ \/ __|     |
runner |      | |_| | | |_|  _  | |_| | |_) |  / ___ \ (__| |_| | (_) | | | \__ \     |
runner |       \____|_|\__|_| |_|\__,_|_.__/  /_/   \_\___|\__|_|\___/|_| |_|___/     |
runner |                                                                              |
runner |                       Self-hosted runner registration                        |
runner |                                                                              |
runner --------------------------------------------------------------------------------
runner 
runner # Authentication
runner 
docker time="2023-12-15T14:27:02.081724238Z" level=info msg="Starting up"
docker time="2023-12-15T14:27:02.083391769Z" level=info msg="containerd not running, starting managed containerd"
docker time="2023-12-15T14:27:02.084540732Z" level=info msg="started new containerd process" address=/var/run/docker/containerd/containerd.sock module=libcontainerd pid=235
docker time="2023-12-15T14:27:02.120060139Z" level=info msg="starting containerd" revision=091922f03c2762540fd057fba91260237ff86acb version=v1.7.6
docker time="2023-12-15T14:27:02.143777591Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.aufs\"..." type=io.containerd.snapshotter.v1
docker time="2023-12-15T14:27:02.151873971Z" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.aufs\"..." error="aufs is not supported (modprobe aufs failed: exit status 1 \"ip: can't find device 'aufs'\\nmodprobe: can't change directory to '/lib/modules': No such file or directory\\n\"): skip plugin" type=io.containerd.snapshotter.v1
docker time="2023-12-15T14:27:02.151989995Z" level=info msg="loading plugin \"io.containerd.content.v1.content\"..." type=io.containerd.content.v1
docker time="2023-12-15T14:27:02.175868736Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.blockfile\"..." type=io.containerd.snapshotter.v1
docker time="2023-12-15T14:27:02.176138830Z" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.blockfile\"..." error="no scratch file generator: skip plugin" type=io.containerd.snapshotter.v1
docker time="2023-12-15T14:27:02.176237639Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.native\"..." type=io.containerd.snapshotter.v1
docker time="2023-12-15T14:27:02.176403962Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.overlayfs\"..." type=io.containerd.snapshotter.v1
docker time="2023-12-15T14:27:02.176914548Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.devmapper\"..." type=io.containerd.snapshotter.v1
docker time="2023-12-15T14:27:02.176954597Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.devmapper" error="devmapper not configured"
docker time="2023-12-15T14:27:02.176976217Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.zfs\"..." type=io.containerd.snapshotter.v1
docker time="2023-12-15T14:27:02.177256697Z" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.zfs\"..." error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
docker time="2023-12-15T14:27:02.177287941Z" level=info msg="loading plugin \"io.containerd.metadata.v1.bolt\"..." type=io.containerd.metadata.v1
docker time="2023-12-15T14:27:02.177396892Z" level=warning msg="could not use snapshotter devmapper in metadata plugin" error="devmapper not configured"
docker time="2023-12-15T14:27:02.177429271Z" level=info msg="metadata content store policy set" policy=shared
docker time="2023-12-15T14:27:02.251871084Z" level=info msg="loading plugin \"io.containerd.differ.v1.walking\"..." type=io.containerd.differ.v1
docker time="2023-12-15T14:27:02.251947713Z" level=info msg="loading plugin \"io.containerd.event.v1.exchange\"..." type=io.containerd.event.v1
docker time="2023-12-15T14:27:02.251973152Z" level=info msg="loading plugin \"io.containerd.gc.v1.scheduler\"..." type=io.containerd.gc.v1
docker time="2023-12-15T14:27:02.252026227Z" level=info msg="loading plugin \"io.containerd.lease.v1.manager\"..." type=io.containerd.lease.v1
docker time="2023-12-15T14:27:02.252058649Z" level=info msg="loading plugin \"io.containerd.nri.v1.nri\"..." type=io.containerd.nri.v1
docker time="2023-12-15T14:27:02.252084958Z" level=info msg="NRI interface is disabled by configuration."
docker time="2023-12-15T14:27:02.252126180Z" level=info msg="loading plugin \"io.containerd.runtime.v2.task\"..." type=io.containerd.runtime.v2
docker time="2023-12-15T14:27:02.252336194Z" level=info msg="loading plugin \"io.containerd.runtime.v2.shim\"..." type=io.containerd.runtime.v2
docker time="2023-12-15T14:27:02.252371196Z" level=info msg="loading plugin \"io.containerd.sandbox.store.v1.local\"..." type=io.containerd.sandbox.store.v1
docker time="2023-12-15T14:27:02.252393164Z" level=info msg="loading plugin \"io.containerd.sandbox.controller.v1.local\"..." type=io.containerd.sandbox.controller.v1
docker time="2023-12-15T14:27:02.252818801Z" level=info msg="loading plugin \"io.containerd.streaming.v1.manager\"..." type=io.containerd.streaming.v1
docker time="2023-12-15T14:27:02.253929131Z" level=info msg="loading plugin \"io.containerd.service.v1.introspection-service\"..." type=io.containerd.service.v1
docker time="2023-12-15T14:27:02.254014521Z" level=info msg="loading plugin \"io.containerd.service.v1.containers-service\"..." type=io.containerd.service.v1
docker time="2023-12-15T14:27:02.254052433Z" level=info msg="loading plugin \"io.containerd.service.v1.content-service\"..." type=io.containerd.service.v1
docker time="2023-12-15T14:27:02.254114517Z" level=info msg="loading plugin \"io.containerd.service.v1.diff-service\"..." type=io.containerd.service.v1
docker time="2023-12-15T14:27:02.254152509Z" level=info msg="loading plugin \"io.containerd.service.v1.images-service\"..." type=io.containerd.service.v1
docker time="2023-12-15T14:27:02.254195032Z" level=info msg="loading plugin \"io.containerd.service.v1.namespaces-service\"..." type=io.containerd.service.v1
docker time="2023-12-15T14:27:02.254226021Z" level=info msg="loading plugin \"io.containerd.service.v1.snapshots-service\"..." type=io.containerd.service.v1
docker time="2023-12-15T14:27:02.254259467Z" level=info msg="loading plugin \"io.containerd.runtime.v1.linux\"..." type=io.containerd.runtime.v1
docker time="2023-12-15T14:27:02.255321156Z" level=info msg="loading plugin \"io.containerd.monitor.v1.cgroups\"..." type=io.containerd.monitor.v1
docker time="2023-12-15T14:27:02.255812275Z" level=info msg="loading plugin \"io.containerd.service.v1.tasks-service\"..." type=io.containerd.service.v1
docker time="2023-12-15T14:27:02.255874075Z" level=info msg="loading plugin \"io.containerd.grpc.v1.introspection\"..." type=io.containerd.grpc.v1
docker time="2023-12-15T14:27:02.255894514Z" level=info msg="loading plugin \"io.containerd.transfer.v1.local\"..." type=io.containerd.transfer.v1
docker time="2023-12-15T14:27:02.255985241Z" level=info msg="loading plugin \"io.containerd.internal.v1.restart\"..." type=io.containerd.internal.v1
docker time="2023-12-15T14:27:02.256248150Z" level=info msg="loading plugin \"io.containerd.grpc.v1.containers\"..." type=io.containerd.grpc.v1
docker time="2023-12-15T14:27:02.256323656Z" level=info msg="loading plugin \"io.containerd.grpc.v1.content\"..." type=io.containerd.grpc.v1
docker time="2023-12-15T14:27:02.256422542Z" level=info msg="loading plugin \"io.containerd.grpc.v1.diff\"..." type=io.containerd.grpc.v1
docker time="2023-12-15T14:27:02.256453718Z" level=info msg="loading plugin \"io.containerd.grpc.v1.events\"..." type=io.containerd.grpc.v1
docker time="2023-12-15T14:27:02.256479212Z" level=info msg="loading plugin \"io.containerd.grpc.v1.healthcheck\"..." type=io.containerd.grpc.v1
docker time="2023-12-15T14:27:02.256503565Z" level=info msg="loading plugin \"io.containerd.grpc.v1.images\"..." type=io.containerd.grpc.v1
docker time="2023-12-15T14:27:02.256552185Z" level=info msg="loading plugin \"io.containerd.grpc.v1.leases\"..." type=io.containerd.grpc.v1
docker time="2023-12-15T14:27:02.256582930Z" level=info msg="loading plugin \"io.containerd.grpc.v1.namespaces\"..." type=io.containerd.grpc.v1
docker time="2023-12-15T14:27:02.256611979Z" level=info msg="loading plugin \"io.containerd.internal.v1.opt\"..." type=io.containerd.internal.v1
docker time="2023-12-15T14:27:02.256958395Z" level=info msg="loading plugin \"io.containerd.grpc.v1.sandbox-controllers\"..." type=io.containerd.grpc.v1
docker time="2023-12-15T14:27:02.257018424Z" level=info msg="loading plugin \"io.containerd.grpc.v1.sandboxes\"..." type=io.containerd.grpc.v1
docker time="2023-12-15T14:27:02.257046936Z" level=info msg="loading plugin \"io.containerd.grpc.v1.snapshots\"..." type=io.containerd.grpc.v1
docker time="2023-12-15T14:27:02.257079026Z" level=info msg="loading plugin \"io.containerd.grpc.v1.streaming\"..." type=io.containerd.grpc.v1
docker time="2023-12-15T14:27:02.257136031Z" level=info msg="loading plugin \"io.containerd.grpc.v1.tasks\"..." type=io.containerd.grpc.v1
docker time="2023-12-15T14:27:02.257184740Z" level=info msg="loading plugin \"io.containerd.grpc.v1.transfer\"..." type=io.containerd.grpc.v1
docker time="2023-12-15T14:27:02.257211528Z" level=info msg="loading plugin \"io.containerd.grpc.v1.version\"..." type=io.containerd.grpc.v1
docker time="2023-12-15T14:27:02.257242249Z" level=info msg="loading plugin \"io.containerd.tracing.processor.v1.otlp\"..." type=io.containerd.tracing.processor.v1
docker time="2023-12-15T14:27:02.257270729Z" level=info msg="skip loading plugin \"io.containerd.tracing.processor.v1.otlp\"..." error="no OpenTelemetry endpoint: skip plugin" type=io.containerd.tracing.processor.v1
docker time="2023-12-15T14:27:02.257299544Z" level=info msg="loading plugin \"io.containerd.internal.v1.tracing\"..." type=io.containerd.internal.v1
docker time="2023-12-15T14:27:02.257320819Z" level=info msg="skipping tracing processor initialization (no tracing plugin)" error="no OpenTelemetry endpoint: skip plugin"
docker time="2023-12-15T14:27:02.257786919Z" level=info msg=serving... address=/var/run/docker/containerd/containerd-debug.sock
docker time="2023-12-15T14:27:02.257872806Z" level=info msg=serving... address=/var/run/docker/containerd/containerd.sock.ttrpc
docker time="2023-12-15T14:27:02.257949226Z" level=info msg=serving... address=/var/run/docker/containerd/containerd.sock
docker time="2023-12-15T14:27:02.257980850Z" level=info msg="containerd successfully booted in 0.139331s"
runner 
runner √ Connected to GitHub
runner 
runner # Runner Registration
runner 
runner 
runner 
runner 
runner √ Runner successfully added
docker time="2023-12-15T14:27:09.178024096Z" level=info msg="Loading containers: start."
docker time="2023-12-15T14:27:09.254562479Z" level=info msg="stopping event stream following graceful shutdown" error="<nil>" module=libcontainerd namespace=moby
docker time="2023-12-15T14:27:09.255228112Z" level=info msg="stopping event stream following graceful shutdown" error="context canceled" module=libcontainerd namespace=plugins.moby
docker time="2023-12-15T14:27:09.255284261Z" level=info msg="stopping healthcheck following graceful shutdown" module=libcontainerd
runner √ Runner connection is good
runner 
runner # Runner settings
runner 
runner 
runner √ Settings Saved.
runner 
runner 2023-12-15 14:27:09.656  DEBUG --- Runner successfully configured.
runner {
runner   "agentId": 272,
runner   "agentName": "<<redacted>>-runnerdeploy-m9v2d-6lgb9",
runner   "poolId": 1,
runner   "poolName": "Default",
runner   "ephemeral": true,
runner   "serverUrl": "<redacted>",
runner   "gitHubUrl": "<redacted>",
runner   "workFolder": "/runner/_work"
runner 2023-12-15 14:27:09.665  DEBUG --- Docker enabled runner detected and Docker daemon wait is enabled
runner 2023-12-15 14:27:09.667  DEBUG --- Waiting until Docker is available or the timeout of 120 seconds is reached
docker failed to start daemon: Error initializing network controller: error obtaining controller instance: unable to add return rule in DOCKER-ISOLATION-STAGE-1 chain:  (iptables failed: iptables --wait -A DOCKER-ISOLATION-STAGE-1 -j RETURN: iptables v1.8.10 (nf_tables):  RULE_APPEND failed (No such file or directory): rule in chain DOCKER-ISOLATION-STAGE-1
docker  (exit status 4))
runner Cannot connect to the Docker daemon at unix:///run/docker.sock. Is the docker daemon running?
Stream closed EOF for actions-runner-system/<<redacted>>-runnerdeploy-m9v2d-6lgb9 (docker)
runner Cannot connect to the Docker daemon at unix:///run/docker.sock. Is the docker daemon running?



### Additional Context

_No response_
@iamcaleberic iamcaleberic added bug Something isn't working community Community contribution needs triage Requires review from the maintainers labels Dec 15, 2023
@iamcaleberic
Author

iamcaleberic commented Dec 15, 2023

Updating and pinning image.dindSidecarRepositoryAndTag at docker:24.0.7-dind-alpine3.18 appears to resolve it
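For the actions-runner-controller chart, the pin can be expressed as a small values override (a minimal sketch; the key lives under `image` in the chart's values.yaml, and everything else keeps the chart defaults):

```yaml
# values-override.yaml -- pin the dind sidecar image used by runner pods
image:
  dindSidecarRepositoryAndTag: "docker:24.0.7-dind-alpine3.18"
```

Apply it with `helm upgrade ... -f values-override.yaml` against your actions-runner-controller release.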

@sbalasu27

@iamcaleberic yes, we had the same issue and the workaround works 👍

@billimek
Contributor

billimek commented Dec 15, 2023

Seeing the same issue here running in GKE. We're also dealing with a problem where this morning we ended up with 10,000 runners (triggering secondary rate limiting), the vast majority of them 'offline'.

Is there any chance there is a relationship between this and runners being left in an offline state, as they fail to come online cleanly and the ARC controller (v0.26.0) doesn't properly de-register them from GitHub?

EDIT/UPDATE: After we implemented the fix to pin the docker sidecar to docker:24.0.7-dind-alpine3.18, we no longer saw offline runners accumulating, so we believe the two are related.

@joshgc

joshgc commented Dec 15, 2023

Sorry for the naive question, but where are you specifying image.dindSidecarRepositoryAndTag? I'm not seeing any mention of it in actions-runner-controller.yaml. Is this perhaps a Helm thing? Surely it has a kubectl/YAML-only representation too? Thank you for the great tips.

@sergiopsyalo

We are experiencing the same issue

@joshgc

joshgc commented Dec 16, 2023

@verult was able to patch the flag directly into actions-runner-controller.yaml (around line 34342 in version 0.26), like this:

      containers:
      - args:
        - --metrics-addr=127.0.0.1:8080
        - --enable-leader-election
        # Temporary workaround for https://github.com/actions/actions-runner-controller/issues/3159
        - --docker-image=docker:24.0.7-dind-alpine3.18
        command:
        - /manager
        env:

@LaloLoop

Thanks for the suggestions, everyone.
We got our runners working again, but the pods won't terminate. We use ephemeral runners, and the docker issue impacted us today: the runners couldn't start, and we hit the 10k Runner Group limit. Once the runners started again, they were not cleaned up and stayed in a Terminating phase; we're still trying to figure out why. We tested various combinations of chart and app versions, but at least v0.26.0 with --docker-image=docker:24.0.7-dind-alpine3.18 resulted in pods lingering.
We're also using a custom runner image, which could be a factor. We'll keep investigating.

@verult

verult commented Dec 16, 2023

@LaloLoop we ran into the issue of pods getting stuck in the Terminating phase after we deleted the runner controller, because there were finalizers left on these pods. Is your controller running when your pods are stuck?

@LaloLoop

LaloLoop commented Dec 16, 2023

Thanks for pointing that out, @verult. We reached the rate limit as described by @billimek. That caused the controller to panic continuously and fail to reconcile. We're using 0.26.0; not sure if newer versions have better error/retry handling.
I guess we're going to have to wait for the limit to reset before trying anything. Changing anything in our runners at the moment triggers the rate limit and everything fails, even though we don't use polling for the autoscalers.

@hariapollo

@joshgc you can find it here: https://github.com/actions/actions-runner-controller/blob/master/charts/actions-runner-controller/values.yaml#L55

Thanks @iamcaleberic it did magic and worked for us as well.

@sylvain-actual

Same error for me.
Fixed with : dindSidecarRepositoryAndTag: "docker:24.0.7-dind-alpine3.18"

@brconnell4

For those running an autoscaling runner set: I tried updating template.spec.containers.dind to 24.0.7-dind-alpine3.18 and it didn't work; it retained the value docker:dind. I know my syntax is correct because I also pin our custom image in containers the same way.

I manually updated the autoscalingrunnerset custom resource to docker:24.0.7-dind-alpine3.18 and this seems to work.

My question is: why is this not pinned to a stable version instead of "latest"? It exposes us to unstable updates that can lead to downtime or interruption.
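For reference, the manual edit described above amounts to pinning the dind container image inside the AutoscalingRunnerSet resource. A sketch of the relevant fragment (container name and nesting mirror what the dind container mode generates; verify against the object in your cluster):

```yaml
spec:
  template:
    spec:
      containers:
        - name: dind
          image: docker:24.0.7-dind-alpine3.18
```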

@billimek
Contributor

If this is still an issue for some folks and you are still dealing with ~10,000 offline runners triggering secondary rate limiting, the following script snippet may be useful for removing the offline runners:

#!/bin/bash

while true; do
    echo "Fetching more runners"
    RESPONSE=$(gh api \
        -H "Accept: application/vnd.github+json" \
        -H "X-GitHub-Api-Version: 2022-11-28" \
        /orgs/<YOUR ORG>/actions/runners)

    echo "Total runners: $(echo "$RESPONSE" | jq '.total_count')"
    OFFLINE_RUNNERS="$(echo "$RESPONSE" | jq '.runners | map(select(.status == "offline"))')"

    RUNNERS="$(echo "$OFFLINE_RUNNERS" | jq '.[].id')"

    # If there were no offline runners, stop
    if [ -z "$RUNNERS" ]; then
        echo "Done!"
        break
    fi

    # Remove each offline runner
    for RUNNER in $RUNNERS; do
        echo "Removing runner: $RUNNER"
        gh api \
            -H "Accept: application/vnd.github+json" \
            -H "X-GitHub-Api-Version: 2022-11-28" \
            -X DELETE \
            "/orgs/<YOUR ORG>/actions/runners/$RUNNER" >> removal.logs
    done
done

... or the following action may accomplish the same thing as well (just don't run it on self hosted runners where you are experiencing this issue!): some-natalie/runner-reaper.

It's my understanding that GitHub should automatically remove offline runners after 24h, but the symptom of this issue is that the number of offline runners ramps up so quickly that that automation isn't viable until you correct the pinned docker version.

It also looks like the upstream docker:dind image was corrected, so your system may self-correct over time anyway.
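The offline-selection step in the cleanup script above can be mirrored in a few lines of Python, which makes it easy to dry-run the selection against a stubbed response before deleting anything (illustrative sketch only; the dict shape follows the org runners API payload used above):

```python
def offline_runner_ids(response: dict) -> list[int]:
    """Return the ids of runners whose status is 'offline'."""
    return [r["id"] for r in response.get("runners", [])
            if r.get("status") == "offline"]

# Dry run against a stubbed API response:
sample = {"total_count": 3, "runners": [
    {"id": 1, "status": "online"},
    {"id": 2, "status": "offline"},
    {"id": 3, "status": "offline"},
]}
print(offline_runner_ids(sample))  # -> [2, 3]
```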

@billimek
Contributor

As @iamcaleberic pointed out, if you're deploying the actions-runner-controller helm chart, the relevant values line to override when re-deploying the fix is located here.

If you're running the newer gha-runner-scale-set chart and it exhibits the same issue (we don't currently run that one, so it's unclear whether the scale set is affected), the necessary modification appears to be in the template spec definition here.

damccorm pushed a commit to apache/beam that referenced this issue Dec 18, 2023
@romanvogman

running the newer gha-runner-scale-set, and overriding the spec in the values.yaml with a new docker tag doesn't seem to make a difference, it stays on docker:dind.

anyone managed to find a workaround for it?

@brconnell4

running the newer gha-runner-scale-set, and overriding the spec in the values.yaml with a new docker tag doesn't seem to make a difference, it stays on docker:dind.

anyone managed to find a workaround for it?

Update the CRD manually under autoscalingrunnerset and patch it.

@jamezrin

running the newer gha-runner-scale-set, and overriding the spec in the values.yaml with a new docker tag doesn't seem to make a difference, it stays on docker:dind.

anyone managed to find a workaround for it?

This worked for me:
https://github.com/jamezrin/personal-actions-runner-setup/blob/main/gha-runner-scale-set-dind-fix.yaml#L24C53-L24C117

@farrukh90

Updating and pinning image.dindSidecarRepositoryAndTag at docker:24.0.7-dind-alpine3.18 appears to resolve it

So, how do we do that? I have the following manifests and I don't know where to add it.

apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: example-runnerdeploy
  namespace: actions-runner-system
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
  labels:
    name: example-runnerdeploy
spec:
  replicas: 1
  template:
    spec:
      repository: farrukh90/symmetrical-fortnight
      image: farrukhsadykov/runner:latest
      labels:
        - example-runnerdeploy
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: example-runnerdeploy
  namespace: actions-runner-system
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
  labels:
    name: example-runnerdeploy
spec:
  scaleTargetRef:
    name: example-runnerdeploy
  scaleDownDelaySecondsAfterScaleOut: 300
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: TotalNumberOfQueuedAndInProgressWorkflowRuns
      repositoryNames:
        - xxxxxx/symmetrical-fortnight
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: example-runnerdeploy
  namespace: actions-runner-system
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: example-runnerdeploy
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: for-aws-tasks
parameters:
  type: pd-standard
provisioner: kubernetes.io/gce-pd
reclaimPolicy: Retain
volumeBindingMode: Immediate
allowVolumeExpansion: false

@volatilemolotov

image.dindSidecarRepositoryAndTag is set at the Helm level (in the chart's values), not in these manifests.

@kopax-polyconseil

Is this fixed now, or should we stick to the pinned version of the image?

@romanvogman

romanvogman commented Dec 26, 2023

running the newer gha-runner-scale-set, and overriding the spec in the values.yaml with a new docker tag doesn't seem to make a difference, it stays on docker:dind.
anyone managed to find a workaround for it?

This worked for me: https://github.com/jamezrin/personal-actions-runner-setup/blob/main/gha-runner-scale-set-dind-fix.yaml#L24C53-L24C117

For the gha scale set, I ended up leaving container mode empty and updated the template to include the same spec that gets created when container mode is dind, only with the new docker tag.
I found this a better solution for continuing to use the Helm chart I already had, in the hope that a fix lands that supports setting the dind image tag from values.yaml.
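A sketch of that approach (hypothetical values fragment: container mode is omitted entirely, and the dind sidecar is spelled out with a pinned tag; names follow what containerMode: dind generates, and volume mounts are left out for brevity -- copy the full generated pod spec from your cluster and change only the tag):

```yaml
# containerMode deliberately omitted; the dind sidecar is declared manually
template:
  spec:
    containers:
      - name: runner
        image: ghcr.io/actions/actions-runner:latest
        command: ["/home/runner/run.sh"]
      - name: dind
        image: docker:24.0.7-dind-alpine3.18
        securityContext:
          privileged: true
```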

@Jmainguy

Jmainguy commented Jan 23, 2024

A fix has been implemented upstream in docker:dind; however, it now requires users of this helm chart to set a new variable.

docker-library/docker#468 (comment)


Set DOCKER_IPTABLES_LEGACY=1 inside your dind pod via an override of the helm chart's default variables (this should get added to the helm chart, if someone wants an easy PR).

The change should go right after these lines if someone has a minute to open the PR to the chart: https://github.com/actions/actions-runner-controller/blob/master/charts/gha-runner-scale-set/templates/_helpers.tpl#L106 and https://github.com/actions/actions-runner-controller/blob/master/charts/gha-runner-scale-set/values.yaml#L142
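As a sketch, an override for the gha-runner-scale-set values could pass the variable through a custom dind container spec (hypothetical fragment; the container layout mirrors what containerMode: dind generates, and volume mounts are omitted for brevity -- copy them from the chart-generated pod spec):

```yaml
template:
  spec:
    containers:
      - name: dind
        image: docker:dind
        securityContext:
          privileged: true
        env:
          - name: DOCKER_IPTABLES_LEGACY
            value: "1"
```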
