Support CNI STATUS Verb #123

Merged Dec 6, 2024 (1 commit)

@MikeZappa87 MikeZappa87 commented Nov 18, 2024

This PR implements the CNI STATUS verb. STATUS lets the container runtime determine whether it should call CNI ADD.

Please merge #122 first

Reference:
ocicni (cri-o/ocicni#196)

@MikeZappa87 MikeZappa87 force-pushed the issue/supportstatus branch 2 times, most recently from b47ffec to 8f10978 Compare November 18, 2024 19:32
@MikeZappa87 MikeZappa87 requested a review from mikebrow November 22, 2024 17:27
@MikeZappa87 MikeZappa87 marked this pull request as ready for review November 23, 2024 02:39
@MikeZappa87 MikeZappa87 marked this pull request as draft November 24, 2024 01:21
@MikeZappa87 MikeZappa87 marked this pull request as ready for review November 24, 2024 02:34
@mikebrow
Member

#122 has merged; please rebase.

@MikeZappa87
Contributor Author

Will do once I get home

@mikebrow (Member) left a comment

See comment..

@MikeZappa87 MikeZappa87 force-pushed the issue/supportstatus branch 2 times, most recently from 5762dbb to 37d969a Compare November 26, 2024 01:27
Signed-off-by: Michael Zappa <[email protected]>
@MikeZappa87
Contributor Author

@squeed over here

@mikebrow (Member) left a comment

LGTM

@mikebrow mikebrow merged commit 642f1ce into containerd:main Dec 6, 2024
5 checks passed
@architkulkarni architkulkarni mentioned this pull request Dec 9, 2024
@dmcgowan dmcgowan added impact/changelog area/cri Container Runtime Interface (CRI) labels Dec 13, 2024
Mengkzhaoyun pushed a commit to open-beagle/containerd that referenced this pull request Dec 19, 2024
containerd 2.0.1

Welcome to the v2.0.1 release of containerd!

The first patch release for containerd 2.0 includes a number of bug fixes and improvements.

* Fix apply IoOwner options when not in user namespace ([#11151](containerd/containerd#11151))
* Fix cri grpc plugin config migration ([#11140](containerd/containerd#11140))
* Support CNI STATUS Verb ([containerd/go-cni#123](containerd/go-cni#123))
* Update differ to handle zstd media types ([#11068](containerd/containerd#11068))
* Update runc binary to v1.2.3 ([#11142](containerd/containerd#11142))
* Fix panic due to nil dereference cgroups v2 ([#11098](containerd/containerd#11098))

Please try out the release binaries and report any issues at
https://github.com/containerd/containerd/issues.

* Derek McGowan
* Wei Fu
* Archit Kulkarni
* Jin Dong
* Phil Estes
* Akhil Mohan
* Akihiro Suda
* Alexey Lunev
* Austin Vazquez
* Maksym Pavlenko
* Mike Brown
* Michael Zappa
* Samuel Karp
* Sebastiaan van Stijn
* Andrey Smirnov
* Davanum Srinivas
<details><summary>50 commits</summary>
<p>

* Prepare release notes for v2.0.1 ([#11158](containerd/containerd#11158))
  * [`b0ece5dc5`](containerd/containerd@b0ece5d) Prepare release notes for v2.0.1
* build(deps): bump actions/attest-build-provenance from 1.4.4 to 2.1.0 ([#11154](containerd/containerd#11154))
  * [`fe6957084`](containerd/containerd@fe69570) build(deps): bump actions/attest-build-provenance from 1.4.4 to 2.1.0
* update xx to v1.6.1 for compatibility with alpine 3.21 and file 5.46+ ([#11153](containerd/containerd#11153))
  * [`eb2ce6882`](containerd/containerd@eb2ce68) update xx to v1.6.1 for compatibility with alpine 3.21 and file 5.46+
* ctr pull should unpack for default platform when transfer service is used ([#11139](containerd/containerd#11139))
  * [`44cdca68b`](containerd/containerd@44cdca6) ctr pull unpack for default platform using transfer service
* Fix apply IoOwner options when not in user namespace ([#11151](containerd/containerd#11151))
  * [`018d83650`](containerd/containerd@018d836) internal/cri: should not apply IoOwner options
* Update go-cni for CNI STATUS ([#11146](containerd/containerd#11146))
  * [`5eb7995a9`](containerd/containerd@5eb7995) feat: update go-cni version for CNI STATUS
* Fix cri grpc plugin config migration ([#11140](containerd/containerd#11140))
  * [`a2302ea89`](containerd/containerd@a2302ea) Add integration test for custom configuration
  * [`be5eda069`](containerd/containerd@be5eda0) complete cri grpc config migration
* Update runc binary to v1.2.3 ([#11142](containerd/containerd#11142))
  * [`a53eff53d`](containerd/containerd@a53eff5) update runc binary to v1.2.3
* Update differ to handle zstd media types ([#11068](containerd/containerd#11068))
  * [`73f57acb0`](containerd/containerd@73f57ac) Update differ to handle zstd media types
* update to go1.23.4 / go1.22.10 ([#11109](containerd/containerd#11109))
  * [`290e8bc70`](containerd/containerd@290e8bc) update to go1.23.4 / go1.22.10
* CI: update Fedora to 41 ([#11110](containerd/containerd#11110))
  * [`62b790bfa`](containerd/containerd@62b790b) CI: update Fedora to 41
* Fix panic due to nil dereference cgroups v2 ([#11098](containerd/containerd#11098))
  * [`3ba2df924`](containerd/containerd@3ba2df9) fix panic due to nil dereference cgroups v2
* Publish attestation as release artifact ([#11067](containerd/containerd#11067))
  * [`34a45cab2`](containerd/containerd@34a45ca) Publish attestation as release artifact
* Move rockylinux 9.4 to almalinux/9 in CI ([#11053](containerd/containerd#11053))
  * [`7dec6b460`](containerd/containerd@7dec6b4) move rocky 9.4 to almalinux/9 in CI
* *: should align pipe's owner with init process ([#11035](containerd/containerd#11035))
  * [`cf07f28ee`](containerd/containerd@cf07f28) *: should align pipe's owner with init process
* fix: set the credentials even if not provided ([#11031](containerd/containerd#11031))
  * [`986088866`](containerd/containerd@9860888) fix: set the credentials even if not provided
* fsverity_test.go: fix nil pointer derefence, fix test fail, fix minor/major device numbers resolving ([#10978](containerd/containerd#10978))
  * [`30b929ece`](containerd/containerd@30b929e) fsverity_test.go: fix major/minor device number resolving
  * [`10996a334`](containerd/containerd@10996a3) fsverity_test.go: fix nil pointer dereference, fix test fail
* update runc binary to 1.2.2 ([#11023](containerd/containerd#11023))
  * [`9081e979f`](containerd/containerd@9081e97) update runc binary to 1.2.2
* Revert "Disable vagrant strict dependency checking" ([#11009](containerd/containerd#11009))
  * [`6399c936f`](containerd/containerd@6399c93) Revert "Disable vagrant strict dependency checking"
* fsverity_linux.go: Fix fsverity.IsEnabled() for big endian systems ([#11005](containerd/containerd#11005))
  * [`a7f2b562f`](containerd/containerd@a7f2b56) fsverity_linux.go: Fix fsverity.IsEnabled() for big endian systems
* bump github.com/containerd/typeurl/v2 from 2.2.2 to 2.2.3 ([#10997](containerd/containerd#10997))
  * [`389e781ea`](containerd/containerd@389e781) build(deps): bump github.com/containerd/typeurl/v2 from 2.2.2 to 2.2.3
* update to go1.23.3 / go1.22.9 ([#10973](containerd/containerd#10973))
  * [`5b879f30c`](containerd/containerd@5b879f3) update to go1.23.3 / go1.22.9
* ci: enable marking 2.0 releases as latest ([#10963](containerd/containerd#10963))
  * [`458215f6c`](containerd/containerd@458215f) ci: enable marking 2.0 releases as latest
* Avoid arch info in the sed/replace when building cri-cni-containerd.tar.gz ([#10968](containerd/containerd#10968))
  * [`e99c2b55c`](containerd/containerd@e99c2b5) Avoid arch info in the sed/replace when building cri-cni-containerd.tar.gz
</p>
</details>
<details><summary>7 commits</summary>
<p>

* Support CNI STATUS Verb ([containerd/go-cni#123](containerd/go-cni#123))
  * [`208eca9`](containerd/go-cni@208eca9) support CNI status verb
* Bump github actions dependencies to match containerd CI repo and fix lint ([containerd/go-cni#122](containerd/go-cni#122))
  * [`386f475`](containerd/go-cni@386f475) Fix ci.yml indent
  * [`a9b0675`](containerd/go-cni@a9b0675) Another doc commit to trigger lint?
  * [`14af454`](containerd/go-cni@14af454) Bump github actions dependency versions
  * [`9e0d096`](containerd/go-cni@9e0d096) Trivial doc commit to trigger lint
</p>
</details>

* **github.com/containerd/go-cni**      v1.1.10 -> v1.1.11
* **github.com/containerd/typeurl/v2**  v2.2.2 -> v2.2.3

Previous release can be found at [v2.0.0](https://github.com/containerd/containerd/releases/tag/v2.0.0)
* `containerd-<VERSION>-<OS>-<ARCH>.tar.gz`:         ✅Recommended. Dynamically linked with glibc 2.31 (Ubuntu 20.04).
* `containerd-static-<VERSION>-<OS>-<ARCH>.tar.gz`:  Statically linked. Expected to be used on non-glibc Linux distributions. Not position-independent.

In addition to containerd, typically you will have to install [runc](https://github.com/opencontainers/runc/releases)
and [CNI plugins](https://github.com/containernetworking/plugins/releases) from their official sites too.

See also the [Getting Started](https://github.com/containerd/containerd/blob/main/docs/getting-started.md) documentation.
@buroa

buroa commented Dec 19, 2024

Hey @mikebrow (cc @MikeZappa87), this seems to cause issues with dual-CNI setups. I am using both Cilium and Multus, and after a node reboot the node reports Ready and then shortly afterward goes NotReady. I was not seeing this issue with containerd 2.0.0. It is also intermittent, happening roughly half the time.

Kubelet just spams these logs when it happens:

m0: {"ts":1734619253013.3718,"caller":"kubelet/kubelet.go:2412","msg":"Skipping pod synchronization","err":"container runtime is down","errCauses":[{"error":"container runtime is down"}]}
m0: {"ts":1734619256213.9617,"caller":"kubelet/kubelet.go:2412","msg":"Skipping pod synchronization","err":"container runtime is down","errCauses":[{"error":"container runtime is down"}]}
m0: {"ts":1734619259270.4944,"caller":"nodestatus/setters.go:602","msg":"Node became not ready","v":0,"node":{"name":"m0"},"condition":{"type":"Ready","status":"False","lastHeartbeatTime":"2024-12-19T14:40:59Z","lastTransitionTime":"2024-12-19T14:40:59Z","reason":"KubeletNotReady","message":"container runtime is down"}}
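
For anyone triaging similar output, the following sketch counts how often the kubelet reports the runtime as down. It assumes each line has the `<node>: <json>` shape shown above; `kubeletEntry`, `parseLine`, and `countRuntimeDown` are names made up for this example, and the sample lines in `main` are trimmed copies of the ones quoted in this comment.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// kubeletEntry holds only the fields visible in the quoted log lines.
type kubeletEntry struct {
	Caller string `json:"caller"`
	Msg    string `json:"msg"`
	Err    string `json:"err"`
}

// parseLine splits "<node>: <json>" and decodes the JSON payload.
func parseLine(line string) (string, kubeletEntry, error) {
	var e kubeletEntry
	node, payload, ok := strings.Cut(line, ": ")
	if !ok {
		return "", e, fmt.Errorf("no node prefix in %q", line)
	}
	if err := json.Unmarshal([]byte(payload), &e); err != nil {
		return "", e, err
	}
	return node, e, nil
}

// countRuntimeDown counts entries whose "err" field is exactly
// "container runtime is down". Lines that fail to parse are skipped.
func countRuntimeDown(lines []string) int {
	n := 0
	for _, l := range lines {
		if _, e, err := parseLine(l); err == nil && e.Err == "container runtime is down" {
			n++
		}
	}
	return n
}

func main() {
	lines := []string{
		`m0: {"ts":1734619253013.3718,"caller":"kubelet/kubelet.go:2412","msg":"Skipping pod synchronization","err":"container runtime is down"}`,
		`m0: {"ts":1734619259270.4944,"caller":"nodestatus/setters.go:602","msg":"Node became not ready","v":0}`,
	}
	fmt.Println(countRuntimeDown(lines))
}
```

Matching on the structured `err` field rather than on a substring of the whole line avoids counting the node-condition entry, which carries the same text in a nested `message` field.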

@buroa

buroa commented Dec 19, 2024

@MikeZappa87 Yep.

@MikeZappa87
Contributor Author

MikeZappa87 commented Dec 19, 2024

I thought Multus rewrote the CNI conf file? What you provided doesn't show that.

In the Cilium ConfigMap, what is the value of cni-exclusive?

This might be more of a Cilium issue.

@mikebrow
Member

The config.toml is probably misconfigured for your 2.0.1 containerd. I'm not seeing the CRI plugin start. The startup log should look more like this:

root@ubnt:~# containerd -l debug
INFO[2024-12-19T12:10:45.293432341-06:00] starting containerd                           revision=d9a58a892b77f292b842b849f059d8d8e8972b4a version=v2.0.0-rc.1-36-gd9a58a892
INFO[2024-12-19T12:10:45.303370039-06:00] loading plugin                                id=io.containerd.image-verifier.v1.bindir type=io.containerd.image-verifier.v1
INFO[2024-12-19T12:10:45.303407613-06:00] loading plugin                                id=io.containerd.internal.v1.opt type=io.containerd.internal.v1
INFO[2024-12-19T12:10:45.303426969-06:00] loading plugin                                id=io.containerd.warning.v1.deprecations type=io.containerd.warning.v1
INFO[2024-12-19T12:10:45.303432886-06:00] loading plugin                                id=io.containerd.event.v1.exchange type=io.containerd.event.v1
INFO[2024-12-19T12:10:45.303443454-06:00] loading plugin                                id=io.containerd.monitor.task.v1.cgroups type=io.containerd.monitor.task.v1
INFO[2024-12-19T12:10:45.303654744-06:00] loading plugin                                id=io.containerd.content.v1.content type=io.containerd.content.v1
INFO[2024-12-19T12:10:45.303690135-06:00] loading plugin                                id=io.containerd.snapshotter.v1.blockfile type=io.containerd.snapshotter.v1
INFO[2024-12-19T12:10:45.303716009-06:00] skip loading plugin                           error="no scratch file generator: skip plugin" id=io.containerd.snapshotter.v1.blockfile type=io.containerd.snapshotter.v1
INFO[2024-12-19T12:10:45.303887261-06:00] loading plugin                                id=io.containerd.snapshotter.v1.btrfs type=io.containerd.snapshotter.v1
INFO[2024-12-19T12:10:45.304105633-06:00] skip loading plugin                           error="path /var/lib/containerd/io.containerd.snapshotter.v1.btrfs (ext4) must be a btrfs filesystem to be used with the btrfs snapshotter: skip plugin" id=io.containerd.snapshotter.v1.btrfs type=io.containerd.snapshotter.v1
INFO[2024-12-19T12:10:45.304120786-06:00] loading plugin                                id=io.containerd.snapshotter.v1.devmapper type=io.containerd.snapshotter.v1
INFO[2024-12-19T12:10:45.304146023-06:00] skip loading plugin                           error="devmapper not configured: skip plugin" id=io.containerd.snapshotter.v1.devmapper type=io.containerd.snapshotter.v1
INFO[2024-12-19T12:10:45.304152474-06:00] loading plugin                                id=io.containerd.snapshotter.v1.native type=io.containerd.snapshotter.v1
INFO[2024-12-19T12:10:45.304166490-06:00] loading plugin                                id=io.containerd.snapshotter.v1.overlayfs type=io.containerd.snapshotter.v1
INFO[2024-12-19T12:10:45.304212491-06:00] loading plugin                                id=io.containerd.metadata.v1.bolt type=io.containerd.metadata.v1
INFO[2024-12-19T12:10:45.304222501-06:00] metadata content store policy set             policy=shared
INFO[2024-12-19T12:10:45.313239232-06:00] loading plugin                                id=io.containerd.gc.v1.scheduler type=io.containerd.gc.v1
INFO[2024-12-19T12:10:45.313323425-06:00] loading plugin                                id=io.containerd.shim.v1.shim type=io.containerd.shim.v1
INFO[2024-12-19T12:10:45.313354616-06:00] loading plugin                                id=io.containerd.runtime.v2.task type=io.containerd.runtime.v2
INFO[2024-12-19T12:10:45.313397266-06:00] loading plugin                                id=io.containerd.differ.v1.walking type=io.containerd.differ.v1
INFO[2024-12-19T12:10:45.313414312-06:00] loading plugin                                id=io.containerd.lease.v1.manager type=io.containerd.lease.v1
INFO[2024-12-19T12:10:45.313432334-06:00] loading plugin                                id=io.containerd.sandbox.controller.v1.shim type=io.containerd.sandbox.controller.v1
INFO[2024-12-19T12:10:45.313901386-06:00] loading plugin                                id=io.containerd.service.v1.containers-service type=io.containerd.service.v1
INFO[2024-12-19T12:10:45.313922296-06:00] loading plugin                                id=io.containerd.service.v1.content-service type=io.containerd.service.v1
INFO[2024-12-19T12:10:45.313938676-06:00] loading plugin                                id=io.containerd.service.v1.diff-service type=io.containerd.service.v1
INFO[2024-12-19T12:10:45.313984695-06:00] loading plugin                                id=io.containerd.service.v1.images-service type=io.containerd.service.v1
INFO[2024-12-19T12:10:45.314005197-06:00] loading plugin                                id=io.containerd.service.v1.introspection-service type=io.containerd.service.v1
INFO[2024-12-19T12:10:45.314020957-06:00] loading plugin                                id=io.containerd.service.v1.namespaces-service type=io.containerd.service.v1
INFO[2024-12-19T12:10:45.314036685-06:00] loading plugin                                id=io.containerd.service.v1.snapshots-service type=io.containerd.service.v1
INFO[2024-12-19T12:10:45.314051508-06:00] loading plugin                                id=io.containerd.service.v1.tasks-service type=io.containerd.service.v1
DEBU[2024-12-19T12:10:45.314085800-06:00] No blockio config file specified, blockio not configured 
DEBU[2024-12-19T12:10:45.314095023-06:00] No RDT config file specified, RDT not configured 
INFO[2024-12-19T12:10:45.314106423-06:00] loading plugin                                id=io.containerd.grpc.v1.containers type=io.containerd.grpc.v1
INFO[2024-12-19T12:10:45.314124183-06:00] loading plugin                                id=io.containerd.grpc.v1.content type=io.containerd.grpc.v1
INFO[2024-12-19T12:10:45.314139577-06:00] loading plugin                                id=io.containerd.grpc.v1.diff type=io.containerd.grpc.v1
INFO[2024-12-19T12:10:45.314155355-06:00] loading plugin                                id=io.containerd.grpc.v1.events type=io.containerd.grpc.v1
INFO[2024-12-19T12:10:45.314169742-06:00] loading plugin                                id=io.containerd.grpc.v1.images type=io.containerd.grpc.v1
INFO[2024-12-19T12:10:45.314191833-06:00] loading plugin                                id=io.containerd.grpc.v1.introspection type=io.containerd.grpc.v1
INFO[2024-12-19T12:10:45.314206658-06:00] loading plugin                                id=io.containerd.grpc.v1.leases type=io.containerd.grpc.v1
INFO[2024-12-19T12:10:45.314222130-06:00] loading plugin                                id=io.containerd.grpc.v1.namespaces type=io.containerd.grpc.v1
INFO[2024-12-19T12:10:45.314237793-06:00] loading plugin                                id=io.containerd.sandbox.store.v1.local type=io.containerd.sandbox.store.v1
INFO[2024-12-19T12:10:45.314257274-06:00] loading plugin                                id=io.containerd.cri.v1.images type=io.containerd.cri.v1
WARN[2024-12-19T12:10:45.314360694-06:00] Ignoring unknown key in TOML for plugin       error="strict mode: fields in the document are missing in the target struct" key=PinnedImages plugin=io.containerd.cri.v1.images
INFO[2024-12-19T12:10:45.314584914-06:00] Get image filesystem path "/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs" for snapshotter "overlayfs" 
INFO[2024-12-19T12:10:45.314599583-06:00] Start snapshots syncer                       
INFO[2024-12-19T12:10:45.314824847-06:00] loading plugin                                id=io.containerd.cri.v1.runtime type=io.containerd.cri.v1
INFO[2024-12-19T12:10:45.315374008-06:00] starting cri plugin                           config="{\"containerd\":{\"defaultRuntimeName\":\"runc\",\"runtimes\":{\"runc\":{\"runtimeType\":\"io.containerd.runc.v2\",\"runtimePath\":\"\",\"PodAnnotations\":[],\"ContainerAnnotations\":[],\"options\":{\"BinaryName\":\"\",\"CriuImagePath\":\"\",\"CriuWorkPath\":\"\",\"IoGid\":0,\"IoUid\":0,\"NoNewKeyring\":false,\"Root\":\"\",\"ShimCgroup\":\"\"},\"privileged_without_host_devices\":false,\"privileged_without_host_devices_all_devices_allowed\":false,\"baseRuntimeSpec\":\"\",\"cniConfDir\":\"\",\"cniMaxConfNum\":0,\"snapshotter\":\"\",\"sandboxer\":\"podsandbox\"}},\"ignoreBlockIONotEnabledErrors\":false,\"ignoreRdtNotEnabledErrors\":false},\"cni\":{\"binDir\":\"/opt/cni/bin\",\"confDir\":\"/etc/cni/net.d\",\"maxConfNum\":1,\"setupSerially\":false,\"confTemplate\":\"\",\"ipPref\":\"\"},\"enableSelinux\":false,\"selinuxCategoryRange\":1024,\"maxContainerLogSize\":16384,\"disableCgroup\":false,\"disableApparmor\":false,\"restrictOOMScoreAdj\":false,\"disableProcMount\":false,\"unsetSeccompProfile\":\"\",\"tolerateMissingHugetlbController\":true,\"disableHugetlbController\":true,\"device_ownership_from_security_context\":false,\"ignoreImageDefinedVolumes\":false,\"netnsMountsUnderStateDir\":false,\"enableUnprivilegedPorts\":true,\"enableUnprivilegedICMP\":true,\"enableCDI\":true,\"cdiSpecDirs\":[\"/etc/cdi\",\"/var/run/cdi\"],\"drainExecSyncIOTimeout\":\"0s\",\"ignoreDeprecationWarnings\":null,\"containerdRootDir\":\"/var/lib/containerd\",\"containerdEndpoint\":\"/run/containerd/containerd.sock\",\"rootDir\":\"/var/lib/containerd/io.containerd.grpc.v1.cri\",\"stateDir\":\"/run/containerd/io.containerd.grpc.v1.cri\"}"
INFO[2024-12-19T12:10:45.315445198-06:00] loading plugin                                id=io.containerd.sandbox.controller.v1.podsandbox type=io.containerd.sandbox.controller.v1
INFO[2024-12-19T12:10:45.315723885-06:00] loading plugin                                id=io.containerd.grpc.v1.sandbox-controllers type=io.containerd.grpc.v1
INFO[2024-12-19T12:10:45.315748116-06:00] loading plugin                                id=io.containerd.grpc.v1.sandboxes type=io.containerd.grpc.v1
INFO[2024-12-19T12:10:45.315766307-06:00] loading plugin                                id=io.containerd.grpc.v1.snapshots type=io.containerd.grpc.v1
INFO[2024-12-19T12:10:45.315781243-06:00] loading plugin                                id=io.containerd.streaming.v1.manager type=io.containerd.streaming.v1
INFO[2024-12-19T12:10:45.315802289-06:00] loading plugin                                id=io.containerd.grpc.v1.streaming type=io.containerd.grpc.v1
INFO[2024-12-19T12:10:45.315819263-06:00] loading plugin                                id=io.containerd.grpc.v1.tasks type=io.containerd.grpc.v1
INFO[2024-12-19T12:10:45.315834860-06:00] loading plugin                                id=io.containerd.transfer.v1.local type=io.containerd.transfer.v1
INFO[2024-12-19T12:10:45.315884969-06:00] loading plugin                                id=io.containerd.grpc.v1.transfer type=io.containerd.grpc.v1
INFO[2024-12-19T12:10:45.315901697-06:00] loading plugin                                id=io.containerd.grpc.v1.version type=io.containerd.grpc.v1
INFO[2024-12-19T12:10:45.315917393-06:00] loading plugin                                id=io.containerd.monitor.container.v1.restart type=io.containerd.monitor.container.v1
INFO[2024-12-19T12:10:45.316288305-06:00] loading plugin                                id=io.containerd.tracing.processor.v1.otlp type=io.containerd.tracing.processor.v1
INFO[2024-12-19T12:10:45.316326547-06:00] skip loading plugin                           error="skip plugin: tracing endpoint not configured" id=io.containerd.tracing.processor.v1.otlp type=io.containerd.tracing.processor.v1
INFO[2024-12-19T12:10:45.316342044-06:00] loading plugin                                id=io.containerd.internal.v1.tracing type=io.containerd.internal.v1
INFO[2024-12-19T12:10:45.316395219-06:00] skip loading plugin                           error="skip plugin: tracing endpoint not configured" id=io.containerd.internal.v1.tracing type=io.containerd.internal.v1
INFO[2024-12-19T12:10:45.316409121-06:00] loading plugin                                id=io.containerd.grpc.v1.healthcheck type=io.containerd.grpc.v1
INFO[2024-12-19T12:10:45.316429941-06:00] loading plugin                                id=io.containerd.nri.v1.nri type=io.containerd.nri.v1
INFO[2024-12-19T12:10:45.316637619-06:00] runtime interface created                    
INFO[2024-12-19T12:10:45.316646499-06:00] created NRI interface                        
INFO[2024-12-19T12:10:45.316802803-06:00] loading plugin                                id=io.containerd.grpc.v1.cri type=io.containerd.grpc.v1
INFO[2024-12-19T12:10:45.316861173-06:00] Connect containerd service                   
INFO[2024-12-19T12:10:45.316916685-06:00] using experimental NRI integration - disable nri plugin to prevent this 
DEBU[2024-12-19T12:10:45.361817960-06:00] runtime "runc" supports recursive read-only mounts 
DEBU[2024-12-19T12:10:45.361848612-06:00] runtime "runc" supports CRI userns: false    
INFO[2024-12-19T12:10:45.362067964-06:00] Start subscribing containerd event           
INFO[2024-12-19T12:10:45.362162236-06:00] Start recovering state                       
INFO[2024-12-19T12:10:45.362503534-06:00] serving...                                    address=/run/containerd/containerd.sock.ttrpc
INFO[2024-12-19T12:10:45.362573489-06:00] serving...                                    address=/run/containerd/containerd.sock
DEBU[2024-12-19T12:10:45.363628821-06:00] Loaded sandbox {Metadata:{ID:e6a7d015b8474c9b22b5950aec621ef248b21a1207cb84ceeb6bf466e9b525b7 Name:busybox-sandbox_default_hdishd83djaidwnduwk28bcsb_1 Config:&PodSandboxConfig{Metadata:&PodSandboxMetadata{Name:busybox-sandbox,Uid:hdishd83djaidwnduwk28bcsb,Namespace:default,Attempt:1,},Hostname:,LogDirectory:,DnsConfig:nil,PortMappings:[]*PortMapping{},Labels:map[string]string{},Annotations:map[string]string{},Linux:&LinuxPodSandboxConfig{CgroupParent:,SecurityContext:nil,Sysctls:map[string]string{},Overhead:nil,Resources:nil,},Windows:nil,} NetNSPath:/var/run/netns/cni-18c5d1a2-6d5a-29c8-b147-999c4433ef39 IP:10.88.0.32 AdditionalIPs:[2001:4860:4860::20] RuntimeHandler: CNIResult:0xc000696370 ProcessLabel:} Status:0xc0001ee2a0 Container:0xc00068c0e0 Sandboxer:podsandbox NetNS:0xc0002daf30 StopCh:0xc0001f1200 Stats:<nil>} 
DEBU[2024-12-19T12:10:45.373067168-06:00] Loaded container {Metadata:{ID:95828b1b1c6ceaca528c2008c35d1941e03335e0464e0cc61421cb917b7d0977 Name:busybox_busybox-sandbox_default_hdishd83djaidwnduwk28bcsb_0 SandboxID:e6a7d015b8474c9b22b5950aec621ef248b21a1207cb84ceeb6bf466e9b525b7 Config:&ContainerConfig{Metadata:&ContainerMetadata{Name:busybox,Attempt:0,},Image:&ImageSpec{Image:busybox:1.35.0,Annotations:map[string]string{},UserSpecifiedImage:busybox:1.35.0,RuntimeHandler:,},Command:[top],Args:[],WorkingDir:,Envs:[]*KeyValue{},Mounts:[]*Mount{},Devices:[]*Device{},Labels:map[string]string{},Annotations:map[string]string{},LogPath:,Stdin:false,StdinOnce:false,Tty:false,Linux:&LinuxContainerConfig{Resources:nil,SecurityContext:nil,},Windows:nil,CDIDevices:[]*CDIDevice{},} ImageRef:sha256:0c00acac9c2794adfa8bb7b13ef38504300b505a043bf68dff7a00068dcc732b LogPath: StopSignal: ProcessLabel:} Status:0xc00036c100 Container:0xc0007881c0 IO:<nil> StopCh:0xc0001f0a68 IsStopSignaledWithTimeout:0xc0001581a0 Stats:<nil>} 
DEBU[2024-12-19T12:10:45.379176699-06:00] Loaded image "sha256:6270bb605e12e581514ada5fd5b3216f727db55dc87d5889c790e4c760683fee" 
DEBU[2024-12-19T12:10:45.380860492-06:00] Loaded image "docker.io/library/busybox:1.35.0" 
DEBU[2024-12-19T12:10:45.381672580-06:00] Loaded image "registry.k8s.io/pause:3.9"     
DEBU[2024-12-19T12:10:45.382735678-06:00] Loaded image "registry.k8s.io/pause@sha256:3d380ca8864549e74af4b29c10f9cb0956236dfb01c40ca076fb6c37253234db" 
DEBU[2024-12-19T12:10:45.383665155-06:00] Loaded image "sha256:e6f1816883972d4be47bd48879a08919b96afcd344132622e4d444987919323c" 
DEBU[2024-12-19T12:10:45.384496459-06:00] Loaded image "registry.k8s.io/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097" 
DEBU[2024-12-19T12:10:45.385428744-06:00] Loaded image "registry.k8s.io/pause:3.6"     
DEBU[2024-12-19T12:10:45.386619629-06:00] Loaded image "registry.k8s.io/coredns/coredns:v1.11.1" 
DEBU[2024-12-19T12:10:45.387503236-06:00] Loaded image "sha256:cbb01a7bd410dc08ba382018ab909a674fb0e48687f0c00797ed5bc34fcc6bb4" 
DEBU[2024-12-19T12:10:45.389029206-06:00] Loaded image "docker.io/library/busybox@sha256:02289a9972c5024cd2f083221f6903786e7f4cb4a9a9696f665d20dd6892e5d6" 
DEBU[2024-12-19T12:10:45.389899900-06:00] Loaded image "registry.k8s.io/coredns/coredns@sha256:1eeb4c7316bacb1d4c8ead65571cd92dd21e27359f0d4917f1a5822a73b75db1" 
DEBU[2024-12-19T12:10:45.392043091-06:00] Loaded image "docker.io/library/nginx@sha256:ea97e6aace270d82c73da382ea1a8c42d44b9dc11b55159104e21c49c687e7fb" 
DEBU[2024-12-19T12:10:45.394190622-06:00] Loaded image "docker.io/library/nginx:latest" 
DEBU[2024-12-19T12:10:45.395752857-06:00] Loaded image "sha256:0c00acac9c2794adfa8bb7b13ef38504300b505a043bf68dff7a00068dcc732b" 
DEBU[2024-12-19T12:10:45.397705565-06:00] Loaded image "sha256:247f7abff9f7097bbdab57df76fedd124d1e24a6ec4944fb5ef0ad128997ce05" 
INFO[2024-12-19T12:10:45.397993799-06:00] Start event monitor                          
INFO[2024-12-19T12:10:45.398018360-06:00] Start cni network conf syncer for default    
INFO[2024-12-19T12:10:45.398029840-06:00] Start streaming server                       
INFO[2024-12-19T12:10:45.398048749-06:00] Registered namespace "k8s.io" with NRI       
INFO[2024-12-19T12:10:45.398063782-06:00] runtime interface starting up...             
INFO[2024-12-19T12:10:45.398073226-06:00] starting plugins...                          
DEBU[2024-12-19T12:10:45.398596992-06:00] sd notification                               notified=false state="READY=1"
INFO[2024-12-19T12:10:45.398624749-06:00] containerd successfully booted in 0.106084s  
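
One way to compare a startup log against the one above is to extract the plugin ids that containerd reports loading and check for the CRI ones. This is a sketch for the text log format shown here only; `loadingPlugin` and `loadedPlugins` are hypothetical names, and a "loading plugin" line only shows that a load was attempted, not that the plugin started successfully.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// loadingPlugin matches text-format lines such as
//   INFO[<timestamp>] loading plugin    id=<plugin id> type=<plugin type>
// It deliberately does not match "skip loading plugin" lines.
var loadingPlugin = regexp.MustCompile(`^[A-Z]+\[[^\]]*\]\s+loading plugin\s+id=(\S+)`)

// loadedPlugins returns the set of plugin ids found on "loading plugin" lines.
func loadedPlugins(log string) map[string]bool {
	ids := make(map[string]bool)
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		if m := loadingPlugin.FindStringSubmatch(sc.Text()); m != nil {
			ids[m[1]] = true
		}
	}
	return ids
}

func main() {
	// Sample lines copied from the log above.
	sample := strings.Join([]string{
		`INFO[2024-12-19T12:10:45.304105633-06:00] skip loading plugin                           error="path /var/lib/containerd/io.containerd.snapshotter.v1.btrfs (ext4) must be a btrfs filesystem to be used with the btrfs snapshotter: skip plugin" id=io.containerd.snapshotter.v1.btrfs type=io.containerd.snapshotter.v1`,
		`INFO[2024-12-19T12:10:45.314824847-06:00] loading plugin                                id=io.containerd.cri.v1.runtime type=io.containerd.cri.v1`,
		`INFO[2024-12-19T12:10:45.316802803-06:00] loading plugin                                id=io.containerd.grpc.v1.cri type=io.containerd.grpc.v1`,
	}, "\n")

	ids := loadedPlugins(sample)
	for _, want := range []string{"io.containerd.cri.v1.runtime", "io.containerd.grpc.v1.cri"} {
		fmt.Printf("%s loading attempted: %v\n", want, ids[want])
	}
}
```

Logs emitted with `format = "json"` would need a JSON decoder reading the `msg` and `id` fields instead of this regular expression.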

@buroa

buroa commented Dec 19, 2024

@mikebrow It is controlled via Talos.

/etc/cri/containerd.toml:

version = 3

disabled_plugins = [
    "io.containerd.nri.v1.nri",
    "io.containerd.internal.v1.tracing",
    "io.containerd.snapshotter.v1.blockfile",
    "io.containerd.tracing.processor.v1.otlp",
]

imports = [
    "/etc/cri/conf.d/cri.toml",
]

[debug]
level = "info"
format = "json"

/etc/cri/conf.d/cri.toml:

## /etc/cri/conf.d/00-base.part
## /etc/cri/conf.d/01-registries.part
## /etc/cri/conf.d/20-customization.part

version = 3

[plugins]
  [plugins.'io.containerd.cri.v1.images']
    discard_unpacked_layers = false

    [plugins.'io.containerd.cri.v1.images'.registry]
      config_path = '/etc/cri/conf.d/hosts'

      [plugins.'io.containerd.cri.v1.images'.registry.configs]

  [plugins.'io.containerd.cri.v1.runtime']
    [plugins.'io.containerd.cri.v1.runtime'.containerd]
      [plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes]
        [plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runc]
          base_runtime_spec = '/etc/cri/conf.d/base-spec.json'

cri logs:

m0: {"level":"info","msg":"starting containerd","revision":"88aa2f531d6c2922003cc7929e51daf1c14caa0a","time":"2024-12-19T18:21:03.434500951Z","version":"v2.0.1"}
m0: {"id":"io.containerd.image-verifier.v1.bindir","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.440084583Z","type":"io.containerd.image-verifier.v1"}
m0: {"id":"io.containerd.internal.v1.opt","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.440124824Z","type":"io.containerd.internal.v1"}
m0: {"id":"io.containerd.warning.v1.deprecations","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.440382397Z","type":"io.containerd.warning.v1"}
m0: {"id":"io.containerd.content.v1.content","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.440405140Z","type":"io.containerd.content.v1"}
m0: {"id":"io.containerd.snapshotter.v1.native","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.440652921Z","type":"io.containerd.snapshotter.v1"}
m0: {"id":"io.containerd.snapshotter.v1.overlayfs","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.440679415Z","type":"io.containerd.snapshotter.v1"}
m0: {"id":"io.containerd.event.v1.exchange","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.440757543Z","type":"io.containerd.event.v1"}
m0: {"id":"io.containerd.monitor.task.v1.cgroups","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.440779112Z","type":"io.containerd.monitor.task.v1"}
m0: {"id":"io.containerd.metadata.v1.bolt","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.441000539Z","type":"io.containerd.metadata.v1"}
m0: {"level":"info","msg":"metadata content store policy set","policy":"shared","time":"2024-12-19T18:21:03.441017450Z"}
m0: {"id":"io.containerd.gc.v1.scheduler","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.665691156Z","type":"io.containerd.gc.v1"}
m0: {"id":"io.containerd.differ.v1.walking","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.665722325Z","type":"io.containerd.differ.v1"}
m0: {"id":"io.containerd.lease.v1.manager","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.665729328Z","type":"io.containerd.lease.v1"}
m0: {"id":"io.containerd.service.v1.containers-service","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.665833114Z","type":"io.containerd.service.v1"}
m0: {"id":"io.containerd.service.v1.content-service","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.665840920Z","type":"io.containerd.service.v1"}
m0: {"id":"io.containerd.service.v1.diff-service","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.665849280Z","type":"io.containerd.service.v1"}
m0: {"id":"io.containerd.service.v1.images-service","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.665854402Z","type":"io.containerd.service.v1"}
m0: {"id":"io.containerd.service.v1.introspection-service","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.665859332Z","type":"io.containerd.service.v1"}
m0: {"id":"io.containerd.service.v1.namespaces-service","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.665869284Z","type":"io.containerd.service.v1"}
m0: {"id":"io.containerd.service.v1.snapshots-service","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.665873486Z","type":"io.containerd.service.v1"}
m0: {"id":"io.containerd.shim.v1.manager","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.665877484Z","type":"io.containerd.shim.v1"}
m0: {"id":"io.containerd.runtime.v2.task","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.665882228Z","type":"io.containerd.runtime.v2"}
m0: {"id":"io.containerd.service.v1.tasks-service","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.666918419Z","type":"io.containerd.service.v1"}
m0: {"id":"io.containerd.grpc.v1.containers","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.667792234Z","type":"io.containerd.grpc.v1"}
m0: {"id":"io.containerd.grpc.v1.content","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.667860497Z","type":"io.containerd.grpc.v1"}
m0: {"id":"io.containerd.grpc.v1.diff","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.667922364Z","type":"io.containerd.grpc.v1"}
m0: {"id":"io.containerd.grpc.v1.events","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.667963744Z","type":"io.containerd.grpc.v1"}
m0: {"id":"io.containerd.grpc.v1.images","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.668002875Z","type":"io.containerd.grpc.v1"}
m0: {"id":"io.containerd.grpc.v1.introspection","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.668062520Z","type":"io.containerd.grpc.v1"}
m0: {"id":"io.containerd.grpc.v1.leases","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.668102305Z","type":"io.containerd.grpc.v1"}
m0: {"id":"io.containerd.grpc.v1.namespaces","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.668149195Z","type":"io.containerd.grpc.v1"}
m0: {"id":"io.containerd.sandbox.store.v1.local","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.668190211Z","type":"io.containerd.sandbox.store.v1"}
m0: {"id":"io.containerd.cri.v1.images","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.668235797Z","type":"io.containerd.cri.v1"}
m0: {"level":"info","msg":"Get image filesystem path \"/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs\" for snapshotter \"overlayfs\"","time":"2024-12-19T18:21:03.668593661Z"}
m0: {"level":"info","msg":"Start snapshots syncer","time":"2024-12-19T18:21:03.668633781Z"}
m0: {"id":"io.containerd.cri.v1.runtime","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.668705890Z","type":"io.containerd.cri.v1"}
m0: {"config":"{\"containerd\":{\"defaultRuntimeName\":\"runc\",\"runtimes\":{\"runc\":{\"runtimeType\":\"io.containerd.runc.v2\",\"runtimePath\":\"\",\"PodAnnotations\":null,\"ContainerAnnotations\":null,\"options\":{\"BinaryName\":\"\",\"CriuImagePath\":\"\",\"CriuWorkPath\":\"\",\"IoGid\":0,\"IoUid\":0,\"NoNewKeyring\":false,\"Root\":\"\",\"ShimCgroup\":\"\"},\"privileged_without_host_devices\":false,\"privileged_without_host_devices_all_devices_allowed\":false,\"baseRuntimeSpec\":\"/etc/cri/conf.d/base-spec.json\",\"cniConfDir\":\"\",\"cniMaxConfNum\":0,\"snapshotter\":\"\",\"sandboxer\":\"podsandbox\",\"io_type\":\"\"}},\"ignoreBlockIONotEnabledErrors\":false,\"ignoreRdtNotEnabledErrors\":false},\"cni\":{\"binDir\":\"/opt/cni/bin\",\"confDir\":\"/etc/cni/net.d\",\"maxConfNum\":1,\"setupSerially\":false,\"confTemplate\":\"\",\"ipPref\":\"\",\"useInternalLoopback\":false},\"enableSelinux\":false,\"selinuxCategoryRange\":1024,\"maxContainerLogSize\":16384,\"disableApparmor\":false,\"restrictOOMScoreAdj\":false,\"disableProcMount\":false,\"unsetSeccompProfile\":\"\",\"tolerateMissingHugetlbController\":true,\"disableHugetlbController\":true,\"device_ownership_from_security_context\":false,\"ignoreImageDefinedVolumes\":false,\"netnsMountsUnderStateDir\":false,\"enableUnprivilegedPorts\":true,\"enableUnprivilegedICMP\":true,\"enableCDI\":true,\"cdiSpecDirs\":[\"/etc/cdi\",\"/var/run/cdi\"],\"drainExecSyncIOTimeout\":\"0s\",\"ignoreDeprecationWarnings\":null,\"containerdRootDir\":\"/var/lib/containerd\",\"containerdEndpoint\":\"/run/containerd/containerd.sock\",\"rootDir\":\"/var/lib/containerd/io.containerd.grpc.v1.cri\",\"stateDir\":\"/run/containerd/io.containerd.grpc.v1.cri\"}","level":"info","msg":"starting cri plugin","time":"2024-12-19T18:21:03.669557614Z"}
m0: {"id":"io.containerd.podsandbox.controller.v1.podsandbox","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.672698307Z","type":"io.containerd.podsandbox.controller.v1"}
m0: {"id":"io.containerd.sandbox.controller.v1.shim","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.674117698Z","type":"io.containerd.sandbox.controller.v1"}
m0: {"id":"io.containerd.grpc.v1.sandbox-controllers","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.674194517Z","type":"io.containerd.grpc.v1"}
m0: {"id":"io.containerd.grpc.v1.sandboxes","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.674229389Z","type":"io.containerd.grpc.v1"}
m0: {"id":"io.containerd.grpc.v1.snapshots","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.674244996Z","type":"io.containerd.grpc.v1"}
m0: {"id":"io.containerd.streaming.v1.manager","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.674258253Z","type":"io.containerd.streaming.v1"}
m0: {"id":"io.containerd.grpc.v1.streaming","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.674274613Z","type":"io.containerd.grpc.v1"}
m0: {"id":"io.containerd.grpc.v1.tasks","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.674287879Z","type":"io.containerd.grpc.v1"}
m0: {"id":"io.containerd.transfer.v1.local","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.674305579Z","type":"io.containerd.transfer.v1"}
m0: {"id":"io.containerd.grpc.v1.transfer","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.674336693Z","type":"io.containerd.grpc.v1"}
m0: {"id":"io.containerd.grpc.v1.version","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.674353904Z","type":"io.containerd.grpc.v1"}
m0: {"id":"io.containerd.monitor.container.v1.restart","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.674367906Z","type":"io.containerd.monitor.container.v1"}
m0: {"id":"io.containerd.ttrpc.v1.otelttrpc","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.674980001Z","type":"io.containerd.ttrpc.v1"}
m0: {"id":"io.containerd.grpc.v1.healthcheck","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.674986527Z","type":"io.containerd.grpc.v1"}
m0: {"id":"io.containerd.grpc.v1.cri","level":"info","msg":"loading plugin","time":"2024-12-19T18:21:03.674990866Z","type":"io.containerd.grpc.v1"}
m0: {"level":"info","msg":"Connect containerd service","time":"2024-12-19T18:21:03.674994447Z"}
m0: {"level":"info","msg":"NRI service not found, NRI support disabled","time":"2024-12-19T18:21:03.675009240Z"}
m0: {"level":"info","msg":"Start subscribing containerd event","time":"2024-12-19T18:21:03.692998002Z"}
m0: {"level":"info","msg":"Start recovering state","time":"2024-12-19T18:21:03.693225460Z"}
m0: {"address":"/run/containerd/containerd.sock.ttrpc","level":"info","msg":"serving...","time":"2024-12-19T18:21:03.693346175Z"}
m0: {"address":"/run/containerd/containerd.sock","level":"info","msg":"serving...","time":"2024-12-19T18:21:03.693470428Z"}
m0: {"error":"unable to find sandbox \"198dd57200b65202ebf4a0de83772411e2fff2584b19ab2e4f9c2812ebdd4835\": not found","level":"error","msg":"failed to recover sandbox state","sandbox":"198dd57200b65202ebf4a0de83772411e2fff2584b19ab2e4f9c2812ebdd4835","time":"2024-12-19T18:21:03.722923325Z"}
m0: {"error":"unable to find sandbox \"3943ce14bb4bf468a78d88e0ec55146b340cb7fec81e06881fc97fa501f81f2b\": not found","level":"error","msg":"failed to recover sandbox state","sandbox":"3943ce14bb4bf468a78d88e0ec55146b340cb7fec81e06881fc97fa501f81f2b","time":"2024-12-19T18:21:03.722980125Z"}
m0: {"error":"unable to find sandbox \"4bab9618cadef9b6bd4c951a161aa6216c2cb47857e92da555ec5e85490ddd75\": not found","level":"error","msg":"failed to recover sandbox state","sandbox":"4bab9618cadef9b6bd4c951a161aa6216c2cb47857e92da555ec5e85490ddd75","time":"2024-12-19T18:21:03.723015409Z"}
m0: {"error":"unable to find sandbox \"9fe74455d7c20887915c57eba694d1b5f94058c47a399b8bb9fdfe8b9113b00d\": not found","level":"error","msg":"failed to recover sandbox state","sandbox":"9fe74455d7c20887915c57eba694d1b5f94058c47a399b8bb9fdfe8b9113b00d","time":"2024-12-19T18:21:03.723099299Z"}
m0: {"error":"unable to find sandbox \"ab2842b2eaa8b921cd3765b9e9742aae7e4cb28d6bf47d2243e7b1eadf9088ca\": not found","level":"error","msg":"failed to recover sandbox state","sandbox":"ab2842b2eaa8b921cd3765b9e9742aae7e4cb28d6bf47d2243e7b1eadf9088ca","time":"2024-12-19T18:21:03.723130600Z"}
m0: {"error":"unable to find sandbox \"ad6748a056c8e9039a5f99abd04ddc1c2eebaab687a4fde5b87d38f5382012ff\": not found","level":"error","msg":"failed to recover sandbox state","sandbox":"ad6748a056c8e9039a5f99abd04ddc1c2eebaab687a4fde5b87d38f5382012ff","time":"2024-12-19T18:21:03.723164479Z"}
m0: {"error":"unable to find sandbox \"cacad76b3e8ffa31b5eafe101a2da4a1459082a733d1ce539cf73510ff75b675\": not found","level":"error","msg":"failed to recover sandbox state","sandbox":"cacad76b3e8ffa31b5eafe101a2da4a1459082a733d1ce539cf73510ff75b675","time":"2024-12-19T18:21:03.723193693Z"}
m0: {"level":"info","msg":"Start event monitor","time":"2024-12-19T18:21:04.026470728Z"}
m0: {"level":"info","msg":"Start cni network conf syncer for default","time":"2024-12-19T18:21:04.026575399Z"}
m0: {"level":"info","msg":"Start streaming server","time":"2024-12-19T18:21:04.026596422Z"}
m0: {"level":"info","msg":"containerd successfully booted in 0.592766s","time":"2024-12-19T18:21:04.026640435Z"}
m0: {"address":"unix:///run/containerd/s/962cf41f2ac48833a2be13852f068a1337459c1e8cb491130aa46184ec2f08a1","level":"info","msg":"connecting to shim kubelet","namespace":"system","protocol":"ttrpc","time":"2024-12-19T18:21:14.050198476Z","version":3}
m0: {"address":"unix:///run/containerd/s/c5a0037f7a540abcaab874cf70b1962448b8b6f36e28be60570eb3465684e0e5","level":"info","msg":"connecting to shim etcd","namespace":"system","protocol":"ttrpc","time":"2024-12-19T18:21:14.050276262Z","versio

@mikebrow
Member

Well, now that you have containerd up, you should be able to get kubelet to reconnect.

@buroa

buroa commented Dec 19, 2024

@mikebrow Once containerd gets into the bugged state, kubelet can never reconnect to it. It spams "container runtime is down". I have provided as much information as I can here; for now I will wait until others receive this update and start to see issues as well.

@mikebrow
Member

Perhaps related: siderolabs/talos#9496

@buroa

buroa commented Dec 19, 2024

Just to note: this only happens when I reboot with multus enabled. If I disable multus and reboot, the cluster comes up fine. Once the cluster settles, I can enable multus and nothing breaks. So it's a race somewhere on boot between cilium, multus, and containerd.

@mikebrow
Member

mikebrow commented Dec 19, 2024

It is possible you have multiple problems, as the sandbox store is also reporting errors in your cri log. Note: containerd/containerd#10848 (comment)

@buroa

buroa commented Dec 19, 2024

Those errors happen on every Kubernetes cluster I have ever touched; they just look scary. Any time you reboot a node, it takes a minute or two for things to settle.

@mikebrow
Member

We have a config field in the CNI setup, setup_serially; set that to true.

@mikebrow
Member

    [plugins.'io.containerd.cri.v1.runtime'.containerd]
      default_runtime_name = 'runc'
      ignore_blockio_not_enabled_errors = false
      ignore_rdt_not_enabled_errors = false

      [plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes]
        [plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runc]
          runtime_type = 'io.containerd.runc.v2'
          runtime_path = ''
          pod_annotations = []
          container_annotations = []
          privileged_without_host_devices = false
          privileged_without_host_devices_all_devices_allowed = false
          base_runtime_spec = ''
          cni_conf_dir = ''
          cni_max_conf_num = 0
          snapshotter = ''
          sandboxer = 'podsandbox'

          [plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runc.options]
            BinaryName = ''
            CriuImagePath = ''
            CriuWorkPath = ''
            IoGid = 0
            IoUid = 0
            NoNewKeyring = false
            Root = ''
            ShimCgroup = ''

    [plugins.'io.containerd.cri.v1.runtime'.cni]
      bin_dir = '/opt/cni/bin'
      conf_dir = '/etc/cni/net.d'
      max_conf_num = 1
      setup_serially = false
      conf_template = ''
      ip_pref = ''

^ the setup_serially field in this cni section

@buroa

buroa commented Dec 19, 2024

@mikebrow On it, moment.

Update: still broken :(

@mikebrow
Member

Anyhow, that would rule out a timing conflict from setting the networks up in parallel.

@MikeZappa87
Contributor Author

If adding Multus is causing the issue, I would try to confirm that it’s a Multus issue. One way to do this is to run cri-o, cilium, and Multus. If that doesn’t work, I would try cri-o and cilium alone. If that works, it’s a Multus issue, and in that case I would reach out to them as well.

@buroa

buroa commented Dec 23, 2024

@MikeZappa87 @mikebrow Testing this code with https://github.com/sasha-s/go-deadlock, I can confirm there is a potential deadlock that was not present in v1.1.10.

=== RUN   TestLibCNIType020
POTENTIAL DEADLOCK: Recursive locking:
current goroutine 22 lock 0x1400010c100
cni.go:159 go-cni.(*libcni).Networks { c.RLock() } <<<<<
cni.go:158 go-cni.(*libcni).Networks { func (c *libcni) Networks() []*Network { }
cni.go:226 go-cni.(*libcni).attachNetworks { var firstError error }
cni.go:175 go-cni.(*libcni).Setup { result, err := c.attachNetworks(ctx, ns) }
cni_test.go:89 go-cni.TestLibCNIType020 {  }

Previous place where the lock was grabbed (same goroutine)
cni.go:169 go-cni.(*libcni).Setup { c.RLock() } <<<<<
cni.go:168 go-cni.(*libcni).Setup { } }
cni_test.go:89 go-cni.TestLibCNIType020 {  }

FAIL    github.com/containerd/go-cni    0.012s
FAIL

@MikeZappa87
Contributor Author

MikeZappa87 commented Dec 23, 2024

@mikebrow it's probably the double lock in ready? I think one of my original commits had the ready without a lock after the function locks. So:

ready ...
Lock

Do something

buroa mentioned this pull request Dec 23, 2024
buroa added a commit to buroa/go-cni that referenced this pull request Dec 23, 2024
Fixes potential deadlock in containerd#123

Signed-off-by: Steven Kreitzer <[email protected]>
@buroa

buroa commented Dec 23, 2024

@MikeZappa87 The problem is the additional locks added. I'm not entirely sure why you added them, as they were not present before. You pretty much renamed the Status() function here to ready(), and then added additional lock mutexes inside Setup(), SetupSerially(), Remove(), and Check().

I removed the additional locks and confirm this removes the potential deadlock here: #125

@MikeZappa87
Contributor Author

@MikeZappa87 The problem is the additional locks added. I'm not entirely sure why you added them, as they were not present before. You pretty much renamed the Status() function here to ready(), and then added additional lock mutexes inside Setup(), SetupSerially(), Remove(), and Check().

I removed the additional locks and confirm this removes the potential deadlock here: #125

I didn’t add them willingly :-) go ahead and fix the problem. I’m not going to touch code until next year.

@smira
Contributor

smira commented Dec 24, 2024

Please see #126 for another attempt to fix this issue, with deadlock tests

smira added a commit to smira/pkgs that referenced this pull request Dec 24, 2024
smira added a commit to smira/pkgs that referenced this pull request Dec 26, 2024
Labels
area/cri Container Runtime Interface (CRI) impact/changelog
5 participants