[DRAFT] Implement CNI Status #114
Conversation
cni.go
Outdated
// Status checks the status of the cni initialization
Status() error
// Status executes the status verb of the cni plugin
Status(ctx context.Context) error
Great, the concept already existed.
Yes. Status detected whether the minimum number of networks was met. In the containerd cri-plugin, it's hard-coded to 2 networks: loopback and the default CNI network configuration. I renamed the existing method to 'Ready' and have Status call the status function out of libcni.
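For illustration, the split could look roughly like this. A minimal sketch meant to slot into go-cni's cni.go (so the usual imports are in scope), assuming the Network type still wraps its libcni handle and config list and that libcni exposes GetStatusNetworkList for the STATUS verb; the field names here are assumptions, not the final implementation:

```go
// Status delegates the new CNI STATUS verb for a single network to libcni.
// Sketch only: n.cni and n.config are assumed to be the existing libcni
// handle and NetworkConfigList fields on go-cni's Network type.
func (n *Network) Status(ctx context.Context) error {
	return n.cni.GetStatusNetworkList(ctx, n.config)
}

// Ready keeps the old behavior of the former Status method: report readiness
// once the minimum number of network configurations has been loaded
// (hard-coded to 2 in the containerd cri-plugin: loopback + default conf).
func (c *libcni) Ready() error {
	c.RLock()
	defer c.RUnlock()
	if len(c.networks) < c.networkCount {
		return ErrCNINotInitialized
	}
	return nil
}
```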
cni.go
Outdated
}

for _, network := range c.Networks() {
	err = network.Status(ctx)
This is something you may need to document, as it means that if any of the networks is not ready, the status is not ready.
You may want to do the opposite: if just one network is ready then the status is ready 🤷
For the cri-plugin case, we load two networks. One is the loopback plugin, loaded separately via WithLoNetwork:
Line 88 in 6603d5b
func WithLoNetwork(c *libcni) error {
The other 'network' is the first CNI network configuration in /etc/cni/net.d.
Since this is hard-coded in containerd, we probably don't need to worry about this. However, we could have configurable failure behavior: if one fails, they all fail; or if one is successful, the node has the network-ready condition set. We could also return a structure that has the CNI network name and the error, and Status would return a slice of that structure.
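For example, a per-network result type would let go-cni report everything and leave the aggregation policy (AND, OR, or something configurable) to the caller. A sketch along the lines of the struct that appears later in this thread, with the Ready/Networks helpers assumed from the existing API:

```go
// NetworkStatus pairs a loaded network with the error (if any) returned by
// its STATUS call, so callers can decide how to aggregate readiness.
type NetworkStatus struct {
	Network *Network
	Status  error
}

// Status runs the STATUS verb against every loaded network and returns the
// per-network outcome instead of deciding node readiness itself.
func (c *libcni) Status(ctx context.Context) ([]*NetworkStatus, error) {
	if err := c.Ready(); err != nil {
		return nil, err
	}
	var statuses []*NetworkStatus
	for _, network := range c.Networks() {
		statuses = append(statuses, &NetworkStatus{
			Network: network,
			Status:  network.Status(ctx),
		})
	}
	return statuses, nil
}
```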
@mikebrow thoughts?
If one is the loopback plugin, which is always going to be ready and is not able to provide connectivity on its own, then it makes sense for ready to be an AND of all plugins instead of an OR, as you will need the loopback AND the other CNI to be able to provide networking to the pods.
I was wondering if this should be left up to the caller of the Status function? In this case, for the containerd cri plugin, I think AND makes sense; however, go-cni is used by others as well, so we probably shouldn't dictate behavior.
It should be AND until we have a config or parameter from the client indicating "optional"
@aojea Since the kubelet calls the CRI-API Status RPC (https://github.com/containerd/containerd/blob/6333db77015764f7d45b407094a3bc2cc05c239b/internal/cri/server/status.go#L49), Status returning an error will set the node to not ready. I did notice that after about 2 hours, the pods with a netns started going to Completed. It might have been my test, however I think we should discuss.
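For context, this is roughly how a CRI implementation can surface that check as the NetworkReady condition the kubelet looks at. An illustrative sketch assuming the simple `Status(ctx) error` form from the first diff; the helper name and reason string are mine, not containerd's actual code:

```go
import (
	"context"

	cni "github.com/containerd/go-cni"
	runtime "k8s.io/cri-api/pkg/apis/runtime/v1"
)

// networkCondition maps the CNI status check onto the CRI NetworkReady
// runtime condition returned from the Status RPC. When Status is false the
// kubelet treats the node's network as not ready.
func networkCondition(ctx context.Context, netPlugin cni.CNI) *runtime.RuntimeCondition {
	cond := &runtime.RuntimeCondition{
		Type:   runtime.NetworkReady,
		Status: true,
	}
	if err := netPlugin.Status(ctx); err != nil {
		cond.Status = false
		cond.Reason = "NetworkPluginNotReady" // illustrative reason string
		cond.Message = err.Error()
	}
	return cond
}
```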
Let's do it with a v2 API, since v2 is not compatible with v1...
// Status checks the status of the cni initialization
Status() error
// Status executes the status verb of the cni plugin
Status(ctx context.Context) ([]*NetworkStatus, error)
StatusDetail(ctx)...?
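If the detailed signature wins out, a caller such as the cri-plugin could still get AND semantics on top of it. A short illustrative sketch; the helper name and the use of errors.Join are assumptions, not part of this PR:

```go
import (
	"context"
	"errors"

	cni "github.com/containerd/go-cni"
)

// allNetworksReady applies AND semantics over the per-network results:
// the node is network-ready only if every loaded network reports ready.
func allNetworksReady(ctx context.Context, c cni.CNI) error {
	statuses, err := c.Status(ctx)
	if err != nil {
		return err
	}
	var errs []error
	for _, s := range statuses {
		if s.Status != nil {
			errs = append(errs, s.Status)
		}
	}
	return errors.Join(errs...) // nil when every network is ready
}
```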
}

return networks, nil
}
check.. eol at eof
type NetworkStatus struct {
	Network *Network
	Status  error
}
...
Hi @MikeZappa87, do you plan to work on this? If not, I can try to work on it as a new contributor.
Hello, I need to finish this up. I started it, and we needed to get a few PRs merged prior to finishing it. If you want to take it on, feel free! :-)
Thanks, I'll give it a shot!
Hi @MikeZappa87, a couple of basic questions for you if you have time:
More generally, it would be helpful to know what work is remaining and whether we need to wait for any other PRs to be merged first. Thanks!
That is the bug we want to fix, though CNI implementations need to implement CNI STATUS wisely, as with great power comes great responsibility.
@aojea could you say a bit more about what the "bug" here is? Is it that "the pods with netns started going to completed" would be a kubelet bug?
@MikeZappa87 can we wake this back up?
@squeed yes we can!
The recent CNI specification added a couple of new verbs. This PR implements Status; GC and the other verbs will be added in future PRs.
https://github.com/containernetworking/cni/blob/main/SPEC.md
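For illustration, end-to-end usage against this draft could look roughly like the go-cni README's existing setup plus the new call. A sketch assuming the per-network result signature from the later diff; everything about the new verb is this draft's API, not something released:

```go
package main

import (
	"context"
	"log"

	cni "github.com/containerd/go-cni"
)

func main() {
	// Initialize go-cni with a minimum of two networks (loopback plus the
	// default conf from /etc/cni/net.d), mirroring the containerd cri-plugin.
	l, err := cni.New(cni.WithMinNetworkCount(2))
	if err != nil {
		log.Fatalf("failed to initialize cni: %v", err)
	}
	if err := l.Load(cni.WithLoNetwork, cni.WithDefaultConf); err != nil {
		log.Fatalf("failed to load cni configuration: %v", err)
	}

	// New in this PR: run the CNI STATUS verb against every loaded network.
	statuses, err := l.Status(context.Background())
	if err != nil {
		log.Fatalf("failed to run STATUS: %v", err)
	}
	for _, s := range statuses {
		if s.Status != nil {
			log.Printf("a loaded network is not ready: %v", s.Status)
		}
	}
}
```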