bug: rdma exclusive handling #603
Pull Request Test Coverage Report for Build 11577952041 (Coveralls)
pkg/devices/rdma.go
Outdated
	default:
		return false
	}
	// Checking for netlink param for exclusive RDMA use case
Can you add more information here on why we need to check the netlink param in this case? (Requested by Sebastian.)
Done.
@SchSeba PTAL
When an RDMA device in exclusive mode is in use by a Pod, the DP did not report it as a resource after a DP restart.

The following changes are introduced in RdmaSpec:
- isRdma: if no RDMA resources are found, check whether the netlink "enable_rdma" param is available.
- GetRdmaDeviceSpec: the device specs are now retrieved dynamically rather than at the discovery stage as before.

Computing the RDMA specs dynamically instead of at discovery solves the following scenario for exclusive mode:
- Discover RDMA device
- Allocate to Pod (resources are hidden on host)
- Restart DP pod
- Deallocate
- Reallocate

Fixes k8snetworkplumbingwg#565

Signed-off-by: Fred Rolland <[email protected]>
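For readers following the thread, here is a minimal Go sketch of the two behaviours described in the commit message. It is not the code in pkg/devices/rdma.go: the `rdmaLookup` interface, the `fakeHost` type, the `DeviceSpec` struct, the PCI address, and the device path are all hypothetical stand-ins chosen for illustration. Only the names `IsRdma`/`isRdma`, `GetRdmaDeviceSpec`, and the "enable_rdma" param come from the commit message.

```go
package main

import "fmt"

// DeviceSpec is a hypothetical stand-in for the plugin's device spec type.
type DeviceSpec struct {
	HostPath      string
	ContainerPath string
}

// rdmaLookup is a hypothetical abstraction over the host queries.
type rdmaLookup interface {
	// RdmaCharDevices returns the RDMA char devices currently visible on the host.
	RdmaCharDevices(pciAddr string) []string
	// HasEnableRdmaParam reports whether the "enable_rdma" param is available.
	HasEnableRdmaParam(pciAddr string) bool
}

type rdmaSpec struct {
	pciAddr string
	lookup  rdmaLookup
}

// IsRdma first looks for visible RDMA resources. If there are none (for example
// because they are hidden inside a Pod in exclusive mode), it falls back to the
// "enable_rdma" param check.
func (r *rdmaSpec) IsRdma() bool {
	if len(r.lookup.RdmaCharDevices(r.pciAddr)) > 0 {
		return true
	}
	return r.lookup.HasEnableRdmaParam(r.pciAddr)
}

// GetRdmaDeviceSpec computes the specs on every call instead of caching them
// at discovery time, so devices that reappear on the host are picked up.
func (r *rdmaSpec) GetRdmaDeviceSpec() []DeviceSpec {
	paths := r.lookup.RdmaCharDevices(r.pciAddr)
	specs := make([]DeviceSpec, 0, len(paths))
	for _, p := range paths {
		specs = append(specs, DeviceSpec{HostPath: p, ContainerPath: p})
	}
	return specs
}

// fakeHost simulates a host whose RDMA devices can be hidden by a Pod.
type fakeHost struct {
	visible    bool
	enableRdma bool
}

func (h *fakeHost) RdmaCharDevices(string) []string {
	if !h.visible {
		return nil
	}
	return []string{"/dev/infiniband/uverbs0"} // example path only
}

func (h *fakeHost) HasEnableRdmaParam(string) bool { return h.enableRdma }

func main() {
	// DP restarts while the device is allocated to a Pod (hidden on host).
	host := &fakeHost{visible: false, enableRdma: true}
	spec := &rdmaSpec{pciAddr: "0000:03:00.2", lookup: host}
	fmt.Println("reported as RDMA while hidden:", spec.IsRdma())
	fmt.Println("specs while hidden:", len(spec.GetRdmaDeviceSpec()))

	// Pod is deleted: the device is visible again and can be reallocated.
	host.visible = true
	fmt.Println("specs after deallocation:", len(spec.GetRdmaDeviceSpec()))
}
```

Putting the host queries behind an interface is only a convenience for this sketch: it lets the discover, allocate, restart, deallocate, reallocate sequence be simulated without real hardware.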
@SchSeba can you PTAL?
I tested this one and also added a functional test covering it in the operator: https://github.com/k8snetworkplumbingwg/sriov-network-operator/pull/799/files#diff-909069834ea269a01a51f28b8830efebec16799766767ebcd01b58f966ddc5c5R226 (a real mlx device is needed for the test to run).