Support for allocating all VFs from a single PF (bin packing) #255
Comments
@sseetharaman6 you're right that kubelet randomly chooses healthy devices from the advertised pool (sriov_foo), so if the VFs from all PFs are grouped as one pool, there is no guarantee which PF an allocated VF comes from. You might want to group the VFs from a single PF as their own pool and request devices directly from that pool.
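For illustration, a minimal device plugin ConfigMap along those lines might look like the sketch below, with one resource pool per PF. The PF interface names (ens1f0, ens1f1) and the pool names are placeholders for this sketch, not values taken from this issue.

```yaml
# Hypothetical sriov-network-device-plugin config: one pool per PF, so a pod
# can pin its VF requests to a specific PF. Interface names are placeholders.
apiVersion: v1
kind: ConfigMap
metadata:
  name: sriovdp-config
  namespace: kube-system
data:
  config.json: |
    {
      "resourceList": [
        { "resourceName": "sriov_pf_a", "selectors": { "pfNames": ["ens1f0"] } },
        { "resourceName": "sriov_pf_b", "selectors": { "pfNames": ["ens1f1"] } }
      ]
    }
```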
Yeah, but say I have 2 VFs per PF and request 3 VFs in the pod spec; advertising each PF as its own resource will make this pod unschedulable.
In this case, you will need to put two resource requests in the pod spec: the first requesting 2 VFs from one pool, the second requesting 1 VF from the other. I understand this may not be exactly what you asked for.
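As a rough sketch of that suggestion, using the hypothetical per-PF pools above and assuming the default intel.com resource prefix:

```yaml
# Pod requesting 2 VFs from PF-A's pool and 1 VF from PF-B's pool.
apiVersion: v1
kind: Pod
metadata:
  name: sriov-consumer
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "infinity"]
    resources:
      requests:
        intel.com/sriov_pf_a: "2"
        intel.com/sriov_pf_b: "1"
      limits:
        intel.com/sriov_pf_a: "2"
        intel.com/sriov_pf_b: "1"
```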
Thanks for linking the reference!
Facing the same issue.
@zshi-redhat we should be able to implement this on a per-pool level, with some device pools marked as "packers" and others as "spreaders". Is there anything else the preferred allocation could be used for that might fit in, or be more relevant even?
@killianmuldoon I think we could have two, as you already mentioned: one for allocating the VFs evenly across multiple PFs (in the same pool), the other for allocating all VFs from one PF until it's exhausted, then moving to the next PF.
@zshi-redhat - this approach makes sense to me. Is there work underway to add an interface for this?
I do not think anyone is working on this. It will be discussed at the next network and resource management meeting.
Update: this was discussed at Monday's meeting, and we agreed to support this new API update in the SR-IOV device plugin. However, it is not currently assigned to anyone, so please feel free to take it if you are interested in working on it.
@sseetharaman6 FYI, this feature is added via PR #267 if you'd like to do some testing or have any suggestions. |
First scenario: I have two PFs (PF-A, PF-B), and I define two resources (R-A, R-B). Then I create a pod that requests both resources (R-A: 1, R-B: 1).
If you need the Pod to have two additional network interfaces configured by a supporting CNI plugin, then IIRC only the second scenario will work. If you just want two VFs allocated to the pod (with no CNI config required), then sending traffic from different PFs (different uplinks) would probably be faster. There is also another consideration that affects performance: NUMA alignment for memory, CPU, and PCI.
For the first scenario, couldn't you just define two NADs (net-a, net-b), each with an associated DP selector (pfNames) selecting an individual PF? Then put net-a and net-b in your network request annotation, and you get a VF from each PF. What am I missing?
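A rough sketch of that setup, assuming the hypothetical per-PF resource pools from the earlier example and an abbreviated SR-IOV CNI config (not taken from this issue):

```yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: net-a
  annotations:
    k8s.v1.cni.cncf.io/resourceName: intel.com/sriov_pf_a
spec:
  config: '{ "cniVersion": "0.3.1", "type": "sriov", "name": "net-a" }'
---
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: net-b
  annotations:
    k8s.v1.cni.cncf.io/resourceName: intel.com/sriov_pf_b
spec:
  config: '{ "cniVersion": "0.3.1", "type": "sriov", "name": "net-b" }'
---
# Pod attaching to both networks; each attachment draws a VF from its own pool.
apiVersion: v1
kind: Pod
metadata:
  name: two-pf-pod
  annotations:
    k8s.v1.cni.cncf.io/networks: net-a, net-b
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "infinity"]
    resources:
      requests:
        intel.com/sriov_pf_a: "1"
        intel.com/sriov_pf_b: "1"
      limits:
        intel.com/sriov_pf_a: "1"
        intel.com/sriov_pf_b: "1"
```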
Correction, I meant the first scenario. Having two network-attachment-definitions, each associated with a different resource, will work, since Multus would need to provide each attachment with a different DeviceID from the same resource on a CmdAdd call.
I have solved it for the first scenario! Thanks! @adrianchiris @martinkennelly
What would you like to be added?
If I have multiple PFs configured for SR-IOV and advertised as the same resource pool (`sriov_foo`), is it possible to enforce allocation of all VFs from a single PF before VFs from other PFs are allocated? It seems like `pluginapi.AllocateRequest` is picking device IDs at random, so I am not sure if this is possible / can be supported.
What is the use case for this feature / enhancement?