Infiniband support #10906

Open
TheQuantumFractal opened this issue Sep 13, 2024 · 4 comments
Labels
type: enhancement New feature or request

Comments

@TheQuantumFractal

Description

I'm looking to do RDMA within gVisor containers and was curious if you support Infiniband or if this would be on the roadmap? Thanks!

Is this feature related to a specific bug?

No response

Do you have a specific solution in mind?

No response

TheQuantumFractal added the "type: enhancement" label Sep 13, 2024
@kevinGC
Collaborator

kevinGC commented Sep 13, 2024

There's no specific support for Infiniband. Can you help me understand what support would be needed? gVisor containers typically communicate through a virtual device (often veth). On a machine with an Infiniband NIC, packets would switch from veth to NIC without issue as far as I understand.

I don't know much about RDMA, but there's no special support for it in gVisor. I'm not sure whether it's needed, or whether having the underlying host support it is enough.

@ekzhang
Contributor

ekzhang commented Sep 25, 2024

Hi @kevinGC, we think it would involve supporting the Infiniband verbs in libibverbs, which are operations that let you send and receive data while bypassing the kernel networking stack.

There is a device called /dev/infiniband/uverbs0 but none of us are familiar with the internals yet unfortunately.
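
As a first sanity check of whether a sandboxed container can even see the verbs device, one could stat the path and confirm it is a character device. A minimal sketch in Go, assuming the device path mentioned above; `isCharDevice` is a hypothetical helper, and a positive result says nothing about whether verbs calls would actually work under gVisor.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// isCharDevice reports whether path exists and is a character device.
// A missing path is reported as (false, nil) rather than as an error.
func isCharDevice(path string) (bool, error) {
	info, err := os.Stat(path)
	if errors.Is(err, fs.ErrNotExist) {
		return false, nil
	}
	if err != nil {
		return false, err
	}
	return info.Mode()&os.ModeCharDevice != 0, nil
}

func main() {
	// Path taken from the comment above; other hosts may number the device differently.
	const dev = "/dev/infiniband/uverbs0"
	ok, err := isCharDevice(dev)
	if err != nil {
		fmt.Fprintln(os.Stderr, "stat failed:", err)
		os.Exit(1)
	}
	fmt.Printf("%s present as character device: %v\n", dev, ok)
}
```

Running this once on the host and once inside the container would show whether the device node is exposed to the sandbox at all.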

We've seen FreeFlow (https://github.com/Microsoft/Freeflow) from Microsoft and would be looking for something similar to maximize throughput.

@kevinGC
Collaborator

kevinGC commented Sep 25, 2024

Having looked (maybe too) quickly at verbs, it should be possible to support if my understanding is correct. Thoughts:

  • Infiniband verbs are probably a bunch of ioctls for their special character device. We can support this: we'd make our own virtual per-container/pod /dev/infiniband/uverbs0 that understands and safety-checks ioctls. We'd also have syscall filters specific to Infiniband (as we do for, e.g., GPUs).
  • Based on my super quick look at your links, I think libibverbs works by mapping in some shared memory for notification queues and packet data. This reminds me of XDP support, and so I think should work as well. We would need a link endpoint that speaks Infiniband verbs.
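
To make the "understands and safety-checks ioctls" idea from the first bullet concrete, here is a rough sketch in Go of decoding an ioctl request number and checking it against an allowlist. This is not how gVisor structures its device proxies or seccomp filters; it only illustrates the decode-then-allowlist step. The helper names (`ioc`, `decode`, `allowIoctl`) are hypothetical, and the magic value 0x1b with command number 1 is my recollection of `RDMA_IOCTL_MAGIC` and `RDMA_VERBS_IOCTL` from the Linux UAPI headers, so treat both as assumptions to verify.

```go
package main

import "fmt"

// Linux ioctl request layout on most architectures (asm-generic/ioctl.h):
// bits 0-7 command number, bits 8-15 type ("magic"),
// bits 16-29 argument size, bits 30-31 direction.
const (
	iocNRBits   = 8
	iocTypeBits = 8
	iocSizeBits = 14

	iocTypeShift = iocNRBits
	iocSizeShift = iocTypeShift + iocTypeBits
	iocDirShift  = iocSizeShift + iocSizeBits
)

// Assumed values: believed to match RDMA_IOCTL_MAGIC and the command
// number of RDMA_VERBS_IOCTL, but not checked against a kernel tree here.
const (
	rdmaIoctlMagic = 0x1b
	rdmaVerbsNR    = 1
)

// ioctlKey identifies an ioctl by its type and command number.
type ioctlKey struct{ typ, nr uint32 }

// allowed is the illustrative allowlist; everything else is rejected.
var allowed = map[ioctlKey]bool{
	{rdmaIoctlMagic, rdmaVerbsNR}: true,
}

// ioc builds a request number from its four fields (like the C _IOC macro).
func ioc(dir, typ, nr, size uint32) uint32 {
	return dir<<iocDirShift | size<<iocSizeShift | typ<<iocTypeShift | nr
}

// decode splits a request number back into its four fields.
func decode(req uint32) (dir, typ, nr, size uint32) {
	nr = req & (1<<iocNRBits - 1)
	typ = (req >> iocTypeShift) & (1<<iocTypeBits - 1)
	size = (req >> iocSizeShift) & (1<<iocSizeBits - 1)
	dir = req >> iocDirShift
	return
}

// allowIoctl reports whether the request's (type, nr) pair is allowlisted.
// A real proxy would also have to validate the argument payload itself.
func allowIoctl(req uint32) bool {
	_, typ, nr, _ := decode(req)
	return allowed[ioctlKey{typ, nr}]
}

func main() {
	// dir=3 means read+write; the size 24 is only an illustrative payload size.
	verbs := ioc(3, rdmaIoctlMagic, rdmaVerbsNR, 24)
	other := ioc(3, rdmaIoctlMagic, 2, 24)
	fmt.Printf("%#x allowed: %v\n", verbs, allowIoctl(verbs))
	fmt.Printf("%#x allowed: %v\n", other, allowIoctl(other))
}
```

Keying the allowlist on (type, nr) rather than the full request number keeps the check independent of the argument size, which can differ between kernel versions; whether that is the right trade-off for a real implementation is an open question.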

While the path to implementation seems reasonably clear, this is a significant chunk of work. The implementer would need to understand Infiniband verbs. I think we'd accept a PR for it, but for now it's not on the roadmap.

@ekzhang
Contributor

ekzhang commented Sep 26, 2024

Sounds good, thank you for sharing your thoughts on the tractability of this!
