Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core/verbs: Add native API support for DMA-buf regions #9343

Merged
merged 3 commits into from
Sep 21, 2023
Merged

Conversation

shefty
Copy link
Member

@shefty shefty commented Sep 16, 2023

This expands the memory registration API to support passing in a DMA-buf region. DMA buf regions are specified using a struct fi_mr_dmabuf. The struct fi_mr_attr is updated to accept passing in the dmabuf structure as input, with a flag used to indicate that the registration is for a DMA-buf region.

The verbs provider is updated to support DMA-buf registration.

Changes are compile tested only. There's no test yet available to verify functionality.

@shefty shefty requested a review from j-xiong September 16, 2023 00:02
Comment on lines 413 to 416
return vrb_reg_dmabuf(domain, attr->dmabuf->fd,
attr->dmabuf->offset, attr->dmabuf->len,
vrb_mr_ofi2ibv_access(attr->access, domain),
mr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This bypasses the reg cache.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was intentional because I don't know how we deal with a dma-buf in the cache or if we need to.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good for now. In the long run, we may still want the registration be cached, probably by using a separate mr cache that use <fd, offset> instead of addr as the key.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that I don't know how you verify that the entry is still valid or eject it from the cache.

@shefty
Copy link
Member Author

shefty commented Sep 16, 2023

@j-xiong - It looks like we might be able to adapt fi_mr_reg_xe and fi_rdmabw-xe to verify the new registration call.

@j-xiong
Copy link
Contributor

j-xiong commented Sep 16, 2023

Yes, the existing component tests can be extended to have an option to use the new API.

@@ -740,6 +775,11 @@ The follow flag may be specified to any memory registration call.
API AllocHost call (as opposed to using malloc-like host allocation,
unified/shared memory allocation, or AllocDevice).

*FI_MR_DMABUF*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this also be a capability bit so that the application can check if this supported by the provider?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, and I had the same thought. Can the current FI_HMEM work, which would require providers support FI_HMEM to also support dmabuf registration? I'd like to avoid capability bit overload when possible. But I'm not sure if FI_HMEM is sufficient.

I was thinking that a provider could call mmap() to map the region into the VA space, then treat the dmabuf the same as a VA registered region. The provider may need to discover what hmem API was used to allocate the region. Is there an issue if the provider were forced down this path? IOW, dmabuf support becomes an optimization that the provider could leverage, otherwise it needs to fallback to whatever its standard HMEM support is.

DMA-buf support is handled internally by some providers.  However, there's
no mechanism for an application to pass dma-buf mapped buffers directly
to them.  Add this to the MR API using the HMEM infrastructure.

Signed-off-by: Sean Hefty <[email protected]>
@shefty
Copy link
Member Author

shefty commented Sep 18, 2023

Updated: I changed the value of FI_MR_DMABUF, so that it could be defined as a capability bit in the future, if needed. However, I left it defined only as a memory registration flag. I added a requirement that the flag requires FI_HMEM support.

Copy link
Contributor

@j-xiong j-xiong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good enough for starting to write tests.

@shefty
Copy link
Member Author

shefty commented Sep 18, 2023

@j-xiong - I added an update to fi-mr-reg-xe. It's not tested (I couldn't even build that on my system, so I need to move my patches.) The Intel CI passed, but I'm not sure that test is built or run. Only verbs supports the new API directly, but it looks like rxm should pass the call through okay.

@j-xiong
Copy link
Contributor

j-xiong commented Sep 18, 2023

The test passed:

$ ./fi_xe_mr_reg -m device -S 1048576
Using first driver: 0x2fc8950 (total >= 1)
using GPU device 0: 0x2d89fd0 (total 2)
Using OFI device: verbs;ofi_rxm (mlx5_0)
mr 0x30c4dc0, buf 0xff007fffffef0000, rkey 0x175916, len 1048576
mr 0x30c4dc0, buf 0xff007fffffef0000, rkey 0x175916, len 1048576

Comment on lines 194 to 195
printf("mr %p, buf %p, rkey 0x%lx, len %zd\n",
mr, buf, fi_mr_key(mr), buf_size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should print out dmabuf_mr instead of mr, same for the mr key.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - updated. And thanks for testing. :)

@j-xiong
Copy link
Contributor

j-xiong commented Sep 18, 2023

The rkey doen't look right:

 ./fi_xe_mr_reg -m device -S 1048576
Using first driver: 0x2bda950 (total >= 1)
using GPU device 0: 0x299bfd0 (total 2)
Using OFI device: verbs;ofi_rxm (mlx5_0)
mr 0x2d034e0, buf 0xff007fffffef0000, rkey 0x1738f7, len 1048576
mr 0x2d03690, buf 0xff007fffffef0000, rkey 0x0, len 1048576

@shefty
Copy link
Member Author

shefty commented Sep 18, 2023

I see the issue. The libfabric fid_mr wasn't initialized after the register call returned.

@shefty
Copy link
Member Author

shefty commented Sep 19, 2023

I need to either go through vrb_mr_reg_common() or duplicate the functionality after ibv_reg_mr().

Support callers using dma-buf registatration directly.

Signed-off-by: Sean Hefty <[email protected]>
Add direct testing of dmabuf memory registration.

Signed-off-by: Sean Hefty <[email protected]>
@shefty
Copy link
Member Author

shefty commented Sep 20, 2023

Restructured the verbs changes to go through vrb_mr_reg_common(), so that the memory region is properly initialized.

@j-xiong
Copy link
Contributor

j-xiong commented Sep 20, 2023

Works fine now:

$ ./fi_xe_mr_reg -m device -S 1048576
Using first driver: 0x27c9950 (total >= 1)
using GPU device 0: 0x258afd0 (total 2)
Using OFI device: verbs;ofi_rxm (mlx5_0)
mr 0x28df930, buf 0xff007fffffef0000, rkey 0x1726e5, len 1048576
mr 0x28f2450, buf 0xff007fffffef0000, rkey 0x17f6b5, len 1048576

@shefty
Copy link
Member Author

shefty commented Sep 21, 2023

bot:aws:retest

@shefty shefty merged commit d8e987a into ofiwg:main Sep 21, 2023
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants