Initial changes to support cufile stream I/O. #259
Conversation
/ok to test
Thanks @tell-rebanta. Overall, I think it looks good, but I am wondering if it makes sense to de-couple the CUDA stream from StreamHandle?
As it is now, you are forced to create a stream per file, but I think a common use case is to read/write to multiple files using the same stream. What about implementing read_async() and write_async() in the existing class FileHandle instead? read_async() and write_async() could then take a CUstream as argument.
Side note: to check and fix the code style, run:
pre-commit run --all-files
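For illustration, here is a minimal sketch of what that suggested FileHandle-based API could look like from the caller's side; the read_async() signature and the file paths are assumptions made for this sketch, not the API as reviewed in this PR.

```cpp
// Sketch only: read_async()/write_async() live on FileHandle and take the CUDA
// stream explicitly, so a single stream can serve I/O on multiple files.
#include <kvikio/file_handle.hpp>

#include <cuda.h>
#include <cstddef>

void read_two_files_one_stream(void* dev_buf_a, void* dev_buf_b, std::size_t size, CUstream stream)
{
  kvikio::FileHandle file_a("/tmp/file-a", "r");
  kvikio::FileHandle file_b("/tmp/file-b", "r");

  // Both reads are enqueued on the same caller-owned stream
  // (assumed signature: read_async(devPtr, size, file_offset, devPtr_offset, stream)).
  file_a.read_async(dev_buf_a, size, 0, 0, stream);
  file_b.read_async(dev_buf_b, size, 0, 0, stream);

  // Completion is observed by synchronizing the stream.
  cuStreamSynchronize(stream);
}
```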
cpp/examples/basic_io.cpp (outdated):
kvikio::StreamHandle s_handle_wr("/tmp/test-file", "w", kvikio::StreamHandle::m644, true);
check(cudaMemcpy(a_dev, a, SIZE, cudaMemcpyHostToDevice) == cudaSuccess);
kvikio::buffer_register(a_dev, SIZE);
Do we need to register the a_dev buffer in order to use it with stream I/O?
- Moved the async I/O APIs to file_handle and made stream_handle a derived class of file_handle.
- Async I/O can be invoked either through file_handle or stream_handle.
- Added basic I/O examples for both ways.
- Removed unnecessary logs in CMakeLists.txt.
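As a rough illustration of the restructuring described in the commit message above, a sketch is shown below; the class name, constructor arguments, and the FileHandle::read_async signature it forwards to are assumptions based on this discussion, not the actual implementation.

```cpp
// Sketch: the async APIs live on FileHandle; a StreamHandle-style class derives
// from it and binds a CUDA stream so per-call stream arguments can be omitted.
#include <kvikio/file_handle.hpp>

#include <cuda.h>
#include <cstddef>
#include <string>
#include <sys/types.h>

class StreamHandleSketch : public kvikio::FileHandle {
 public:
  StreamHandleSketch(const std::string& path, const std::string& flags, CUstream stream)
    : kvikio::FileHandle(path, flags), _stream{stream}
  {
  }

  // Forwards to the base-class async API with the bound stream
  // (the FileHandle::read_async signature used here is assumed).
  void read_async(void* devPtr, std::size_t size, off_t file_offset = 0, off_t devPtr_offset = 0)
  {
    kvikio::FileHandle::read_async(devPtr, size, file_offset, devPtr_offset, _stream);
  }

 private:
  CUstream _stream;
};
```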
Thanks @tell-rebanta, I like that the stream is now an explicit argument for read_async and write_async!
We need to guard the stream API using KVIKIO_CUFILE_STREAM_API_FOUND, and I suggest that we exclude StreamHandle from this PR.
Let's discuss how to design the stream registration in a follow-up PR (I think we need to de-couple it from the file handle).
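For reference, the guarding could look roughly like the sketch below; KVIKIO_CUFILE_STREAM_API_FOUND is the macro named above, while the surrounding class fragment and the declaration itself are only an assumed example.

```cpp
// Sketch: only expose the stream (async) I/O API when the build detected the
// cuFile stream functions, so environments without them still compile.
#include <cuda.h>
#include <cstddef>
#include <sys/types.h>

class FileHandleSketch {
 public:
#ifdef KVIKIO_CUFILE_STREAM_API_FOUND
  // Assumed example declaration: enqueues the read on `stream` instead of blocking.
  void read_async(void* devPtr_base,
                  std::size_t size,
                  off_t file_offset,
                  off_t devPtr_offset,
                  CUstream stream);
#endif
  // ... the synchronous API remains available unconditionally ...
};
```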
/ok to test
Will be added as part of a separate check-in.
/ok to test
* @param size Size in bytes to read.
* @param file_offset Offset in the file to read from.
* @param devPtr_offset Offset relative to the `devPtr_base` pointer to read into.
* This parameter should be used only with registered buffers.
Is there a way to distinguish registered buffers in the type system at the moment? That would be preferable over this void* C-like interface.
No, not ATM but it would be a cool addition: #266
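Purely as a hypothetical sketch of the idea behind #266: a thin ownership type could mark registered buffers in the type system. The wrapper class, the header path, and the buffer_deregister counterpart are assumptions; only buffer_register appears in the code quoted earlier.

```cpp
// Hypothetical RegisteredBuffer: encode "this device memory is registered with
// cuFile" in the type system instead of passing a raw void* around.
#include <kvikio/buffer.hpp>  // assumed header for buffer_register/buffer_deregister

#include <cstddef>

class RegisteredBuffer {
 public:
  RegisteredBuffer(void* devPtr, std::size_t size) : _ptr{devPtr}, _size{size}
  {
    kvikio::buffer_register(_ptr, _size);
  }
  ~RegisteredBuffer() { kvikio::buffer_deregister(_ptr); }  // assumed counterpart of buffer_register
  RegisteredBuffer(const RegisteredBuffer&)            = delete;
  RegisteredBuffer& operator=(const RegisteredBuffer&) = delete;

  void* data() const noexcept { return _ptr; }
  std::size_t size() const noexcept { return _size; }

 private:
  void* _ptr;
  std::size_t _size;
};

// A read_async overload taking a RegisteredBuffer (plus devPtr_offset) could then
// be distinguished from the plain void* overload at compile time.
```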
/ok to test
I think we can remove the CUDA stream API
Co-authored-by: Lawrence Mitchell <[email protected]>
/ok to test
Thanks @wence-, added your suggestions.
/ok to test
/merge
Hi there,

Thanks for this great repository! I want to use the cuFile async I/O in my research project and noticed this kvikio repo. The initial support was added in #259 and is tracked in #204, but the Python interface hasn't been done yet. So I exported write_async and read_async to the CuFile Python class and added a test case. This will be very helpful for my project, where I want to run the PyTorch training computation and simultaneously load tensors from the SSDs.

I created this PR because, hopefully, it could be helpful for your repository as well as keeping the Python interface current. Please let me know your thoughts. Thank you.

Best Regards,
Kun

Authors:
- Kun Wu (https://github.com/K-Wu)
- Mads R. B. Kristensen (https://github.com/madsbk)

Approvers:
- Mads R. B. Kristensen (https://github.com/madsbk)

URL: #376