Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial changes to support cufile stream I/O. #259

Merged
merged 8 commits into from
Aug 22, 2023

Conversation

tell-rebanta
Copy link
Contributor

No description provided.

@tell-rebanta tell-rebanta requested review from a team as code owners July 31, 2023 17:51
@rapids-bot
Copy link

rapids-bot bot commented Jul 31, 2023

Pull requests from external contributors require approval from a rapidsai organization member with write permissions or greater before CI can begin.

@quasiben
Copy link
Member

/ok to test

@madsbk madsbk added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels Aug 1, 2023
Copy link
Member

@madsbk madsbk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tell-rebanta. Overall, I think it looks good, but I am wondering if it makes sense to de-couple the CUDA stream from StreamHandle?

As it is now, you are forced to create a stream per file, but I think a common use case is to read/write to multiple files using the same stream.

What about implementing read_async() and write_async() in the existing class FileHandle instead? read_async() and write_async() could then take a CUstream as argument.

Side note:

To check and fix the check-style run:

pre-commit run --all-file

cpp/CMakeLists.txt Outdated Show resolved Hide resolved

kvikio::StreamHandle s_handle_wr("/tmp/test-file", "w", kvikio::StreamHandle::m644, true);
check(cudaMemcpy(a_dev, a, SIZE, cudaMemcpyHostToDevice) == cudaSuccess);
kvikio::buffer_register(a_dev, SIZE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to register the a buffer in order to use it with stream IO?

 - Moved async I/O APIs to file_handle and made stream_handle to be a
   derived class of file_handle.
 - Async I/O can be invoked either through file_handle or stream_handle.
 - Addeed basic I/O examples for both the ways.
 - Removed unnecessary logs in CMakeLists.txt.
@madsbk madsbk changed the base branch from branch-23.08 to branch-23.10 August 16, 2023 09:00
Copy link
Member

@madsbk madsbk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tell-rebanta, I like that the stream is now an explicit argument for read_async and write_async!
We need to guard the stream API using KVIKIO_CUFILE_STREAM_API_FOUND and I suggest that we exclude StreamHandle from this PR.
Let's discuss how to design the stream registration in a follow-up PR (I think we need to de-couple it from the file handle).

cpp/include/kvikio/file_handle.hpp Show resolved Hide resolved
cpp/include/kvikio/file_handle.hpp Show resolved Hide resolved
cpp/include/kvikio/stream.hpp Outdated Show resolved Hide resolved
@madsbk
Copy link
Member

madsbk commented Aug 16, 2023

/ok to test

Will be added as part of a separate check-in.
@madsbk
Copy link
Member

madsbk commented Aug 17, 2023

/ok to test

cpp/include/kvikio/file_handle.hpp Outdated Show resolved Hide resolved
cpp/include/kvikio/file_handle.hpp Outdated Show resolved Hide resolved
cpp/include/kvikio/file_handle.hpp Outdated Show resolved Hide resolved
* @param size Size in bytes to read.
* @param file_offset Offset in the file to read from.
* @param devPtr_offset Offset relative to the `devPtr_base` pointer to read into.
* This parameter should be used only with registered buffers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to distinguish registered buffers in the type system at the moment? That would be preferable over this void * C-like interface.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, not ATM but it would be a cool addition: #266

@madsbk
Copy link
Member

madsbk commented Aug 18, 2023

/ok to test

Copy link
Member

@madsbk madsbk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove the CUDA stream API

cpp/include/kvikio/shim/cuda.hpp Outdated Show resolved Hide resolved
cpp/include/kvikio/shim/cuda.hpp Outdated Show resolved Hide resolved
@madsbk
Copy link
Member

madsbk commented Aug 21, 2023

/ok to test

@madsbk
Copy link
Member

madsbk commented Aug 21, 2023

Thanks @wence-, added you suggestions.

@madsbk
Copy link
Member

madsbk commented Aug 21, 2023

/ok to test

@madsbk
Copy link
Member

madsbk commented Aug 22, 2023

/merge

@rapids-bot rapids-bot bot merged commit 35424d2 into rapidsai:branch-23.10 Aug 22, 2023
27 checks passed
rapids-bot bot pushed a commit that referenced this pull request May 7, 2024
Hi there,

Thanks for this great repository! I want to use the cuFile async IO in my research project and noticed this kvikio repo. However, the initial support has been done in #259 and tracked in #204, but the Python interface hasn't been done yet. So I exported the write_async and read_async to the CuFile Python class and added test case. This will be very helpful for my project where I want to do the PyTorch training computation and simultaneously load tensors from the SSDs. I created this PR because hopefully, it could be helpful for your repository as well as keeping the Python interface current.

Please let me know your thoughts. Thank you.

Best Regards,
Kun

Authors:
  - Kun Wu (https://github.com/K-Wu)
  - Mads R. B. Kristensen (https://github.com/madsbk)

Approvers:
  - Mads R. B. Kristensen (https://github.com/madsbk)

URL: #376
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improves an existing functionality non-breaking Introduces a non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants