Add internal wrapper for cuda driver APIs #2070
Conversation
🟩 CI finished in 2h 47m: Pass: 100%/56 | Total: 2h 35m | Avg: 2m 46s | Max: 11m 50s | Hits: 90%/1693

|     | Project             |
| --- | ------------------- |
|     | CCCL Infrastructure |
|     | libcu++             |
|     | CUB                 |
|     | Thrust              |
| +/- | CUDA Experimental   |
|     | pycuda              |

Modifications in project or dependencies?

|     | Project             |
| --- | ------------------- |
|     | CCCL Infrastructure |
|     | libcu++             |
|     | CUB                 |
|     | Thrust              |
| +/- | CUDA Experimental   |
| +/- | pycuda              |

🏃 Runner counts (total jobs: 56)

| #  | Runner                        |
| -- | ----------------------------- |
| 41 | linux-amd64-cpu16             |
| 9  | linux-amd64-gpu-v100-latest-1 |
| 4  | linux-arm64-cpu16             |
| 2  | windows-amd64-cpu16           |
if (status != CUDA_SUCCESS)
{
  ::cuda::__throw_cuda_error(static_cast<cudaError_t>(status), err_msg);
}
Do we want something like _CCCL_TRY_CUDA_API?
It could also be a function.
This function should be more or less equivalent to _CCCL_TRY_CUDA_API; am I missing some key difference here? I would have no issue turning it into a macro instead if that's preferred.
I believe a function is "cleaner" than a macro, but the macro cannot go away, as we cannot depend on cudax. Otherwise we would need to move the function into libcu++.
We might need two separate functions/macros, because the driver API returns CUresult while the runtime returns cudaError_t.
But these have the same values, so maybe we can add a cast to _CCCL_TRY_CUDA_API and remove this function 🤔
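To make the cast idea above concrete, here is a minimal, self-contained sketch. The helper names are hypothetical, and std::runtime_error stands in for ::cuda::__throw_cuda_error; this is not the actual _CCCL_TRY_CUDA_API definition.

```cpp
// Hypothetical sketch: one checker that accepts both the driver's CUresult and
// the runtime's cudaError_t by casting, relying on the (assumed) fact that the
// error codes share numeric values.
#include <cuda.h>              // CUresult, CUDA_SUCCESS
#include <cuda_runtime_api.h>  // cudaError_t, cudaGetErrorString
#include <stdexcept>
#include <string>

// Stand-in for ::cuda::__throw_cuda_error; name and behavior are illustrative only.
[[noreturn]] inline void throw_cuda_error(cudaError_t status, const char* msg)
{
  throw std::runtime_error(std::string{msg} + ": " + cudaGetErrorString(status));
}

// Runtime-API overload.
inline void check(cudaError_t status, const char* msg)
{
  if (status != cudaSuccess)
  {
    throw_cuda_error(status, msg);
  }
}

// Driver-API overload: cast CUresult to cudaError_t and reuse the same error path.
inline void check(CUresult status, const char* msg)
{
  if (status != CUDA_SUCCESS)
  {
    check(static_cast<cudaError_t>(status), msg);
  }
}
```

An overload set like this would let driver and runtime call sites share a single error path, assuming the CUresult and cudaError_t values really do line up as noted above.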
{
  static auto driver_fn = CUDAX_GET_DRIVER_FUNCTION(cuCtxPushCurrent);
  call_driver_fn(driver_fn, "Failed to push context", ctx);
}
why are we dynamically loading these functions instead of including <cuda.h> and linking to libcuda?
We would otherwise need to require the -lcuda compilation flag. This is more in line with the current CUDA runtime, which does not require that flag. There are compatibility reasons why the current CUDA runtime does it this way, and we probably want the same thing.
Directly linking to libcuda.so means that any consuming library would only run on machines with the CUDA driver installed. Any application with runtime logic to dispatch to CUDA vs. CPU based on hardware support would then fail to load when launched on a machine without the CUDA driver.
From a build-engineering standpoint, linking to libcuda.so should never happen.
cool thanks. i knew there must be a reason. TIL
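For readers following along, here is a rough sketch of the dynamic-loading technique being discussed: the driver symbol is resolved at run time through the CUDA runtime's cudaGetDriverEntryPoint, so the binary links against cudart but never against libcuda.so. The helper names below are made up for illustration and are not the PR's CUDAX_GET_DRIVER_FUNCTION implementation.

```cpp
// Illustrative only: resolve a driver API symbol through the CUDA runtime at
// run time. Including <cuda.h> is needed only for the driver types; no symbol
// from libcuda.so is referenced directly, so no -lcuda link is required.
#include <cuda.h>              // CUresult, CUcontext, CUDA_SUCCESS
#include <cuda_runtime_api.h>  // cudaGetDriverEntryPoint
#include <stdexcept>

template <class Fn>
Fn get_driver_fn(const char* symbol) // hypothetical helper name
{
  void* fn = nullptr;
  // Available since CUDA 11.3; asks the runtime (and thus the installed driver)
  // for the address of the requested driver entry point.
  if (cudaGetDriverEntryPoint(symbol, &fn, cudaEnableDefault) != cudaSuccess || fn == nullptr)
  {
    throw std::runtime_error("Failed to load CUDA driver symbol");
  }
  return reinterpret_cast<Fn>(fn);
}

CUcontext current_context()
{
  using cuCtxGetCurrent_t = CUresult (*)(CUcontext*);
  static auto driver_fn   = get_driver_fn<cuCtxGetCurrent_t>("cuCtxGetCurrent");

  CUcontext ctx = nullptr;
  if (driver_fn(&ctx) != CUDA_SUCCESS)
  {
    throw std::runtime_error("Failed to query the current context");
  }
  return ctx;
}
```

With this scheme, a missing driver surfaces as a runtime error at the call site rather than a load failure for the whole application, which is the property described in the comments above.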
cudax/test/CMakeLists.txt (Outdated)

@@ -57,4 +57,8 @@ foreach(cn_target IN LISTS cudax_TARGETS)
  launch/configuration.cu
)
target_compile_options(${test_target} PRIVATE $<$<COMPILE_LANG_AND_ID:CUDA,NVIDIA>:--extended-lambda>)

Cudax_add_catch2_test(test_target misc_tests ${cn_target}
- Cudax_add_catch2_test(test_target misc_tests ${cn_target}
+ cudax_add_catch2_test(test_target misc_tests ${cn_target}
🟩 CI finished in 2h 28m: Pass: 100%/56 | Total: 2h 31m | Avg: 2m 42s | Max: 12m 40s | Hits: 97%/2408

|     | Project             |
| --- | ------------------- |
|     | CCCL Infrastructure |
|     | libcu++             |
|     | CUB                 |
|     | Thrust              |
| +/- | CUDA Experimental   |
|     | pycuda              |

Modifications in project or dependencies?

|     | Project             |
| --- | ------------------- |
|     | CCCL Infrastructure |
|     | libcu++             |
|     | CUB                 |
|     | Thrust              |
| +/- | CUDA Experimental   |
| +/- | pycuda              |

🏃 Runner counts (total jobs: 56)

| #  | Runner                        |
| -- | ----------------------------- |
| 41 | linux-amd64-cpu16             |
| 9  | linux-amd64-gpu-v100-latest-1 |
| 4  | linux-arm64-cpu16             |
| 2  | windows-amd64-cpu16           |
* Add a header to interact with driver APIs
* Add a test for the driver API interaction
* Format
* Fix formatting
Adds an internal header that loads CUDA driver API functions through the CUDA runtime.
It also adds the first few entries needed for current-context management.
Each new wrapper should be a function that loads the driver entry point with CUDAX_GET_DRIVER_FUNCTION and then calls it with the proper arguments.
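As a hedged illustration of that pattern, a hypothetical additional wrapper might look like the sketch below. cuCtxPopCurrent is chosen only as an example, and the exact signatures of CUDAX_GET_DRIVER_FUNCTION and call_driver_fn in the PR may differ.

```cpp
// Sketch of the wrapper pattern, modeled on the snippet reviewed above.
// CUDAX_GET_DRIVER_FUNCTION and call_driver_fn come from the new internal
// header; their precise signatures here are assumed for illustration.
void ctxPop()
{
  // Resolve cuCtxPopCurrent once and cache the pointer for subsequent calls.
  static auto driver_fn = CUDAX_GET_DRIVER_FUNCTION(cuCtxPopCurrent);
  CUcontext ctx;
  // call_driver_fn invokes the loaded entry point and throws on a non-success CUresult.
  call_driver_fn(driver_fn, "Failed to pop context", &ctx);
}
```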