-
Thrust does not expose block size or shared-memory allocations to users; these are hardcoded for each algorithm and architecture. You'll need to write a custom CUDA kernel outside of Thrust if you want this level of control.
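For illustration, here is a minimal sketch of what such a custom kernel gives you: both the block size and the dynamic shared-memory size are explicit launch parameters (the kernel name, sizes, and the scaling operation below are hypothetical, just standing in for whatever the functor would do):

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: each block stages its slice of `in` through
// dynamically sized shared memory before writing a scaled copy to `out`.
__global__ void scale_kernel(const float* in, float* out, int n, float factor)
{
    extern __shared__ float tile[];  // size chosen at launch time
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        tile[threadIdx.x] = in[i];
        __syncthreads();
        out[i] = tile[threadIdx.x] * factor;
    }
}

int main()
{
    const int n = 1 << 20;
    float *in, *out;
    cudaMalloc(&in,  n * sizeof(float));
    cudaMalloc(&out, n * sizeof(float));

    // Both knobs Thrust hides are explicit here: threads per block,
    // and the dynamic shared-memory size (third launch parameter).
    const int block = 256;
    const int grid  = (n + block - 1) / block;
    const size_t shmem = block * sizeof(float);
    scale_kernel<<<grid, block, shmem>>>(in, out, n, 2.0f);
    cudaDeviceSynchronize();

    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

The third argument inside `<<< >>>` is the per-block dynamic shared-memory size in bytes, which `extern __shared__` picks up inside the kernel.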
-
Is there a way to customize the kernel launch parameters for Thrust algorithms? `thrust::for_each` always launches 512 CUDA threads per block. Is this something the user can customize for performance tuning?

Also related to launch parameters, but possibly a new topic entirely: is it possible to use shared memory in a functor passed to `thrust::for_each`, and if so, is dynamic shared memory possible and how is its size specified?