Busy-wait inside cudaDeviceSynchronize while calling Thrust algorithms #1053
-
Hello. Not sure if this is the right place to ask a question, but I've inherited an app to profile on x64 Linux, which makes use of Thrust (specifically […]). Is there some way of telling Thrust to sleep the CPU thread instead while waiting for the result from the GPU? I've read the docs I could find but couldn't spot anything about making this choice. Does Thrust just use the default CUDA stream? Perhaps there's a way to set a flag globally on that? I'm familiar with GPU programming but haven't much experience with CUDA yet, so I might be missing something.
Replies: 0 comments 3 replies
-
The wait behavior is controlled by `cudaSetDeviceFlags` (docs). The `cudaDeviceScheduleYield` flag should do what you're looking for.

Thrust uses the default stream by default, but this can be changed by passing `thrust::device.on(stream)` as the first argument to the algorithm, where `stream` is a `cudaStream_t`.

If you want more control over synchronization, you may also want to look at `cub::DeviceScan::ExclusiveScan` (docs). While Thrust algorithms are inherently synchronous, the CUB algorithms provide the user with more flexibility in this regard.
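
For anyone landing here later, the first two suggestions could be combined roughly as below. This is only a sketch under a few assumptions: the helper names `prefer_yielding_waits` and `scan_on_stream` are made up for illustration, `thrust::inclusive_scan` is just a stand-in since the thread does not say which algorithm the app calls, and the input size is arbitrary. It also assumes the CUDA backend of Thrust and that the flag is set early, before other CUDA work on the device, since some toolkit versions reject the call once the device has been initialized.

```cuda
#include <cstdio>

#include <cuda_runtime.h>
#include <thrust/device_vector.h>
#include <thrust/execution_policy.h>
#include <thrust/scan.h>

// Hypothetical helper (not part of CUDA or Thrust): choose how the host
// thread waits for the GPU on the current device.
inline cudaError_t prefer_yielding_waits() {
    // cudaDeviceScheduleYield asks the runtime to yield the CPU thread while
    // waiting. cudaDeviceScheduleBlockingSync is the related flag that blocks
    // the thread on a synchronization primitive instead.
    return cudaSetDeviceFlags(cudaDeviceScheduleYield);
}

// Hypothetical helper: run an in-place inclusive scan on a caller-supplied
// stream instead of the default stream.
inline void scan_on_stream(thrust::device_vector<int>& data, cudaStream_t stream) {
    thrust::inclusive_scan(thrust::device.on(stream),
                           data.begin(), data.end(), data.begin());
}

int main() {
    // Set the wait policy first, before any other CUDA work.
    cudaError_t err = prefer_yielding_waits();
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaSetDeviceFlags failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaStream_t stream;
    err = cudaStreamCreate(&stream);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaStreamCreate failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    {
        // Arbitrary example input: 2^20 ones. The inner scope makes sure the
        // vector is freed before the stream is destroyed.
        thrust::device_vector<int> data(1 << 20, 1);
        scan_on_stream(data, stream);

        // Explicit wait on the stream; the host wait policy set above applies
        // to synchronization calls like this one.
        cudaStreamSynchronize(stream);

        int last = data.back();  // copies one element back to the host
        std::printf("last element after scan: %d\n", last);
    }

    cudaStreamDestroy(stream);
    return 0;
}
```

If the goal is specifically to have the thread sleep rather than yield, it may be worth profiling with `cudaDeviceScheduleBlockingSync` in place of `cudaDeviceScheduleYield` and comparing CPU usage between the two.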