Q about CUDA API:cudaHostAlloc #373
-
I'm trying to use cudaHostAlloc to allocate space shared between the GPU and CPU. Does anyone know the difference between cudaHostAllocMapped and cudaHostAllocDefault? The documentation says that with cudaHostAllocDefault you need cudaMemcpy to copy data from host to device, but I did not use cudaMemcpy and CUDA still computed the right result. What is the reason? Thank you. (We transfer the dst to a cv::Mat result and get the right result.)
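To make the two flags concrete, here is a minimal sketch (buffer names and sizes are my own, not from the question) contrasting the two paths: with cudaHostAllocDefault the kernel works on a separate device buffer and an explicit cudaMemcpy is needed, while with cudaHostAllocMapped the same pinned allocation is mapped into the device address space and the kernel can touch it directly.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main() {
    const int n = 256;
    const size_t bytes = n * sizeof(float);

    // Path 1: cudaHostAllocDefault -- pinned host memory, but the kernel
    // operates on a separate device buffer, so explicit copies are needed.
    float *h_buf, *d_buf;
    cudaHostAlloc(&h_buf, bytes, cudaHostAllocDefault);
    cudaMalloc(&d_buf, bytes);
    for (int i = 0; i < n; ++i) h_buf[i] = 1.0f;
    cudaMemcpy(d_buf, h_buf, bytes, cudaMemcpyHostToDevice);
    scale<<<(n + 255) / 256, 256>>>(d_buf, n);
    cudaMemcpy(h_buf, d_buf, bytes, cudaMemcpyDeviceToHost);

    // Path 2: cudaHostAllocMapped -- the same pinned allocation is mapped
    // into the device address space; the kernel reads/writes it in place.
    float *h_mapped, *d_alias;
    cudaHostAlloc(&h_mapped, bytes, cudaHostAllocMapped);
    cudaHostGetDevicePointer(&d_alias, h_mapped, 0);
    for (int i = 0; i < n; ++i) h_mapped[i] = 1.0f;
    scale<<<(n + 255) / 256, 256>>>(d_alias, n);
    cudaDeviceSynchronize();  // no copy back: the host sees the result directly
    printf("%f %f\n", h_buf[0], h_mapped[0]);

    cudaFree(d_buf);
    cudaFreeHost(h_buf);
    cudaFreeHost(h_mapped);
    return 0;
}
```

Note that on platforms with unified virtual addressing, pinned memory allocated with cudaHostAllocDefault may also be directly accessible from the device, which is consistent with the behavior described in the question.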
-
`cudaHostAllocMapped` is automatically set if you only use CUDA runtime APIs (or CUDA driver APIs with explicit use of the primary context), since the primary context has this flag set by default on devices supporting address mapping.

With address mapping, CUDA kernels can directly access host pinned memory without an extra copy to device memory, which explains what you saw. This is possible because the memory is pinned/page-locked: the OS guarantees that no page swapping happens during the lifetime of the allocation while the kernel is doing its work, so reading and writing this memory from the device is safe.

There are a few potential issues (or benefits, depending on your problem needs).
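The primary-context behavior described above can be checked explicitly. This is a minimal sketch (device index 0 assumed): query whether the device supports mapped pinned memory, and, in code that creates its own context, request mapping before the context comes up. With the runtime API's primary context this flag is normally already set, so the call below is only needed when you manage the flags yourself.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    // 1 if the device can map host pinned memory into its address space
    printf("canMapHostMemory: %d\n", prop.canMapHostMemory);

    // Must be called before the context is created; returns an error
    // (cudaErrorSetOnActiveProcess) if the context is already active.
    cudaError_t err = cudaSetDeviceFlags(cudaDeviceMapHost);
    printf("cudaSetDeviceFlags: %s\n", cudaGetErrorString(err));
    return 0;
}
```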