Would it be possible to convert the cudaMallocPitch calls to cudaMalloc? I understand why cudaMallocPitch was chosen, but the limitations it works around are less noticeable today, given larger cache sizes.
The main driver for this enhancement is better interoperability with DALI. DALI loads batches of images using cudaMalloc because it is not concerned with what is being loaded, which could be something other than an image.
Currently, each image must be copied from its cudaMalloc location to a new cudaMallocPitch location. If CudaSift used cudaMalloc, the operation could be performed in place, which would save both memory and time.
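To make the cost concrete, here is a minimal sketch of the extra copy described above. It assumes a single-channel float image stored tightly packed in a cudaMalloc buffer. The helper name `copyPackedToPitched` and its parameters are hypothetical and not part of CudaSift or DALI; only the CUDA runtime calls are real.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Hypothetical helper (not CudaSift's API): allocates a pitched buffer and
// copies a tightly packed device image into it, row by row.
// On success, *d_pitched owns the new allocation and *pitchBytes holds the
// row stride chosen by the driver.
cudaError_t copyPackedToPitched(const float* d_packed, int width, int height,
                                float** d_pitched, size_t* pitchBytes) {
    // A packed (cudaMalloc) image has a row stride of exactly width * sizeof(float).
    const size_t rowBytes = static_cast<size_t>(width) * sizeof(float);

    // Second allocation: this is the extra memory the issue refers to.
    cudaError_t err = cudaMallocPitch(reinterpret_cast<void**>(d_pitched),
                                      pitchBytes, rowBytes,
                                      static_cast<size_t>(height));
    if (err != cudaSuccess) return err;

    // Device-to-device 2D copy: this is the extra time the issue refers to.
    err = cudaMemcpy2D(*d_pitched, *pitchBytes,   // destination and its pitch
                       d_packed, rowBytes,        // source and its (packed) pitch
                       rowBytes, static_cast<size_t>(height),
                       cudaMemcpyDeviceToDevice);
    if (err != cudaSuccess) {
        cudaFree(*d_pitched);
        *d_pitched = nullptr;
    }
    return err;
}
```

Note that a packed buffer is just a pitched buffer whose pitch equals `width * sizeof(float)`. Code that takes the pitch as an explicit parameter could therefore, in principle, read a cudaMalloc buffer directly without this copy, at the cost of possibly unaligned row starts.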