I've found that ib_write_lat doesn't support CUDA mode.
I wonder whether there is any intrinsic issue that prevents supporting it.
I don't think it is a CUDA issue, because the NCCL library uses IB write with GPUs.
If there isn't a big obstacle, I can help draft a PR to fix this.
Can you share your PR link?
I removed the error exit and tried to run it on an A100, but it crashes.
gdb showed that the memory being accessed is not host memory, so it could be a CUDA memory issue.