[CudaIpc 2/3]: Ipc handle exchange #3910
base: add_backend_type_to_p2p_comm
Conversation
!test
```cpp
    storage_offset_(tensor.storage_offset()),
    element_size_(tensor.element_size()),
    rank_(Communicator::getInstance().deviceId()) {
  NVFUSER_CUDA_RT_SAFE_CALL(
```
assert that the tensor is not strided
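A minimal sketch of the suggested check, assuming `at::Tensor::is_contiguous()` and nvFuser's `NVF_ERROR` macro (the exact assertion adopted in the PR may differ):

```cpp
// Hypothetical placement inside the IpcHandle constructor; the message
// text is illustrative.
NVF_ERROR(
    tensor.is_contiguous(),
    "IpcHandle expects a contiguous (non-strided) tensor.");
```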
```cpp
  std::unordered_map<KeyType, std::unique_ptr<P2pIpcHandle>, KeyHash, KeyEqual>
      handles_;
  std::unordered_set<std::string> keys_;
```
remove `keys_` (unnecessary)
```cpp
 private:
  using KeyType = std::tuple<int64_t, at::Tensor, P2PCommunication*>;
```
maybe we don't need `P2PCommunication*` here
We actually need it in the following case: rank 0 sends a buffer to rank 1's `buffer1`, and concurrently, rank 0 sends the same buffer to rank 1's `buffer2`.
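To make the collision concrete, here is a hedged toy sketch (the `Tensor` and `P2PCommunication` structs below are stand-ins, not the real `at::Tensor` or `Expr` node): with a key of only `(peer, tensor)`, the two concurrent sends above would share one cache slot, while the third tuple element keeps them distinct.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

struct Tensor {};           // stand-in for at::Tensor
struct P2PCommunication {};  // stand-in for the communication Expr

using KeyType = std::tuple<int64_t, const Tensor*, const P2PCommunication*>;

int main() {
  Tensor src;                     // rank 0's single source buffer
  P2PCommunication commA, commB;  // two concurrent sends to rank 1

  // Both sends use the same (peer = 1, src) pair, even though rank 1
  // receives into two different buffers and needs two distinct handles.
  KeyType a{1, &src, &commA};
  KeyType b{1, &src, &commB};
  assert(a != b);  // the communication pointer disambiguates the entries
}
```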
Description

On top of:
Prerequisite to:
What

This PR introduces a new `Expr` node, `hir::ShareMemHandles`, to represent this op. We cannot embed the op in the Send/Recv semantics because we need to group the handle exchange between matching sends and recvs to avoid deadlocks.
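A hedged illustration of the deadlock concern: if each Send/Recv performed its own blocking handle exchange, a symmetric pattern such as rank 0 running `send(to=1); recv(from=1);` while rank 1 runs `send(to=0); recv(from=0);` could block both ranks inside the first exchange. Grouping the exchanges into a single `hir::ShareMemHandles` step lets them run in an order both sides agree on. The helper below shows one classic symmetric-ordering trick, not necessarily nvFuser's actual mechanism:

```cpp
#include <cstdint>

// Illustrative only: both ranks evaluate this identically, so their
// blocking export/import calls pair up instead of waiting on each other.
// The lower rank exports (sends its handle) first; the higher rank
// imports (receives) first.
bool exportFirst(int64_t my_rank, int64_t peer_rank) {
  return my_rank < peer_rank;
}
```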
How

Most of the implementation is in `multidevice/ipc_handle.cpp`, which introduces:
- `IpcHandle`, representing the IPC handle that is exchanged. This class is supplemented with a semaphore, which is a local CUDA buffer allocated on the exporter's device.
- `IpcHandleCache`, which handles exchanging and caching the IPC handles. Caching is done with respect to a combination of runtime and symbolic ingredients: `(runtime peer, at::Tensor, Expr*)`. This caching allows an arbitrary number of p2p comms between pairs of ranks.
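For context, a minimal sketch of the CUDA runtime calls that underlie such an IPC handle exchange (error handling elided; the PR wraps these calls in `NVFUSER_CUDA_RT_SAFE_CALL`, and the transport that moves the handle bytes between ranks is outside this sketch):

```cpp
#include <cuda_runtime.h>

// Exporter side: create an opaque, shareable handle for a device allocation.
cudaIpcMemHandle_t exportHandle(void* dev_ptr) {
  cudaIpcMemHandle_t handle;
  cudaIpcGetMemHandle(&handle, dev_ptr);
  return handle;  // these bytes are what gets sent to the peer rank
}

// Importer side: map the peer's allocation into this process's address space.
void* importHandle(cudaIpcMemHandle_t handle) {
  void* dev_ptr = nullptr;
  cudaIpcOpenMemHandle(&dev_ptr, handle, cudaIpcMemLazyEnablePeerAccess);
  return dev_ptr;  // released later with cudaIpcCloseMemHandle(dev_ptr)
}
```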