RMA WG 01 25 2021
Attendees: Manju, Dave, Khaled, Wasi, Nick, Naveen, Min
- Go over remaining active topics and discuss & update plan
  - Signal
    - Naveen: Atomic implementation concerns. Performance of `put_with_signal` can degrade if `signal` is given the same atomicity as `put_with_signal`, because the two then have to be transferred via the same EP. The user (or library) then has to explicitly call `fence` before `signal` to ensure ordering with previous RMA/AMO operations.
    - Manju: Need to define the semantics of `signal`: (1) What does `fence` mean? (2) How does `signal` differ from an AMO? (3) How does it help GPU communication? (4) We now have three atomic modes: AMO with the `SHMEM_TEAM_WORLD` domain, AMO with the `SHMEM_TEAM_SHARED` domain, and `put_with_signal`. Will `signal` be the fourth?
    - Khaled: A conditional check on the GPU will cause suboptimal performance if `signal` is replaced with `put_with_signal(0 byte)`.
    - Nick: Consider a released memory ordering model with `signal`? E.g., if the user application already guarantees ordering between `signal` and the previous RMA/AMO, the library does not have to call `fence` for ordering.
    - Next step: Follow up the discussion on ticket #382.
  - GPU
    - Khaled: Have discussed with Jim. May consider two aspects:
      - What functions are needed for GPU to match the existing API:
        - CPU prepared: `stream triggered`, `kernel triggered`
        - GPU prepared: `kernel initiated`
      - Multilevel memory model on GPU
    - Manju: `event triggered` == `{stream|kernel} triggered`
      - Trying to decouple invocation|execution semantics
      - May be useful semantics for upcoming (current) smartNIC
    - Min: `event triggered` might be similar to the OMP task model, allowing the user to define dependencies (e.g., ordering of streams).
    - Khaled: How to define the new semantics?
      - If we reuse existing APIs (e.g., adding an event parameter to `put`), it will be hard for users to follow, since the same set of APIs would carry two semantics.
      - If we do not reuse them, there will be too many extensions.
    - Min: Define two distinct models? One for regular and the other for event-triggered?
    - Khaled: What if I want to use the event and regular models at the same time (e.g., both GPU and CPU communicate)?
    - Wasi: We can treat the CPU as a device too.
- Go over the remaining active topics and discuss any specific item
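
Illustration for the Signal discussion above: a minimal sketch of the two places the ordering cost could live, either inside the signal operation (the library always fences) or with the caller (the signal is unordered and the user fences only when needed). This is a shared-memory analogy built on C11 atomics, not OpenSHMEM code. Every name in it (`payload`, `sig_word`, `signal_ordered`, `signal_unordered`, and the `send_*` / `wait_*` helpers) is hypothetical and only stands in for a put target, a signal word, and the operations under discussion.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for memory on the target side. */
static int payload;                /* data delivered by an earlier "put" */
static _Atomic uint64_t sig_word;  /* signal word the target waits on    */

/* Stand-in for a previous RMA operation. */
void put_payload(int value) { payload = value; }

/* Unordered signal: carries no ordering with earlier writes.
 * The caller must provide ordering if it needs any. */
void signal_unordered(uint64_t sig)
{
    atomic_store_explicit(&sig_word, sig, memory_order_relaxed);
}

/* Ordered signal: the "library" always fences before signalling,
 * so every call pays for the fence. */
void signal_ordered(uint64_t sig)
{
    atomic_thread_fence(memory_order_release);
    signal_unordered(sig);
}

/* Style 1: the library guarantees ordering inside signal. */
void send_library_fenced(int value, uint64_t sig)
{
    put_payload(value);
    signal_ordered(sig);
}

/* Style 2: the user guarantees ordering, the library does not fence. */
void send_user_fenced(int value, uint64_t sig)
{
    put_payload(value);
    atomic_thread_fence(memory_order_release); /* explicit user fence */
    signal_unordered(sig);
}

/* Target side: spin until the signal word matches, then read the payload.
 * The acquire load pairs with the release fence on the sending side. */
int wait_signal_then_read(uint64_t expected)
{
    while (atomic_load_explicit(&sig_word, memory_order_acquire) != expected) {
        /* spin */
    }
    return payload;
}
```

In real use the sender and the waiter would run on different threads (or PEs). Both styles give the waiter the same guarantee; they differ only in who pays for the fence and whether it can be skipped when ordering is already established some other way.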