-
Notifications
You must be signed in to change notification settings - Fork 388
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
libfabric version2: Initial set of proposed changes #9384
Conversation
3be6baf
to
df4d8bd
Compare
Update: adjusting patches to start working through CI issues (many of which are related to building on other platforms and missing updates to the CI scripts) |
@zachdworkin - earlier Intel CI failed on socket provider testing, which makes sense, since I removed that provider. Is there some way I can disable sockets from being run in the contrib/intel scripts, or would this require updates to the CI system? (I'm NOT asking to disable anything in the CI system now -- that wouldn't be until we're closer to accepting these sort of changes). |
The only way I know that you can do this is with the replay function and either comment out that stage or just delete it from your replay. Otherwise I will need to remove it from the pipeline with a different PR |
8a2af9d
to
9c4caa2
Compare
mpich references FI_NAME_MAX, so removing that from the public header breaks that build. We either need to add it back in and document what it actually means, leave it purely for compatibility, or update mpich to use an internal definition. |
5dabf05
to
412a65a
Compare
Provider only supported by 1.x series Signed-off-by: Sean Hefty <[email protected]>
Provider only supported by v1.x series Signed-off-by: Sean Hefty <[email protected]>
Signed-off-by: Sean Hefty <[email protected]>
GNI is only supported by the v1.x series Signed-off-by: Sean Hefty <[email protected]>
NetworkDirect support is supported by verbs provider. Signed-off-by: Sean Hefty <[email protected]>
ffd3a52
to
dc7faf6
Compare
Signed-off-by: Sean Hefty <[email protected]>
Signed-off-by: Sean Hefty <[email protected]>
Signed-off-by: Sean Hefty <[email protected]>
Signed-off-by: Sean Hefty <[email protected]>
Remove support for asynchronous insertions and AV_MAP. The format of the fi_addr_t value will either be indexed based in the standard case or provider defined in more advanced use cases, based on the AV configuration (such as using auth_keys). Signed-off-by: Sean Hefty <[email protected]>
Remove FI_BUFFERED_RECV as an exported API option. Since it's currently used internally between mrail and rxm, make it an internal only option. It has a limited use case for multirail over rxm over connected endpoints where shared receive queues are not available. With shared receive queues, the feature wouldn't be needed, as mrail could own the buffers outright. Signed-off-by: Sean Hefty <[email protected]>
Recommend that applications and providers use FI_THREAD_COMPLETION as the preferred threading model for lockless operation when using scalable endpoints. This helps align application design with the provider implementation. Signed-off-by: Sean Hefty <[email protected]>
Remove overly complicated threading models and focus on specific models to allow better alignment between application designs and provider implementation. Use FI_THREAD_DOMAIN as the preferred lockless threading model for standard endpoints. Signed-off-by: Sean Hefty <[email protected]>
Combine data and control progress into one progress option. Signed-off-by: Sean Hefty <[email protected]>
Completions are always unordered. Signed-off-by: Sean Hefty <[email protected]>
Field was deprecated and only serves as a placeholder for compatility. Signed-off-by: Sean Hefty <[email protected]>
Support for wait sets adds significant complexity to the provider implementation and is basically an abstraction around the OS constructs for poll/epoll (on Linux). Remove the feature from the API, but keep the internal implementation for now. This allows providers to move away from wait set support. Note that blocking support and support for native wait objects (e.g. epoll fd's) are still supported by the API. Only the wait set abstraction is removed, which allows providers control over creating wait objects. Signed-off-by: Sean Hefty <[email protected]>
Poll sets are a simple iterator around progressing multiple objects. Remove from the API to reduce provider complexity. Signed-off-by: Sean Hefty <[email protected]>
There's never been an implementation. Signed-off-by: Sean Hefty <[email protected]>
Remove FI_MR_BASIC/SCALABLE/UNSPEC. These were deprecated in version 1.5. Remove FI_LOCAL_MR, which was an earlier version of FI_MR_LOCAL and also deprecated. Signed-off-by: Sean Hefty <[email protected]>
Feature is not implemented natively by providers. Signed-off-by: Sean Hefty <[email protected]>
Remove FI_ORDER_NONE (flag that's 0) and FI_ORDER_STRICT (which isn't a flag, and covers only a portion of the valid flags). Remove FI_ORDER_DATA, which will not be supported by version 2 in order to allow for greater provider optimization. Signed-off-by: Sean Hefty <[email protected]>
Remove the option of directing transmit and receive completions to separate CQs for the same endpoint. The option adds complexity at the provider and application levels. This is largely the result of needing SW based protocols for certain operations, such as tag matching. This either makes it necessary for the app to drive progress across multiple CQs, or the provider emulates the application's CQs in SW. This change updates the man page only. Provider developers are left to update their code bases separately. Signed-off-by: Sean Hefty <[email protected]>
Disallow users hand-crafting or hand-copying their own fi_info structs. This allows the library to allocate hidden fields for internal use. Plus, there's no need to apps to do this, given that the API call is way easier to use. Signed-off-by: Sean Hefty <[email protected]>
Add a new call that takes fi_info as input, which provides consistency with the domain and endpoint open calls. Signed-off-by: Sean Hefty <[email protected]>
Subsystem filtering isn't useful (or likely ever used). Remove the enum fi_log_subsys. Instead convert the API to accept int flags. This maintains API compatibility. Update all current FI_LOG_xxx subsys values to 0. This avoids needing to update the providers to the new API, forcing them to pass in 0 for the flags. No actual flag values are defined. Those become a placeholder for future options. The logging checks are simplified by this change. Signed-off-by: Sean Hefty <[email protected]>
Signed-off-by: Sean Hefty <[email protected]>
Introduce the concept of peer groups. A peer group is a set of peers that are communicating together for some specific set of tasks. Peer groups provide a lower-level mapping of HPC and AI communicators. Signed-off-by: Sean Hefty <[email protected]>
Allow specifying precise tag formatting options. The mem_tag_format takes as input a set of bit fields. In practice, this ends up being unusable to implement, resulting in the entire tag simply being masked with ignore bits. When the mem_tag_format value only has the lower bits set (< 256), interpret the format as specific options. Two new options are defined, one aligned with MPI and the other with CCLs. This information can be used by providers to optimize for the separate use cases. Signed-off-by: Sean Hefty <[email protected]>
@@ -222,7 +222,7 @@ fi_sendmsg. | |||
|
|||
*FI_CLAIM* | |||
: Applies to posted receive operations for endpoints configured | |||
for FI_BUFFERED_RECV or FI_VARIABLE_MSG. This flag is used to | |||
for FI_BUFFERED_RECV. This flag is used to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to the FI_VARIABLE_MSG
flag, the FI_BUFFERED_RECV
flag was also removed in this pull request. Therefore, shouldn't the FI_CLAIM
and FI_DISCARD
flags also be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FI_CLAIM and FI_DISCARD can also be used with FI_PEEK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mblocksome and @j-xiong are both correct. However, I agree with Mike. This change is to the fi_msg man page, and FI_PEEK does not apply there. FI_CLAIM and FI_DISCARD can be removed as flags for the fi_msg calls. They remain for fi_tagged.
@@ -234,7 +234,7 @@ fi_sendmsg. | |||
|
|||
*FI_DISCARD* | |||
: Applies to posted receive operations for endpoints configured | |||
for FI_BUFFERED_RECV or FI_VARIABLE_MSG. This flag is used to | |||
for FI_BUFFERED_RECV. This flag is used to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to the FI_VARIABLE_MSG
flag, the FI_BUFFERED_RECV
flag was also removed in this pull request. Therefore, shouldn't the FI_CLAIM
and FI_DISCARD
flags also be removed?
@@ -319,11 +319,6 @@ The following flags may be used with fi_trecvmsg. | |||
fi_context structure used for an FI_PEEK + FI_CLAIM operation must be used |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove text about fi_recv_context
since buffered receives have been removed?
enum { | ||
FI_TAG_BITS, | ||
FI_TAG_HPC, | ||
FI_TAG_AI, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The above should revert back to FI_TAG_MPI and FI_TAG_NCCL, respectively, as the tag formats specifically target the tag definitions for that middleware.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FI_TAG_CCL to be more implementation neutral.
Replaced by #10009. |
For discussion and review only. CANNOT BE MERGED!!!