-
Notifications
You must be signed in to change notification settings - Fork 886
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mca/base: add a new MCA variable type for include lists #12826
base: main
Are you sure you want to change the base?
mca/base: add a new MCA variable type for include lists #12826
Conversation
4f203e3
to
2ed9438
Compare
Hmmm...given the interconnection between the OMPI and PMIx/PRRTE codes, and that they both utilize the MCA architecture, I wonder if we will create confusion if we don't bring this across to the others? There is no connection between MPI_T and those other codes, so if that is the primary motivator here, then perhaps it is less of an issue? Just pondering - if we want to maintain the similarity, then it doesn't seem like it would take much to do the port. |
@rhc54 Just depends on if this is useful. From an external perspective these are just strings like today. Internally the parsing is just handled inside the MCA base to avoid boilerplate code for parsing these. It may still be worth porting over to prrte if that still uses MCA given it centralizes the include/exclude logic. We did this with enums as well-- used to have a lot of code taking an int or string and converting it to a string or int. |
I am cleaning this one up a bit. Hold off on reviews. Adding support for regex include lists since I have a need for that. |
It is not uncommon in Open MPI to register a string variable to allow specifying an exclude or exclude list (every framework registers one). Given this common patten it is worthwhile to formalize an MCA include list variable type: MCA_BASE_VAR_TYPE_INCLUDE_LIST. Variables of this type use a new opal object (mca_base_var_include_list_t) that stores and argv-style array and an exclude- list flag. To register an include list variable the caller must first either OBJ_CONSTRUCT the object or allocate it with OBJ_NEW. Variables of this type are exposed as strings to MPI_T. Signed-off-by: Nathan Hjelm <[email protected]>
2ed9438
to
bb835ff
Compare
Ok, ready to go. |
Hey Nathan - can you provide an example of this being used somewhere in the code base? I'm trying to fully grok what you planned to do with it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is rare to see a PR adding a new feature and such extensive testing, kudos for that. That being said, while I can picture few places where such capability would be nice to have (not critical because we live without it for a long time), this PR seems like an overblown solution.
OBJ_CLASS_INSTANCE(opal_include_list_t, opal_object_t, include_list_contructor, | ||
include_list_destructor); | ||
|
||
static int include_list_match_regex(opal_include_list_t *include_list, const char *value, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is doing the opposite of one might expect by reading the function name: it is not finding an item matching a regular expression from a list, but instead it is finding if an item matches one of the regular expressions in the list. At some point these two outcomes can be considered similar but their complexity is drastically different.
Assuming there is a real need for such the capability to match regular expressions stored in a list, this function need to be documented. And maybe renamed include_regex_list_match
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a use case for this. With btl/uct we agreed on an include list of the supported memory domains. Problem is the naming is not necessarily simple. Right now with have mlx5_0 but it will fail to match other HCAs in the system. Same is true for our hardware (somegooglenic[0-9]
). I could just enumerate all the possibilities but having it be a regex makes the include list cleaner and easier to maintain (I can just put somegooglenic.*
in the list).
* ^foo,bar,baz (exclude) | ||
*/ | ||
struct opal_include_list { | ||
opal_serializable_t super; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I question the need for the intermediary opal_serializable_t
class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not strictly necessary but allows the building of additional variable types as needed. I can't think of anything off the top of my head that would benefit but figured it is a good to have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The general idea is that these variables are strings to the user (command line, MPI_T, env vars, etc) but may be something else inside Open MPI: argv list, struct, etc. The serializable takes the string and fills in the internal structure and vise versa. All of this is handled in mca/var so it works with MPI_T as well.
@rhc54 Best example I have is to modify |
It is not uncommon in Open MPI to register a string variable to allow specifying an exclude or exclude list (every framework registers one). Given this common patten it is worthwhile to formalize an MCA include list variable type: MCA_BASE_VAR_TYPE_INCLUDE_LIST. Variables of this type use a new opal object (mca_base_var_include_list_t) that stores and argv-style array and an exclude- list flag. To register an include list variable the caller must first either OBJ_CONSTRUCT the object or allocate it with OBJ_NEW. Variables of this type are exposed as strings to MPI_T.