DAOS-16038 pool: Enable pool list for clients #14575
Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14575/1/execution/node/204/log |
Revive parts of the old client mgmt API to allow libdaos clients to list pools in the system, but only allow a given client to see the pools that it could connect to.

Required-githooks: true
Change-Id: I2b3c391ddf042b23811be8f3390ec290e92e4290
Signed-off-by: Michael MacDonald <[email protected]>
Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14575/2/execution/node/204/log |
Tagging early reviewers to validate the approach before I proceed:
There is still some work to be done to enable an optional pool query for each listed pool, for parity with the dmg command. I also plan to add ftest coverage for the new For the moment, I've got this work based on the google/2.4 branch, but I will plan to get it rebased on master for 2.8, and possibly backported for 2.6.1 if there are no compatibility issues. |
Some example output, showing that dmg can list all pools, since it is an admin command, but the daos tool can only list pools that would allow the user to connect:
The pool access filtering is done on the engine side, as the client environment is untrusted, and clients without connection access to a pool shouldn't know anything about its details.
I looked closely at the credential usage and skimmed the rest. This approach looks good to me.
Since both |
@liw: Yep, I did consider that approach, but please note that for each pool in the system, we need to perform an ACL check using the client's credential. If the client could connect to that pool, it's included in the list returned to the client. The logic for performing this access check is implemented in the engine, so if we used the agent as a relay, we would have to go (repeating the MS/PS calls for each pool via dRPC from daos_server): The approach proposed in this patch looks like this (repeating the PS call for each pool from the MS rank): Given that the majority of the work that needs to be done for this functionality happens in the engine, it seemed much more efficient to keep most of the logic at this level. Maybe I'm missing something, though. |
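The per-pool access check described above can be sketched as follows. This is a minimal illustration, not the patch's code: `struct pool_rec`, `can_connect()`, and `list_visible()` are hypothetical names, and the ACL is reduced to a plain allow list of principals, whereas real DAOS ACLs are considerably richer. The point is only that the filter runs on the trusted side, so labels of denied pools are never placed in the reply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_ACL 4

/* Hypothetical, simplified pool record: a label plus the principals
 * that are allowed to connect. Unused slots are NULL. */
struct pool_rec {
	const char *label;
	const char *allowed[MAX_ACL];
};

/* Hypothetical access check: true if the principal is in the allow list. */
static bool
can_connect(const struct pool_rec *pool, const char *principal)
{
	size_t i;

	for (i = 0; i < MAX_ACL && pool->allowed[i] != NULL; i++) {
		if (strcmp(pool->allowed[i], principal) == 0)
			return true;
	}
	return false;
}

/*
 * Collect the labels of pools that the principal could connect to.
 * Stores at most out_cap labels in out[] and returns the total number
 * of matching pools. Pools that fail the check are skipped entirely.
 */
static size_t
list_visible(const struct pool_rec *pools, size_t npools, const char *principal,
	     const char **out, size_t out_cap)
{
	size_t i, n = 0;

	for (i = 0; i < npools; i++) {
		if (!can_connect(&pools[i], principal))
			continue;
		if (n < out_cap)
			out[n] = pools[i].label;
		n++;
	}
	return n;
}
```

Calling `list_visible()` with a principal that appears in no allow list returns 0 and leaves `out[]` untouched, which is the behavior the design asks for: a client learns nothing about pools it cannot connect to.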
Approach looks good to me.
Right, the "constraint of the design"... Is that an external requirement for the solution or one that is required by the current approach? (One reason I can think of for that to be an external requirement is: We might not want a user to see the labels of the pools that he cannot access.) Just want to make sure this requirement is a must have. Suppose we must filter out pools a user cannot access, would it be nice if we could consider this type of listing operation to be an extension of |
One thing to keep in mind is that all ACL processing code is currently on the server side. We could change this and make it common code (which I would recommend over trying to re-implement this logic in Go). In that case we may need to expose it via libdaos to make it accessible to the agent.
This is absolutely a must-have requirement. Users must not have any visibility into or knowledge about pools to which they do not have access. The pool label may reveal sensitive information about the contents or use of the pool, for example. As we cannot trust anything on the client side of the client API, we can't do the filtering in libdaos. In order to accomplish this filtering in the control plane only, I think the following would need to happen:
Pros: Cons: |
Regardless of exactly how we implement the backend side of the pool list capability, I think the front-end work can proceed. Before I go further down that road, however, I would appreciate some feedback from @mchaarawi on the API changes ... I'd prefer to avoid finding out that it's a no-go or that there was a better way to do it after I've sunk more time into it. |
no concerns; but just needs some required small updates for the task stuff
{dc_deprecated, 0},
{dc_mgmt_pool_list, sizeof(daos_mgmt_pool_list_t)},
do you really need to add a user task API for this function though? I'm not really sure this is required and you can probably not worry about changes in the task stuff.
otherwise changes here are not enough and you would need to add a few more things.
I'm not sure. I was following the pattern set by @kccain when he added the original implementation of pool list. I haven't really dug into it beyond that.
Given that I don't feel strongly either way, I'm happy to follow your guidance on the best way to implement this feature.
@mchaarawi: Ping... Please advise on the recommended approach here. I'm getting ready to rebase all of this work on master, and would prefer to push that PR in as close to a final state as possible rather than iterating on it.
Also, I don't see much in the way of client API tests. Not particularly keen to invent all of that myself, but happy to work with someone who knows more about this side of things. I am planning to add ftests for this feature -- would that be acceptable in lieu of new client API unit tests?
sorry i didn't know you were still looking for my review on this. as i mentioned, you should just avoid adding this change with the task api, otherwise you still need to update other structures to support that.
for client api tests, im not sure what you mean. i believe you should add some tests there:
https://github.com/daos-stack/daos/blob/master/src/tests/suite/daos_mgmt.c
but if you want to use ftest, that is fine too.
sorry i didn't know you were still looking for my review on this. as i mentioned, you should just avoid adding this change with the task api, otherwise you still need to update other structures to support that.
OK, is the task API deprecated? I have a superficial understanding of the client API's workings, so as I said I was just following the existing patterns. If I didn't use the task API, would I basically just rework daos_mgmt_list_pools() to just call dc_mgmt_pool_list() directly? Is there any downside to doing it this way?
for client api tests, im not sure what you mean. i believe you should add some tests there: https://github.com/daos-stack/daos/blob/master/src/tests/suite/daos_mgmt.c but if you want to use ftest, that is fine too.
Ah, OK. I was thinking of unit tests under src/client/api/tests. I'll just rework the suite test to use the re-added client API instead of the dmg helper.
OK, is the task API deprecated? I have a superficial understanding of the client API's workings, so as I said I was just following the existing patterns.
no the task api is not deprecated. but to add the task API entry for this function, you still need to update:
https://github.com/daos-stack/daos/blob/mjmac/DAOS-15982/src/include/daos/task.h
to be safe.
otherwise you can just remove the task api entry for it, and do as you suggested:
If I didn't use the task API, would I basically just rework daos_mgmt_list_pools() to just call dc_mgmt_pool_list() directly? Is there any downside to doing it this way?
either way is fine for me.
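To make the two options concrete, here is a rough sketch of a table-driven dispatch next to a direct call. All names (`op_table`, `dispatch()`, `list_pools_direct()`, `struct list_args`) are hypothetical and the structure is simplified; it is not the actual DAOS task implementation. It only illustrates why adding a task entry means keeping an opcode list, a function table, and an argument-size record in sync, while the direct call needs none of that.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical argument struct for the list operation. */
struct list_args {
	size_t *npools;
};

typedef int (*op_fn)(void *args);

/* One table entry: the handler and the size of its argument struct. */
struct op_entry {
	op_fn  fn;
	size_t arg_size;
};

/* The opcode values index the table, so both must change together. */
enum opcode { OP_DEPRECATED = 0, OP_POOL_LIST, OP_COUNT };

static int
op_deprecated(void *args)
{
	(void)args;
	return -1; /* retired slot: always fails */
}

static int
op_pool_list(void *args)
{
	struct list_args *la = args;

	*la->npools = 2; /* pretend two pools are visible */
	return 0;
}

static const struct op_entry op_table[OP_COUNT] = {
	[OP_DEPRECATED] = {op_deprecated, 0},
	[OP_POOL_LIST]  = {op_pool_list, sizeof(struct list_args)},
};

/* Task-style path: look up the handler and validate the argument size. */
static int
dispatch(enum opcode opc, void *args, size_t arg_size)
{
	if ((unsigned int)opc >= OP_COUNT || op_table[opc].arg_size != arg_size)
		return -1;
	return op_table[opc].fn(args);
}

/* Direct path: the public wrapper calls the handler itself. */
static int
list_pools_direct(size_t *npools)
{
	struct list_args la = {npools};

	return op_pool_list(&la);
}
```

Both paths reach the same handler. The table route buys uniform scheduling at the cost of extra bookkeeping, which matches the reviewer's point that either choice is workable.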
/** Pool label */
d_string_t mgpi_label;
I was going to say that this is an API breaking change. but it sounds like we only used this in dmg functions, so probably no user was using this before.
int
daos_mgmt_list_pools(const char *group, daos_size_t *npools, daos_mgmt_pool_info_t *pools,
daos_event_t *ev);
there is no API description. so should add that.
I assume one can query the size with NULL pools buf, or provide an estimated buffer of certain size and the call return the actual size in npools?
this also needs some C tests added.
there is no API description. so should add that.
Done.
I assume one can query the size with NULL pools buf, or provide an estimated buffer of certain size and the call return the actual size in npools?
Correct. This is following the pattern established by other pool client APIs.
this also needs some C tests added.
Ack. Will add them.
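The size-query convention confirmed above (pass a NULL buffer to learn the count, then call again with a buffer) could look roughly like this from the caller's side. This is a self-contained sketch: `list_pools_stub()` and `struct pool_info` are hypothetical stand-ins for `daos_mgmt_list_pools()` and `daos_mgmt_pool_info_t`, and the stub's choice to truncate silently when the buffer is too small is an assumption rather than documented behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for daos_mgmt_pool_info_t; only a label is modeled. */
struct pool_info {
	char label[32];
};

/* Pretend system state: the pools visible to this caller. */
static const char *visible_pools[] = {"scratch", "projects", "archive"};
#define N_VISIBLE (sizeof(visible_pools) / sizeof(visible_pools[0]))

/*
 * Hypothetical stand-in for the list call.
 *  - pools == NULL: only report the number of pools in *npools.
 *  - pools != NULL: *npools is the buffer capacity on input; at most that
 *    many entries are filled, and *npools holds the actual count on output.
 * Returns 0 on success, -1 on invalid arguments.
 */
static int
list_pools_stub(size_t *npools, struct pool_info *pools)
{
	size_t cap, i;

	if (npools == NULL)
		return -1;
	cap     = *npools;
	*npools = N_VISIBLE;
	if (pools == NULL)
		return 0;
	for (i = 0; i < cap && i < N_VISIBLE; i++) {
		strncpy(pools[i].label, visible_pools[i], sizeof(pools[i].label) - 1);
		pools[i].label[sizeof(pools[i].label) - 1] = '\0';
	}
	return 0;
}

/* Two-call pattern: query the size, allocate, then fetch. Caller frees *out. */
static int
fetch_all_pools(struct pool_info **out, size_t *count)
{
	struct pool_info *buf;
	size_t            cap, n = 0;

	if (list_pools_stub(&n, NULL) != 0)
		return -1;
	if (n == 0) {
		*out   = NULL;
		*count = 0;
		return 0;
	}
	cap = n;
	buf = calloc(cap, sizeof(*buf));
	if (buf == NULL)
		return -1;
	if (list_pools_stub(&n, buf) != 0) {
		free(buf);
		return -1;
	}
	/* The list may have grown between calls; never report more than we hold. */
	*count = n < cap ? n : cap;
	*out   = buf;
	return 0;
}
```

A caller that prefers a single round trip can instead pass an estimated buffer and compare the returned count with its capacity to detect truncation.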
Signed-off-by: Michael MacDonald <[email protected]>
Change-Id: Ia59d1464870ee55f65a4fc3a37fe826ce1630669
Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14575/3/execution/node/204/log |
After some discussion with @liw and @kccain (thanks!), I'm going to explore the idea of doing the pool access check in the list handler instead of making a call with the new "can this client access the pool" RPC. It would still require N RPCs to get the pool ACLs, but wouldn't require a new server<->server RPC that is probably only useful for this specific scenario. One potential downside to this approach is that it would lose the ability to delegate the connectability decision to the pool service. In scenarios where a pool is queryable for its properties but would not allow clients to connect (or, indeed, wouldn't allow a specific client to connect even if the ACL allows it), the client would receive that pool in the list of connectable pools. This may be fine as an edge case. |
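The edge case mentioned here can be stated as the difference between two predicates. The sketch below is purely illustrative: `struct pool_view` and both functions are hypothetical, and `accepting_conn` stands in for whatever additional conditions the pool service might apply beyond the ACL.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-pool facts as seen while building the list. */
struct pool_view {
	bool acl_allows;     /* the fetched ACL grants this client access */
	bool accepting_conn; /* the pool service would accept a connect now */
};

/* Check done in the list handler: only the ACL is consulted. */
static bool
listed_by_acl_check(const struct pool_view *p)
{
	return p->acl_allows;
}

/* Check delegated to the pool service: it can also refuse for its own reasons. */
static bool
listed_by_ps_check(const struct pool_view *p)
{
	return p->acl_allows && p->accepting_conn;
}
```

A pool with `acl_allows` set and `accepting_conn` clear is listed by the first predicate but not by the second, so the client would see a pool that it cannot currently connect to.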
Signed-off-by: Michael MacDonald <[email protected]>
Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14575/4/execution/node/204/log |
Signed-off-by: Michael MacDonald <[email protected]>
Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14575/10/execution/node/176/log |
Not offhand... I included that thought in case it seemed relevant to someone else. I think it's an acceptable edge case. The user-visible result of this would be a pool that they can't actually access, which would be annoying but not the end of the world, IMO.
Not when the pool query is being done from a server. There are server<->server RPCs that don't require a pool handle.
Test stage Functional on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14575/10/testReport/ |
Quick-functional: true
Test-tag: ListClientPoolsTest
Signed-off-by: Michael MacDonald <[email protected]>
Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14575/11/execution/node/204/log |
Test stage Functional on EL 8.8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14575/11/execution/node/1270/log |
Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14575/12/execution/node/204/log |
Test stage Functional on EL 8.8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14575/12/execution/node/1270/log |
Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14575/13/execution/node/204/log |
Test stage Functional on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14575/13/testReport/ |
Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14575/14/execution/node/204/log |
Test stage Functional on EL 8.8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14575/14/execution/node/1270/log |
Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14575/15/execution/node/147/log |
Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14575/16/execution/node/147/log |
Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14575/17/execution/node/205/log |
Skip-NLT: true
Skip-unit-test: true
Quick-functional: true
Test-tag: test_list_pools
Required-githooks: true
Change-Id: I829dd6e5d26ea8ff3f5c9ff0da758dcc92e90774
Signed-off-by: Michael MacDonald <[email protected]>
Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14575/18/execution/node/147/log |
Alright, I think I've addressed all of the preliminary feedback and I've finished iterating on tests. Moving over to #14672 for the master PR.