Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UCP/WIREUP: Add flush+destroy of UCT EPs used by TMP EP lanes #5613

Closed
wants to merge 9 commits into from

Conversation

dmitrygx
Copy link
Member

What

Add flush+destroy of UCT EPs used by TMP EP lanes.

Why ?

Will be used by #5582 to flush+destroy of UCT EPs used only by TMP EP in order to correctly send data that has to be transferred before closing the UCT EP.
Depends on #5612.

How ?

  1. When WIREUP EP needs to be closed (flush_flag=CANCEL) or replaced by real UCT EP (flush_flag=LOCAL), schedule flush+destroy of all UCT EP lanes used by TMP EP.
  2. Just destroy lanes that are used by the main UCP EP to NULL in TMP EP lanes in order to not flush them.
  3. When flush+destroy process completed, call complete_cb if it was specified by the caller (e.g. it allows continue ucp_wireup_ep_progress()).

@dmitrygx dmitrygx force-pushed the topic/ucp/wireup_tmp_ep branch from a9319ee to 19b7064 Compare August 25, 2020 07:33
@dmitrygx dmitrygx added the WIP-DNM Work in progress / Do not review label Aug 25, 2020
@dmitrygx
Copy link
Member Author

waiting #5612

@dmitrygx dmitrygx removed the WIP-DNM Work in progress / Do not review label Aug 25, 2020
@dmitrygx dmitrygx force-pushed the topic/ucp/wireup_tmp_ep branch from 19b7064 to 104057c Compare August 25, 2020 09:33
@dmitrygx
Copy link
Member Author

@brminich @evgeny-leksikov could you review pls?

@dmitrygx dmitrygx force-pushed the topic/ucp/wireup_tmp_ep branch 4 times, most recently from 6bf6ea4 to c860575 Compare August 25, 2020 18:39
@dmitrygx dmitrygx force-pushed the topic/ucp/wireup_tmp_ep branch from c860575 to 7de5445 Compare August 25, 2020 18:57
@dmitrygx
Copy link
Member Author

dmitrygx commented Aug 27, 2020

@alinask @brminich @evgeny-leksikov could you review pls?

src/ucp/wireup/wireup.c Outdated Show resolved Hide resolved
ucp_wireup_ep_t *wireup_ep = (ucp_wireup_ep_t*)user_data;

if (wireup_ep == NULL) {
/* check for NULL pointer to workaround Coverity warning (it wrongly
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would assert help here instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, will try ucs_assert_always()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ucs_assert() should be enough to suppress coverity

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ucs_assert() should be enough to suppress coverity

yes, but I guess it's OK to have always, since it's not a critical path

ucs_assert(tmp_ep != ep);

/* to prevent flush+destroy UCT EPs that are used by the main EP,
* they have to be remove from the TMP EP lanes and their WIREUP
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

src/ucp/wireup/wireup_ep.c Show resolved Hide resolved
@dmitrygx
Copy link
Member Author

@brminich @evgeny-leksikov your comments were addressed, could you review again pls?


for (another_lane = 0; another_lane < another_ep_config_key->num_lanes;
++another_lane) {
if (ucp_ep_config_lane_is_equal(ep_config_key,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • this condition has the same result for any another_lane value.
  • whole ucp_wireup_ep_lane_used_by_another_ep_config func looks confusing to me

so since major change is based on this func it should be revised

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, fixed

Comment on lines 1023 to 1027
return (key1->lanes[lane1].rsc_index == key2->lanes[lane2].rsc_index) &&
(key1->lanes[lane1].proxy_lane == key2->lanes[lane2].proxy_lane) &&
(key1->lanes[lane1].dst_md_index == key2->lanes[lane2].dst_md_index) &&
(key1->lanes[lane1].path_index == key2->lanes[lane2].path_index) &&
((key1->lanes[lane1].lane_types == key2->lanes[lane2].lane_types) ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memcmp? or even (key1->lanes[lane1] == key2->lanes[lane2]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, wouldn't work since we don't want compare lane_tpyes if it wasn't requested

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, missed extra () around lane_types comparison, better don't align different block, maybe even separate by extra line to improve readability?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or

if (compare_types) {
    return key1->lanes[lane1] == key2->lanes[lane2];
} else {
    return (key1->lanes[lane].rsc_index    == key2->lanes[lane2].rsc_index)    &&
           (key1->lanes[lane].proxy_lane   == key2->lanes[lane2].proxy_lane)   &&
           (key1->lanes[lane].dst_md_index == key2->lanes[lane2].dst_md_index) &&
           (key1->lanes[lane].path_index   == key2->lanes[lane2].path_index);
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

src/ucp/core/ucp_ep.c Outdated Show resolved Hide resolved
src/ucp/wireup/wireup.c Outdated Show resolved Hide resolved
++another_lane) {
if (ucp_ep_config_lane_is_equal(ep_config_key,
another_ep_config_key,
lane, another_lane, 0)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, why don't we need to compare types?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the can be different, we need to check whether this is the same TL+device pair or not


/* check for NULL pointer to workaround Coverity warning (it wrongly
* assumes that this callback could be called upon GET/PUT operation) */
ucs_assertv_always(wireup_ep == NULL,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!=?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, sure
it is currently dead code, since all lanes in UCP EP are identical to TMP EP's one
so, UCP EP is responsible for them, and TMP EP don't' flush and delete them

src/ucp/wireup/wireup_ep.c Show resolved Hide resolved
(key1->lanes[lane].path_index == key2->lanes[lane].path_index) &&
((key1->lanes[lane].lane_types == key2->lanes[lane].lane_types) ||
return (key1->lanes[lane1].rsc_index == key2->lanes[lane2].rsc_index) &&
(key1->lanes[lane1].proxy_lane == key2->lanes[lane2].proxy_lane) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comparing lane numbers from different configs is erroneous, also does not seem to be needed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ye, fixed
no need this

@dmitrygx dmitrygx force-pushed the topic/ucp/wireup_tmp_ep branch from aec5a88 to 4e1c626 Compare August 31, 2020 18:48
@dmitrygx
Copy link
Member Author

dmitrygx commented Sep 1, 2020

bot:pipe:retest

1 similar comment
@dmitrygx
Copy link
Member Author

dmitrygx commented Sep 1, 2020

bot:pipe:retest

@dmitrygx
Copy link
Member Author

dmitrygx commented Sep 1, 2020

@brminich @evgeny-leksikov @yosefe could you review pls?

ucp_lane_index_t lane1,
ucp_lane_index_t lane2)
{
return !memcmp(&key1->lanes[lane1], &key2->lanes[lane2],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can padding be different?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, better not to do this

Comment on lines 495 to 499
ucs_assertv_always(dst_dev_index == lane_desc->dst_dev_index,
"lane[%d].dst_md_index=%d, dst_md_index=%d",
lane, lane_desc->dst_md_index, dst_md_index);
ucs_assertv_always(dst_md_index == lane_desc->dst_md_index,
"lane[%d].dst_dev_index=%d, dst_dev_index=%d",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

traces are incorrect, you mention dst_dev for dst_md and vice versa

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment on lines 1017 to 1020
int ucp_ep_config_lane_tl_is_equal(const ucp_ep_config_key_t *key1,
const ucp_ep_config_key_t *key2,
ucp_lane_index_t lane1,
ucp_lane_index_t lane2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: maybe reorder args like key1, lane1, key2, lane2 here and below?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -450,7 +451,7 @@ struct ucp_wireup_sockaddr_data {
uint8_t addr_mode; /**< The attached address format
defined by
UCP_WIREUP_SA_DATA_xx */
uint8_t dev_index; /**< Device address index used to
ucp_rsc_index_t dev_index; /**< Device address index used to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls use fixed sized types for wire protocols

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, fixed

Comment on lines 590 to 593
dst_md_index = select_params->address->address_list
[select_info->addr_index].md_index;
dst_dev_index = select_params->address->address_list
[select_info->addr_index].dev_index;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: can create local pointer to select_params->address->address_list or even its entry

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment on lines 296 to 297
ucp_wireup_ep_configs_use_same_lane(ucp_ep_config_key_t *ep_config_key,
ucp_ep_config_key_t *another_ep_config_key,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • just key1 and key2 like ucp_ep_config_lane_tl_is_equal
  • and ucp_wireup_ep_configs_use_same_tl_lane
    to be consistent

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

ucp_ep_config_key_t *another_ep_config_key,
ucp_lane_index_t lane)
{
ucp_lane_index_t another_lane;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lane_idx shorter and typically used

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

if (self->tmp_ep != NULL) {
ucs_assert(!(self->tmp_ep->flags & UCP_EP_FLAG_USED));
ucp_ep_disconnected(self->tmp_ep, 1);
ucp_wireup_tmp_ep_destroy(ucp_ep, self, UCT_FLUSH_FLAG_CANCEL, NULL);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why ret value is not checked here? request can leak here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a TODO that it will be replaced by ucp_worker_discard_uct_ep() when 5608 gets merged

src/ucp/wireup/wireup_ep.h Show resolved Hide resolved
@dmitrygx dmitrygx force-pushed the topic/ucp/wireup_tmp_ep branch from 9f9bcdd to 9df2346 Compare September 2, 2020 05:55
@dmitrygx
Copy link
Member Author

dmitrygx commented Sep 2, 2020

@brminich @yosefe could you review pls?

@brminich
Copy link
Contributor

brminich commented Sep 2, 2020

@dmitrygx, plz fix merge conflicts

@dmitrygx
Copy link
Member Author

dmitrygx commented Sep 2, 2020

@dmitrygx, plz fix merge conflicts

@brminich done

int ucp_ep_config_lane_is_equal(const ucp_ep_config_key_t *key1,
const ucp_ep_config_key_t *key2,
ucp_lane_index_t lane, int compare_types)
int ucp_ep_config_lane_tl_is_equal(const ucp_ep_config_key_t *key1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ucp_ep_config_lane_is_same_peer

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment on lines 1029 to 1030
ucp_lane_index_t lane1,
ucp_lane_index_t lane2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need only one lane

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -292,6 +292,22 @@ ucp_wireup_find_remote_p2p_addr(ucp_ep_h ep, ucp_lane_index_t remote_lane,
return UCS_ERR_UNREACHABLE;
}

ucp_lane_index_t
ucp_wireup_ep_configs_use_same_tl_lane(ucp_ep_config_key_t *key1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ucp_wireup_ep_configs_can_reuse_lane

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


wireup_ep->tmp_ep = NULL;

req = ucp_ep_flush_internal(tmp_ep, ep_flush_flags, 0, &param, NULL,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need flush on tmp_ep? can't we just ep_discard all the lanes we don't need?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they need to be converted from WIREUP EP to UCT EP
I'll use @alinask implementation for destroying TMP EP and will add ep_discard there

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but need to merge ep_discard first

@dmitrygx dmitrygx added the WIP-DNM Work in progress / Do not review label Sep 2, 2020
@dmitrygx dmitrygx closed this Sep 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
WIP-DNM Work in progress / Do not review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants