file dependency resolution #5

Open
wants to merge 28 commits into
base: coll-select
from

Conversation

AlexeyMalkhanov

@raffenet I've created one more PR in which I removed the function declarations from the *_coll_select.h files and moved the algorithm-specific implementations from *_coll.h into *_coll_impl.h.

Oblomov, Sergey and others added 9 commits August 11, 2017 09:44
- In some cases fi_trecvmsg(PEEK) returns a -NOMSG error,
  which should be interpreted as a normal, non-error
  state, but it was processed as an error.
  Added a special case to exit gracefully from the function
  on the NOMSG state.

Change-Id: I532d37f536117f3ebfca78cc36227c34a587cddf
Signed-off-by: Ken Raffenetti <[email protected]>
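
The NOMSG handling described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the `PEEK_*` constants and the `interpret_peek` helper are hypothetical stand-ins, and the real error values come from the provider headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical return codes; real values are defined by the provider. */
enum { PEEK_OK = 0, PEEK_NOMSG = -1, PEEK_FAIL = -2 };

/*
 * Interpret the return code of a peek-style receive.
 * *found is set to true only when a message is pending.
 * Returns 0 when the peek itself succeeded (with or without a message)
 * and the original negative code for any real error.
 */
static int interpret_peek(int rc, bool *found)
{
    assert(found != NULL);
    *found = false;
    if (rc == PEEK_OK) {
        *found = true;   /* a matching message is waiting */
        return 0;
    }
    if (rc == PEEK_NOMSG) {
        return 0;        /* nothing pending: a normal state, not an error */
    }
    return rc;           /* every other code is still an error */
}
```

The point of the special case is that callers can distinguish "no message yet" from a genuine failure without inspecting the raw code themselves.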
Declare these functions statically at the top of the file. There is no
need to expose them further at this time.

Signed-off-by: Yanfei Guo <[email protected]>
Missed this function in [345b175].

Signed-off-by: Ken Raffenetti <[email protected]>
Signed-off-by: Ken Raffenetti <[email protected]>
This is a follow up patch to 3baf0de.

Signed-off-by: Ken Raffenetti <[email protected]>
AlexeyMalkhanov and others added 15 commits August 24, 2017 10:06
To allow lower layers to choose a specific collective implementation,
expose them as non-static functions. We start with four collectives -
barrier, bcast, reduce and allreduce.

Signed-off-by: Ken Raffenetti <[email protected]>
Extend device and nm/shm-level collective operations to accept a
struct of parameters to facilitate optimal algorithm
selection. Starting with barrier, bcast, reduce and allreduce - each
operation is split into "select" and "call" phases.

Signed-off-by: Ken Raffenetti <[email protected]>
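
A rough sketch of the "select" and "call" split for one collective might look like the following. All names here (`bcast_params_t`, `bcast_select`, `bcast_call`, the algorithm identifiers, and the size cutoff) are hypothetical and chosen only for illustration; they are not the identifiers used in this PR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical algorithm identifiers for a broadcast. */
typedef enum { BCAST_BINOMIAL, BCAST_SCATTER_ALLGATHER } bcast_alg_t;

/* Hypothetical parameter struct produced by the "select" phase. */
typedef struct {
    bcast_alg_t alg;
    size_t short_msg_limit;   /* assumed cutoff in bytes */
} bcast_params_t;

/* "select": choose an algorithm from message size and communicator size. */
static bcast_params_t bcast_select(size_t nbytes, int comm_size)
{
    bcast_params_t p = { BCAST_BINOMIAL, 12288 };
    assert(comm_size > 0);
    if (nbytes >= p.short_msg_limit && comm_size >= 8)
        p.alg = BCAST_SCATTER_ALLGATHER;
    return p;
}

/* Stand-ins for the exposed algorithm implementations.
 * Each returns a marker so the caller can see which one ran. */
static int bcast_binomial(void *buf, size_t nbytes)
{
    (void) buf; (void) nbytes;
    return 1;
}

static int bcast_scatter_allgather(void *buf, size_t nbytes)
{
    (void) buf; (void) nbytes;
    return 2;
}

/* "call": dispatch to the implementation chosen by bcast_select(). */
static int bcast_call(void *buf, size_t nbytes, const bcast_params_t *p)
{
    assert(p != NULL);
    switch (p->alg) {
    case BCAST_SCATTER_ALLGATHER:
        return bcast_scatter_allgather(buf, nbytes);
    case BCAST_BINOMIAL:
    default:
        return bcast_binomial(buf, nbytes);
    }
}
```

Keeping selection separate from the call lets a lower layer override the choice by filling in the parameter struct itself before dispatching.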
test/mpi/spawn/spaiccreate2 existed but was not listed in testlist.

Signed-off-by: Ken Raffenetti <[email protected]>
Fixes a bug where the startall function does not perform rank-address
translation before calling NM_mpi_isend and NM_mpi_irecv.

Signed-off-by: Ken Raffenetti <[email protected]>
Following the MPIT_PVARs usage in CH3, add the corresponding parts to
the CH4 active message level.

Signed-off-by: Ken Raffenetti <[email protected]>
These settings are outdated for modern networks like 10Gb ethernet. Even
better, operating systems will now dynamically tune these for best
performance, so we can avoid setting them altogether.

Closes pmodels#2635
Signed-off-by: Rob Latham <[email protected]>
Current logic for determining the number of VNIs created may end up
claiming all available contexts in the hardware.  This patch changes
the default number of VNIs to one.  A user may set
MPIR_CVAR_CH4_OFI_MAX_VNIS for more VNIs.

Reviewed-by: Rubasri Kalidas <[email protected]>
Reviewed-by: Chongxiao Cao <[email protected]>
Signed-off-by: Ken Raffenetti <[email protected]>
There was some missing functionality, mismatched variable names, and
outright wrong code in the Solaris thread support header. This should
get things back to a buildable state.

No reviewer.
In order to detect whether find and xargs are supported on the target
platform, autogen.sh runs the find command on the whole MPICH
directory ("."). This is overkill as we only want to detect that those
features exist. On some platforms (e.g., with NFS), this step can take
a very long time. This patch solves this issue by reducing the scope
of find to the ./maint directory.

Signed-off-by: Ken Raffenetti <[email protected]>
The precedence of [] is higher than that of *. That is, *bitarray[i]
first indexes bitarray by i and then dereferences the result, but what
is needed here is to dereference the pointer first and then index the
array, i.e. (*bitarray)[i].

Signed-off-by: Ken Raffenetti <[email protected]>
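
The precedence issue can be shown with a small sketch. The helper name `set_flag` and the `unsigned char` element type are assumptions made for illustration only; the commit's actual function and types are not shown in this conversation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: mark element i of the array that *bitarray points to. */
static void set_flag(unsigned char **bitarray, int i)
{
    assert(bitarray != NULL && *bitarray != NULL);

    /* Correct: dereference the pointer first, then index the array. */
    (*bitarray)[i] = 1;

    /* Wrong: *bitarray[i] parses as *(bitarray[i]). It indexes the
     * pointer-to-pointer itself, so for any i > 0 it reads past the
     * single pointer that bitarray refers to. */
}
```

With `unsigned char a[4] = {0}; unsigned char *p = a;`, calling `set_flag(&p, 2)` writes to `a[2]` and leaves the other elements untouched.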
We need to use FCFLAGS in order to test for "kind-ness" the way the rest
of ROMIO will be built.

Reported-by: Kurt Glaesemann <[email protected]>
Signed-off-by: Ken Raffenetti <[email protected]>
The redscat algorithm in reduce and allreduce needs to distribute the
load when count is not a multiple of pof2. This patch
reduces the imbalance. See pmodels#2637.

Reported-by: Mikhail Kurnosov <[email protected]>
Signed-off-by: Ken Raffenetti <[email protected]>
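
One way to reduce this kind of imbalance is to spread the remainder across the first few blocks so that block sizes differ by at most one element. The sketch below illustrates that idea under stated assumptions; `split_blocks` is a hypothetical helper, and the patch itself may distribute the counts differently.

```c
#include <assert.h>

/*
 * Split count elements into pof2 blocks.
 * cnts[i] receives the size of block i and disps[i] its starting offset.
 * The first (count % pof2) blocks get one extra element, so no block is
 * more than one element larger than any other.
 */
static void split_blocks(int count, int pof2, int *cnts, int *disps)
{
    assert(count >= 0 && pof2 > 0);
    int base = count / pof2;   /* elements every block receives */
    int rem = count % pof2;    /* leftover elements to spread out */
    int offset = 0;
    for (int i = 0; i < pof2; i++) {
        cnts[i] = base + (i < rem ? 1 : 0);
        disps[i] = offset;
        offset += cnts[i];
    }
}
```

For count = 10 and pof2 = 4 this yields block sizes 3, 3, 2, 2 with offsets 0, 3, 6, 8, rather than leaving a single block to absorb the whole remainder.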
pmrsbot pushed a commit that referenced this pull request Jun 23, 2023
mpl/gpu: Implement fast memcpy for ze backend (PR #5)

8 participants