Enable GPU-aware MPI by default #4318

Open. Wants to merge 5 commits into development.
31 changes: 16 additions & 15 deletions Docs/sphinx_documentation/source/GPU.rst
@@ -1643,7 +1643,7 @@

Finally, the parallel communication of particle data has been ported and optimized for GPU
platforms. This includes :cpp:`Redistribute()`, which moves particles back to the proper grids after their positions
have changed, as well as :cpp:`fillNeighbors()` and :cpp:`updateNeighbors()`, which are used to exchange halo particles.
As with :cpp:`MultiFab` data, these have been designed to minimize host / device traffic as much as possible, and can
-take advantage of the Cuda-aware MPI implementations available on platforms such as ORNL's Summit.
+take advantage of the GPU-aware MPI implementations available on platforms such as ORNL's Frontier.
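A hedged sketch (not part of this diff) of how the routines named above are typically driven; "MyPC" is a hypothetical alias for an amrex::NeighborParticleContainer instantiation configured elsewhere:

// Hedged sketch; "MyPC" is a hypothetical NeighborParticleContainer alias.
void exchange_and_advance (MyPC& pc, int nsteps)
{
    pc.Redistribute();        // re-bin particles onto their owning grids/ranks
    pc.fillNeighbors();       // build and fill the halo (neighbor) particles
    for (int step = 0; step < nsteps; ++step) {
        // ... GPU kernels that move particles less than a ghost-cell width ...
        pc.updateNeighbors(); // refresh existing halo copies without a rebuild
    }
}

With GPU-aware MPI enabled, the buffers these calls exchange can be handed to MPI as device pointers rather than staged through pinned host memory.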


Profiling with GPUs
@@ -1742,17 +1742,18 @@ Inputs Parameters
The following inputs parameters control the behavior of amrex when running on GPUs. They should be prefaced
by "amrex" in your :cpp:`inputs` file.

-+----------------------------+-----------------------------------------------------------------------+-------------+----------+
-|                            | Description                                                           | Type        | Default  |
-+============================+=======================================================================+=============+==========+
-| use_gpu_aware_mpi          | Whether to use GPU memory for communication buffers during MPI calls. | Bool        | 0        |
-|                            | If true, the buffers will use device memory. If false (i.e., 0), they |             |          |
-|                            | will use pinned memory. In practice, we find it is not always worth   |             |          |
-|                            | it to use GPU aware MPI.                                              |             |          |
-+----------------------------+-----------------------------------------------------------------------+-------------+----------+
-| abort_on_out_of_gpu_memory | If the size of free memory on the GPU is less than the size of a      | Bool        | 0        |
-|                            | requested allocation, AMReX will call AMReX::Abort() with an error    |             |          |
-|                            | describing how much free memory there is and what was requested.      |             |          |
-+----------------------------+-----------------------------------------------------------------------+-------------+----------+
-| the_arena_is_managed       | Whether :cpp:`The_Arena()` allocates managed memory.                  | Bool        | 0        |
-+----------------------------+-----------------------------------------------------------------------+-------------+----------+
++----------------------------+-----------------------------------------------------------------------+-------------+----------------+
+|                            | Description                                                           | Type        | Default        |
++============================+=======================================================================+=============+================+
+| use_gpu_aware_mpi          | Whether to use GPU memory for communication buffers during MPI calls. | Bool        | MPI-dependent  |
+|                            | If true, the buffers will use device memory. If false (i.e., 0), they |             |                |
+|                            | will use pinned memory. It will be activated if AMReX detects that    |             |                |
+|                            | GPU-aware MPI is supported by the MPI library (MPICH, OpenMPI, and    |             |                |
+|                            | derivative implementations).                                          |             |                |
++----------------------------+-----------------------------------------------------------------------+-------------+----------------+
+| abort_on_out_of_gpu_memory | If the size of free memory on the GPU is less than the size of a      | Bool        | 0              |
+|                            | requested allocation, AMReX will call AMReX::Abort() with an error    |             |                |
+|                            | describing how much free memory there is and what was requested.      |             |                |
++----------------------------+-----------------------------------------------------------------------+-------------+----------------+
+| the_arena_is_managed       | Whether :cpp:`The_Arena()` allocates managed memory.                  | Bool        | 0              |
++----------------------------+-----------------------------------------------------------------------+-------------+----------------+
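For concreteness, a hypothetical inputs-file fragment exercising the parameters above (values are illustrative, not recommendations):

amrex.use_gpu_aware_mpi = 1           # force device-memory MPI buffers, overriding detection
amrex.abort_on_out_of_gpu_memory = 1  # abort with a report rather than over-allocate
amrex.the_arena_is_managed = 0        # The_Arena() hands out device memory, not managed

The same assignments may be appended to the command line; with this PR, leaving amrex.use_gpu_aware_mpi unset defers to what the MPI library reports at startup.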
31 changes: 31 additions & 0 deletions Src/Base/AMReX_ParallelDescriptor.cpp
@@ -10,6 +10,9 @@

#ifdef BL_USE_MPI
#include <AMReX_ccse-mpi.H>
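// Open MPI publishes its MPIX_Query_*_support() extensions in <mpi-ext.h>;
// MPICH declares its GPU-support queries in <mpi.h>, so it needs no extra header.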
#if __has_include(<mpi-ext.h>) && defined(OPEN_MPI)
# include <mpi-ext.h>
#endif
#endif

#ifdef AMREX_PMI
@@ -1510,6 +1513,34 @@ ReadAndBcastFile (const std::string& filename, Vector<char>& charBuf,
void
Initialize ()
{
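    // Runtime detection: when the MPI library advertises GPU support through a
    // vendor extension, default use_gpu_aware_mpi to true. The ParmParse query
    // for amrex.use_gpu_aware_mpi further down still lets users override this.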
#if defined(AMREX_USE_CUDA)

#if (defined(OMPI_HAVE_MPI_EXT_CUDA) && OMPI_HAVE_MPI_EXT_CUDA) || (defined(MPICH) && defined(MPIX_GPU_SUPPORT_CUDA))
    use_gpu_aware_mpi = (bool) MPIX_Query_cuda_support();
#endif

#elif defined(AMREX_USE_HIP)

#if defined(OMPI_HAVE_MPI_EXT_ROCM) && OMPI_HAVE_MPI_EXT_ROCM
    use_gpu_aware_mpi = (bool) MPIX_Query_rocm_support();
#elif defined(MPICH) && defined(MPIX_GPU_SUPPORT_HIP)
    int is_supported = 0;
    if (MPIX_GPU_query_support(MPIX_GPU_SUPPORT_HIP, &is_supported) == MPI_SUCCESS) {
        use_gpu_aware_mpi = (bool) is_supported;
    }
#endif

#elif defined(AMREX_USE_SYCL)

#if defined(MPICH) && defined(MPIX_GPU_SUPPORT_ZE)
    int is_supported = 0;
    if (MPIX_GPU_query_support(MPIX_GPU_SUPPORT_ZE, &is_supported) == MPI_SUCCESS) {
        use_gpu_aware_mpi = (bool) is_supported;
    }
#endif

#endif

#ifndef BL_AMRPROF
    ParmParse pp("amrex");
    pp.queryAdd("use_gpu_aware_mpi", use_gpu_aware_mpi);
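Because the detection runs before the ParmParse query shown just above, an explicit amrex.use_gpu_aware_mpi in the inputs file or on the command line still overrides the detected default. For context, the following standalone probe (a hedged sketch, not part of this PR, covering only the Open MPI CUDA case; build with mpicxx) performs the same compile-time and runtime query:

#include <mpi.h>
#if __has_include(<mpi-ext.h>) && defined(OPEN_MPI)
#  include <mpi-ext.h>
#endif
#include <cstdio>

int main (int argc, char** argv)
{
    MPI_Init(&argc, &argv);
#if defined(OMPI_HAVE_MPI_EXT_CUDA) && OMPI_HAVE_MPI_EXT_CUDA
    // Open MPI: returns 1 when the library was built with CUDA support.
    std::printf("CUDA-aware MPI: %d\n", MPIX_Query_cuda_support());
#else
    std::printf("This MPI exposes no CUDA extension at compile time.\n");
#endif
    MPI_Finalize();
    return 0;
}

Within an AMReX application, the resulting setting can be read back through ParallelDescriptor::UseGpuAwareMpi(), an accessor that predates this PR.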