Fix for MVAPICH-PLUS support for MCR-DL #14

Merged 5 commits on Feb 19, 2024
2 changes: 1 addition & 1 deletion .gitignore
@@ -1,4 +1,4 @@
__pycache__
mcr_dl/git_version_info_installed.py
mcr_dl.egg-info/
-mcr_dl/config.yml
+mcr_dl/build_config.yml
2 changes: 1 addition & 1 deletion README.md
@@ -30,7 +30,7 @@ python setup.py install
```

### Update Configurations
-Update mpi, cuda, and nccl paths appropriately in [mcr_dl/config.yml](/mcr_dl/config.yml)
+Update mpi, cuda, and nccl paths appropriately in [mcr_dl/config.yml](/mcr_dl/build_config.yml)

### The MCR-DL Communication Benchmarking Suite

11 changes: 0 additions & 11 deletions mcr_dl/config.yml

This file was deleted.

3 changes: 1 addition & 2 deletions mcr_dl/ops/op_builder/config.py
@@ -22,8 +22,7 @@

 class ConfigPath():
     def __init__(self, file_path = None):
-        self.file_path = os.path.join(os.path.dirname(mcr_dl.__file__), "config.yml") if file_path is None else file_path
-        print(self.file_path)
+        self.file_path = os.path.join(os.path.dirname(mcr_dl.__file__), "build_config.yml") if file_path is None else file_path
         self.config_data = self.load_config()
         self.mpi_path = self.config_data.get("mpi", {}).get("path")
         self.mpi_include = self.config_data.get("mpi", {}).get("include")
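
The change above swaps the default file name and drops a debug `print`. A minimal sketch of the same two ideas (default-path fallback and tolerant nested lookups) may make the behavior easier to see. The helper names `resolve_config_path` and `read_mpi_settings`, the example directory, and the example values are hypothetical and not part of MCR-DL; YAML parsing is left out so the sketch runs with the standard library only.

```python
import os

DEFAULT_CONFIG_NAME = "build_config.yml"  # file name introduced by this PR


def resolve_config_path(package_dir, file_path=None):
    """Return file_path if given, else the default config inside package_dir."""
    if file_path is not None:
        return file_path
    return os.path.join(package_dir, DEFAULT_CONFIG_NAME)


def read_mpi_settings(config_data):
    """Pull the MPI entries out of already-parsed config data.

    Chained .get() calls return None instead of raising KeyError
    when the "mpi" section or one of its keys is missing.
    """
    mpi = config_data.get("mpi", {})
    return mpi.get("path"), mpi.get("include")


# Hypothetical parsed contents; real values depend on the local install.
example = {"mpi": {"path": "/opt/mpi", "include": "/opt/mpi/include"}}

default_path = resolve_config_path("/site-packages/mcr_dl")
custom_path = resolve_config_path("/site-packages/mcr_dl", "custom.yml")
mpi_path, mpi_include = read_mpi_settings(example)
missing = read_mpi_settings({})  # no "mpi" section at all
```

Because a missing section yields `None` rather than an exception, a caller would still need to check the returned values before using them as build paths.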
6 changes: 3 additions & 3 deletions mcr_dl/utils/dist.py
@@ -156,12 +156,12 @@ def set_mpi_dist_environemnt(master_addr = None):
     if master_addr is not None:
         os.environ['MASTER_ADDR'] = master_addr
     local_rank = env2int(
-        ['LOCAL_RANK', 'MPI_LOCALRANKID', 'OMPI_COMM_WORLD_LOCAL_RANK', 'MV2_COMM_WORLD_LOCAL_RANK', 'SLURM_LOCALID'])
+        ['LOCAL_RANK', 'MPI_LOCALRANKID', 'OMPI_COMM_WORLD_LOCAL_RANK', 'MV2_COMM_WORLD_LOCAL_RANK', 'SLURM_LOCALID', 'MVP_COMM_WORLD_LOCAL_RANK'])
     if 'LOCAL_RANK' not in os.environ:
         os.environ['LOCAL_RANK'] = str(local_rank)
-    rank = env2int(['RANK', 'MPI_RANKID', 'OMPI_COMM_WORLD_RANK', 'MV2_COMM_WORLD_RANK', 'SLURM_PROCID'])
+    rank = env2int(['RANK', 'MPI_RANKID', 'OMPI_COMM_WORLD_RANK', 'MV2_COMM_WORLD_RANK', 'SLURM_PROCID', 'MVP_COMM_WORLD_LOCAL_RANK'])
     if 'RANK' not in os.environ:
         os.environ['RANK'] = str(rank)
-    world_size = env2int(['WORLD_SIZE', 'OMPI_COMM_WORLD_SIZE', 'MV2_COMM_WORLD_SIZE', 'SLURM_NPROCS'])
+    world_size = env2int(['WORLD_SIZE', 'OMPI_COMM_WORLD_SIZE', 'MV2_COMM_WORLD_SIZE', 'SLURM_NPROCS', 'MVP_COMM_WORLD_LOCAL_RANK'])
     if 'WORLD_SIZE' not in os.environ:
         os.environ['WORLD_SIZE'] = str(world_size)
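
Review note: the added entries for `rank` and `world_size` both reuse `MVP_COMM_WORLD_LOCAL_RANK`. If none of the earlier variables in those lists is set, the global rank and the world size would both be taken from the local rank. The sketch below shows how a first-match lookup behaves with one variable per quantity. `first_int_env` is a hypothetical stand-in for `env2int`, whose body is not shown in this diff, and the names `MVP_COMM_WORLD_RANK` and `MVP_COMM_WORLD_SIZE` are assumptions made by analogy with the `MV2_` names; they should be checked against the MVAPICH-Plus documentation before use.

```python
import os


def first_int_env(names, environ=None, default=-1):
    """Return the first variable in names that holds a non-negative integer."""
    environ = os.environ if environ is None else environ
    for name in names:
        value = environ.get(name)
        if value is None:
            continue
        try:
            number = int(value)
        except ValueError:
            continue  # skip values that are not integers
        if number >= 0:
            return number
    return default


LOCAL_RANK_VARS = ['LOCAL_RANK', 'MPI_LOCALRANKID', 'OMPI_COMM_WORLD_LOCAL_RANK',
                   'MV2_COMM_WORLD_LOCAL_RANK', 'SLURM_LOCALID',
                   'MVP_COMM_WORLD_LOCAL_RANK']
# Assumed name, by analogy with MV2_COMM_WORLD_RANK.
RANK_VARS = ['RANK', 'MPI_RANKID', 'OMPI_COMM_WORLD_RANK', 'MV2_COMM_WORLD_RANK',
             'SLURM_PROCID', 'MVP_COMM_WORLD_RANK']
# Assumed name, by analogy with MV2_COMM_WORLD_SIZE.
WORLD_SIZE_VARS = ['WORLD_SIZE', 'OMPI_COMM_WORLD_SIZE', 'MV2_COMM_WORLD_SIZE',
                   'SLURM_NPROCS', 'MVP_COMM_WORLD_SIZE']

# Simulated launcher environment, so the sketch does not depend on a real MPI job.
fake_env = {
    'MVP_COMM_WORLD_LOCAL_RANK': '1',
    'MVP_COMM_WORLD_RANK': '5',
    'MVP_COMM_WORLD_SIZE': '8',
}

local_rank = first_int_env(LOCAL_RANK_VARS, fake_env)
rank = first_int_env(RANK_VARS, fake_env)
world_size = first_int_env(WORLD_SIZE_VARS, fake_env)
unset = first_int_env(RANK_VARS, {})  # nothing set, falls back to the default
```

Passing the environment as a dictionary keeps the lookup testable without launching through `mpirun` or `srun`.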