features/branson #462

Open
wants to merge 43 commits into
base: develop
Commits
ec5bd68
branson fix for lassen
Dec 1, 2024
31fb7d0
branson experiment fixes
Dec 1, 2024
e55474b
Branson experiment.py
Dec 2, 2024
3a9d12b
lint
Dec 2, 2024
d6a2140
Merge remote-tracking branch 'origin/develop' into features/branson
Dec 9, 2024
9eceb18
Merge branch 'develop' into features/branson
rfhaque Dec 16, 2024
a5bfa33
Merge branch 'develop' into features/branson
slabasan Dec 18, 2024
57d725b
Merge remote-tracking branch 'origin/develop' into features/branson
Jan 3, 2025
249120f
Merge remote-tracking branch 'origin/develop' into features/branson
Jan 10, 2025
1d20884
Merge remote-tracking branch 'origin/develop' into features/branson
Jan 11, 2025
8e232cf
branson hip cuda implementation
Jan 13, 2025
03f28fd
Fix input file params
Jan 13, 2025
f1d763f
lint
Jan 13, 2025
1783331
venado system specs
rfhaque Jan 15, 2025
2a261f7
Fix NVCC flags
rfhaque Jan 16, 2025
6d0a753
Merge remote-tracking branch 'origin/develop' into systems/venado
Jan 16, 2025
752bae8
Fix caliper version
Jan 16, 2025
e708c94
Add patches
Jan 16, 2025
b6c342d
Fix caliper, slurm issues
rfhaque Jan 16, 2025
4405a94
Merge remote-tracking branch 'origin/develop' into features/branson
rfhaque Jan 16, 2025
49b3fb1
Merge branch 'systems/venado' into features/branson
rfhaque Jan 16, 2025
c5bd894
Merge remote-tracking branch 'origin/develop' into features/branson
Jan 16, 2025
41295fc
Fix Cmake flags
Jan 16, 2025
8a4e9bd
Merge branch 'features/branson' of github.com:LLNL/benchpark into fea…
rfhaque Jan 16, 2025
682c049
Fixes
rfhaque Jan 16, 2025
cb8a309
Remove saxpy
Jan 16, 2025
d4d2f12
lint, license
Jan 16, 2025
eb64782
Pull from venado specs
Jan 16, 2025
f6cdf49
venado specs
Jan 17, 2025
1084379
gtl
rfhaque Jan 17, 2025
c407bfc
Merge remote-tracking branch 'origin/develop' into features/branson
Jan 21, 2025
ae93d54
el cap compiler dependency
Jan 21, 2025
c253374
Merge remote-tracking branch 'origin/develop' into features/branson
Jan 22, 2025
613133f
Update venado specs
Jan 22, 2025
1829125
Merge remote-tracking branch 'origin/develop' into features/branson
rfhaque Jan 27, 2025
8cab26f
Merge remote-tracking branch 'origin/develop' into features/branson
Jan 30, 2025
3092aae
Merge remote-tracking branch 'origin/develop' into features/branson
rfhaque Feb 5, 2025
e5bfab8
Merge remote-tracking branch 'origin/develop' into features/branson
Feb 6, 2025
4374ce2
Fix caliper import
Feb 6, 2025
412f0d7
Update CMakeLists.txt
Feb 7, 2025
ebd48c2
Add metis and viz variants
Feb 7, 2025
84188d0
Add dry runs
Feb 7, 2025
dd241a7
Merge remote-tracking branch 'origin/develop' into features/branson
rfhaque Feb 20, 2025
36 changes: 36 additions & 0 deletions .github/workflows/run.yml
@@ -844,3 +844,39 @@ jobs:
          benchmark_spec: ad
          system_name: tioga
          system_spec: llnl-elcapitan rocm=6.2.4 compiler=rocmcc

      - name: branson/openmp caliper=mpi,time ruby llnl-cluster cluster=ruby compiler=intel
        uses: ./.github/actions/dynamic-dry-run
        with:
          benchmark_name: branson
          benchmark_mode: openmp
          benchmark_spec: branson+openmp caliper=mpi,time
          system_name: ruby
          system_spec: llnl-cluster cluster=ruby compiler=intel

      - name: branson/cuda caliper=cuda,time lassen llnl-sierra cuda=11-8-0 compiler=clang-ibm
        uses: ./.github/actions/dynamic-dry-run
        with:
          benchmark_name: branson
          benchmark_mode: cuda
          benchmark_spec: branson+cuda caliper=cuda,time
          system_name: lassen
          system_spec: llnl-sierra cuda=11-8-0 compiler=clang-ibm

      - name: branson/cuda caliper=cuda,time venado lanl-venado cuda=12.5 compiler=cce +gtl
        uses: ./.github/actions/dynamic-dry-run
        with:
          benchmark_name: branson
          benchmark_mode: cuda
          benchmark_spec: branson+cuda caliper=cuda,time
          system_name: venado
          system_spec: lanl-venado cuda=12.5 compiler=cce +gtl

      - name: branson/rocm caliper=mpi,time tioga llnl-elcapitan rocm=6.2.4 compiler=cce +gtl
        uses: ./.github/actions/dynamic-dry-run
        with:
          benchmark_name: branson
          benchmark_mode: rocm
          benchmark_spec: branson+rocm caliper=mpi,time
          system_name: tioga
          system_spec: llnl-elcapitan rocm=6.2.4 compiler=cce +gtl
125 changes: 125 additions & 0 deletions experiments/branson/experiment.py
@@ -0,0 +1,125 @@
# Copyright 2023 Lawrence Livermore National Security, LLC and other
# Benchpark Project Developers. See the top-level COPYRIGHT file for details.
#
# SPDX-License-Identifier: Apache-2.0

from benchpark.error import BenchparkError
from benchpark.directives import variant
from benchpark.experiment import Experiment
from benchpark.openmp import OpenMPExperiment
from benchpark.cuda import CudaExperiment
from benchpark.rocm import ROCmExperiment
from benchpark.scaling import StrongScaling
from benchpark.scaling import WeakScaling
from benchpark.caliper import Caliper


class Branson(
    Experiment,
    OpenMPExperiment,
    CudaExperiment,
    ROCmExperiment,
    StrongScaling,
    WeakScaling,
    Caliper,
):
    variant(
        "workload",
        default="branson",
        description="workload name",
    )

    variant(
        "version",
        default="develop",
        description="app version",
    )

    variant(
        "n_groups",
        default="30",
        values=int,
        description="Number of groups",
    )

    def compute_applications_section(self):
        # TODO: Replace with conflicts clause
        scaling_modes = {
            "strong": self.spec.satisfies("+strong"),
            "weak": self.spec.satisfies("+weak"),
            "single_node": self.spec.satisfies("+single_node"),
        }

        # Exactly one scaling mode must be enabled (neither zero nor several)
        scaling_mode_enabled = [key for key, value in scaling_modes.items() if value]
        if len(scaling_mode_enabled) != 1:
            raise BenchparkError(
                f"Only one type of scaling per experiment is allowed for application package {self.name}"
            )

        # Base number of nodes
        num_nodes = {"n_nodes": 1}

        # Base number of particles (written to <photons> in the input file)
        num_particles = {"num_particles": 850000000}

        if self.spec.satisfies("+single_node"):
            for pk, pv in num_nodes.items():
                self.add_experiment_variable(pk, pv, True)
            for nk, nv in num_particles.items():
                self.add_experiment_variable(nk, nv, True)
        elif self.spec.satisfies("+strong"):
            # Strong scaling: scale the node count, keep the particle count fixed
            scaled_variables = self.generate_strong_scaling_params(
                {tuple(num_nodes.keys()): list(num_nodes.values())},
                int(self.spec.variants["scaling-factor"][0]),
                int(self.spec.variants["scaling-iterations"][0]),
            )
            for pk, pv in scaled_variables.items():
                self.add_experiment_variable(pk, pv, True)
            for nk, nv in num_particles.items():
                self.add_experiment_variable(nk, nv, True)
        elif self.spec.satisfies("+weak"):
            # Weak scaling: scale the node count and the particle count together
            scaled_variables = self.generate_weak_scaling_params(
                {tuple(num_nodes.keys()): list(num_nodes.values())},
                {tuple(num_particles.keys()): list(num_particles.values())},
                int(self.spec.variants["scaling-factor"][0]),
                int(self.spec.variants["scaling-iterations"][0]),
            )
            for k, v in scaled_variables.items():
                self.add_experiment_variable(k, v, True)

        self.add_experiment_variable(
            "use_gpu",
            (
                "TRUE"
                if self.spec.satisfies("+cuda") or self.spec.satisfies("+rocm")
                else "FALSE"
            ),
        )

        self.add_experiment_variable("n_ranks", "{n_nodes}*{sys_cores_per_node}", True)

    def compute_spack_section(self):
        # get package version
        app_version = self.spec.variants["version"][0]

        # get system config options
        # TODO: Get compiler/mpi/package handles directly from system.py
        system_specs = {}
        system_specs["compiler"] = "default-compiler"
        system_specs["mpi"] = "default-mpi"
        if self.spec.satisfies("+cuda"):
            system_specs["cuda_version"] = "{default_cuda_version}"
            system_specs["cuda_arch"] = "{cuda_arch}"
        if self.spec.satisfies("+rocm"):
            system_specs["rocm_arch"] = "{rocm_arch}"

        # set package spack specs
        self.add_spack_spec(system_specs["mpi"])

        self.add_spack_spec(
            self.name,
            [
                f"branson@{app_version} +metis n_groups={self.spec.variants['n_groups'][0]} ",
                system_specs["compiler"],
            ],
        )
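The scaling-mode check in compute_applications_section can be read in isolation as "exactly one flag is set". The following is a minimal sketch of that rule, not Benchpark code: `pick_scaling_mode` is a hypothetical helper, and it raises `ValueError` where the experiment raises `BenchparkError`.

```python
def pick_scaling_mode(flags):
    """Return the single enabled scaling mode.

    `flags` maps a mode name to a bool. Hypothetical helper that only
    mirrors the len(...) != 1 check in compute_applications_section.
    """
    enabled = [name for name, is_on in flags.items() if is_on]
    # Zero enabled modes and several enabled modes are both rejected.
    if len(enabled) != 1:
        raise ValueError(f"expected exactly one scaling mode, got {enabled}")
    return enabled[0]


mode = pick_scaling_mode({"strong": False, "weak": True, "single_node": False})
print(mode)
```

Note that the check rejects a spec with no scaling mode as well as a spec with more than one, which the error message in the experiment does not spell out.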
51 changes: 51 additions & 0 deletions legacy/experiments/branson/mpi-only/ramble.yaml
@@ -0,0 +1,51 @@
ramble:
  applications:
    branson:
      workloads:
        branson:
          experiments:
            branson_branson_weak_scaling_caliper_time_mpi_{n_nodes}_{num_particles}_{n_ranks}:
              exclude: {}
              matrix: []
              variables:
                n_nodes:
                - 1
                - 2
                - 4
                - 8
                n_ranks: '{n_nodes}*{sys_cores_per_node}'
                num_particles:
                - 850000000
                - 1700000000
                - 3400000000
                - 6800000000
              variants:
                package_manager: spack
              zips: {}
  config:
    deprecated: true
    spack_flags:
      concretize: -U -f
      install: --add --keep-stage
  include:
  - ./configs
  modifiers:
  - name: allocation
  - mode: mpi
    name: caliper
  - mode: time
    name: caliper
  software:
    environments:
      branson:
        packages:
        - caliper
        - default-mpi
        - branson
    packages:
      branson:
        compiler: default-compiler
        pkg_spec: branson@develop+caliper
      caliper:
        compiler: default-compiler
        pkg_spec: caliper@master+adiak+mpi~libunwind~libdw~papi
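The n_nodes and num_particles lists in this ramble.yaml are consistent with weak scaling from the base values in experiment.py (1 node, 850000000 particles). A minimal sketch of that relationship follows, assuming a scaling factor of 2 and 4 iterations. `weak_scaling_sketch` is a hypothetical function written for illustration; it is not Benchpark's `generate_weak_scaling_params`, whose internals are not shown in this PR.

```python
def weak_scaling_sketch(n_nodes, num_particles, factor=2, iterations=4):
    """Scale node count and particle count together.

    Hypothetical illustration: both values are multiplied by `factor`
    at each step, so the work per node stays constant.
    """
    nodes, particles = [], []
    for step in range(iterations):
        scale = factor**step
        nodes.append(n_nodes * scale)
        particles.append(num_particles * scale)
    return {"n_nodes": nodes, "num_particles": particles}


params = weak_scaling_sketch(1, 850_000_000)
print(params["n_nodes"])
print(params["num_particles"])
```

Under these assumptions the two lists match the values in the YAML above, and the ratio of particles to nodes is the same at every step.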
49 changes: 0 additions & 49 deletions legacy/experiments/branson/openmp/ramble.yaml

This file was deleted.

5 changes: 3 additions & 2 deletions repo/branson/application.py
@@ -19,14 +19,15 @@ class Branson(ExecutableApplication):
     executable('setup_experiment',
                template=[
                    'cp {branson}/inputs/* {experiment_run_dir}/.',
-                   'sed -i "s|<photons>250000000</photons>|<photons>{num_particles}</photons>|g" {experiment_run_dir}/{input_file}'
+                   'sed -i "s|<photons>[0-9]*</photons>|<photons>{num_particles}</photons>|g" {experiment_run_dir}/{input_file}',
+                   'sed -i "s|<use_gpu_transporter>.*</use_gpu_transporter>|<use_gpu_transporter>{use_gpu}</use_gpu_transporter>|g" {experiment_run_dir}/{input_file}'
                ])

     executable('p', '{branson}/bin/BRANSON {experiment_run_dir}/{input_file}', use_mpi=True)

     workload('branson', executables=['setup_experiment','p'])

-    workload_variable('input_file', default='3D_hohlraum_multi_node.xml',
+    workload_variable('input_file', default='3D_hohlraum_single_node.xml',
                       description='input file name',
                       workloads=['branson'])
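The two sed substitutions in setup_experiment rewrite the photon count and the GPU transporter flag regardless of what the input file contained before, where the old pattern only matched the literal value 250000000. A rough Python equivalent, using the same patterns with `re.sub`, is sketched below. `patch_input` and the sample XML fragment are hypothetical; only the element names and patterns come from the diff.

```python
import re


def patch_input(xml_text, num_particles, use_gpu):
    """Apply the same two substitutions as the sed commands (hypothetical helper)."""
    # Replace any numeric photon count, not just one hard-coded value.
    xml_text = re.sub(
        r"<photons>[0-9]*</photons>",
        f"<photons>{num_particles}</photons>",
        xml_text,
    )
    # Replace whatever the GPU transporter flag was set to.
    xml_text = re.sub(
        r"<use_gpu_transporter>.*</use_gpu_transporter>",
        f"<use_gpu_transporter>{use_gpu}</use_gpu_transporter>",
        xml_text,
    )
    return xml_text


# Made-up input fragment; the real input files are not shown in this PR.
sample = (
    "<photons>100000</photons>\n"
    "<use_gpu_transporter>FALSE</use_gpu_transporter>\n"
)
patched = patch_input(sample, 850000000, "TRUE")
print(patched)
```

As with sed, `.` does not match a newline here, so each substitution stays within a single line of the input.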
