Update summit modules #21

Merged: 14 commits, Oct 9, 2023

nkoukpaizan
Collaborator

This PR updates the Summit modules.

@nkoukpaizan nkoukpaizan changed the title from "Nicholson/update summit modules" to "Update summit modules" Sep 29, 2023
@nkoukpaizan nkoukpaizan self-assigned this Sep 29, 2023
@cameronrutherford
Contributor

Even with changes removing the tests and logging variant, we are still blocked because the non-mpi build needs to import os.

@nkoukpaizan if you want this merged ASAP, we can disable the ~mpi build for now and re-add it once the Spack PR is merged.

Contributor

@cameronrutherford cameronrutherford left a comment


#18 was expanded to capture the remaining failing Spack CPU build.

.github/workflows/spack_cpu_build.yaml (outdated, resolved)
@cameronrutherford
Contributor

@ryandanehy or @nkoukpaizan please give this a quick review as GitHub is requiring an additional reviewer (since I made the latest changes).

@cameronrutherford
Contributor

We need to revisit this, but it seems the Python CMake configuration errors out when Python is not found. It is a simple problem, but we need to make sure the solution is robust.

@cameronrutherford
Contributor

The Spack PR was merged, so I think the pipelines will pass once we update the submodule to include these changes.

@nkoukpaizan nkoukpaizan force-pushed the nicholson/update-summit-modules branch from 888616a to 3b1772a October 6, 2023 13:16
@nkoukpaizan
Collaborator Author

nkoukpaizan commented Oct 6, 2023

The Python build via Spack is failing. I think we need to update the pybind11 submodule. This closed issue in their repository seems relevant: pybind/pybind11#4464

@cameronrutherford
Contributor

@nkoukpaizan I am forever thankful that you were able to reproduce #23 :)

cc @jaelynlitz as she was going down a rabbit hole of updating the compiler to catch this bug, but it seems like we don't quite need to do that. @jaelynlitz please only modify compilers on CI platforms if you need to do so in order to upgrade C++ standard.

@nkoukpaizan I agree with your assessment that we need to change our pybind submodule to fix this. We need to also move pybind dep into spack eventually, but this is an easy thing to fix for the moment.

@cameronrutherford
Contributor

Since this fixes CI, I think this also closes #23

@jaelynlitz
Contributor

What exactly was the fix? Updating pybind? I also see a magma upgrade from 2.6.2 to 2.7.1?

@cameronrutherford
Contributor

The fix for the Python error was the pybind update. The Magma upgrade might cause testing errors, but these are Summit modules that are not run in CI, so that is harder to track.

@cameronrutherford cameronrutherford merged commit 13ef94e into develop Oct 9, 2023
@cameronrutherford cameronrutherford linked an issue Oct 9, 2023 that may be closed by this pull request
@cameronrutherford cameronrutherford added this to the 1.6.0 Release milestone Oct 11, 2023
bjpalmer pushed a commit that referenced this pull request Oct 16, 2023
* Minor fix for Summit build system

* Fix '--nnodes'-->'-nodes' on Summit

* Attempt to update Summit modules

* Reinstall Ginkgo and python dependencies on Summit

* Enforce [email protected] on Summit

* Specify RelWithDebInfo for ExaGO and HiOp on Summit

* Update Spack

* Relax constraints on exago dependencies on Summit

* Add constraints on HiOp in the spack config. Part of the ExaGO package was conflicting with building HiOp in release mode.

* Cleaner module install on Summit

* Update spack_cpu_build.yaml to work without fork

* Update .github/workflows/spack_cpu_build.yaml

* Update Spack

* Try updating pybind11 submodule to see if it fixes errors with exago+python builds

---------

Co-authored-by: Cameron Rutherford <[email protected]>
cameronrutherford pushed a commit that referenced this pull request Oct 25, 2023
bjpalmer pushed a commit that referenced this pull request Oct 30, 2023
bjpalmer pushed a commit that referenced this pull request Nov 9, 2023
bjpalmer pushed a commit that referenced this pull request Nov 20, 2023
cameronrutherford pushed a commit that referenced this pull request Nov 28, 2023
cameronrutherford added a commit that referenced this pull request Nov 29, 2023
* only print error messages if mpi rank is 0

* add rank check for num ranks

* have non-zero ranks exit gracefully when throwing exago error

* pflow functionality tests fully mpi aware

* add logging rank variable

* Apply pre-commmit fixes

* Deleted unused header file.

* Brought SCOPFLOW test driver in line with PFLOW driver.

* Applied additional changes to selfcheck.cpp file for PFLOW, SOPFLOW and SCOPFLOW
to adapt tests for running on multiple MPI ranks.

* Apply pre-commmit fixes

* Initialized some variables that were not getting properly set for serial
test case.

* Update summit modules (#21)

* Minor fix for Summit build system

* Fix '--nnodes'-->'-nodes' on Summit

* Attempt to update Summit modules

* Reinstall Ginkgo and python dependencies on Summit

* Enforce [email protected] on Summit

* Specify RelWithDebInfo for ExaGO and HiOp on Summit

* Update Spack

* Relax constraints on exago dependencies on Summit

* Add constraints on HiOp in the spack config. Part of the ExaGO package was conflicting with building HiOp in release mode.

* Cleaner module install on Summit

* Update spack_cpu_build.yaml to work without fork

* Update .github/workflows/spack_cpu_build.yaml

* Update Spack

* Try updating pybind11 submodule to see if it fixes errors with exago+python builds

---------

Co-authored-by: Cameron Rutherford <[email protected]>

* OPFLOW with RAJA/HIOP sparse GPU solvers (#8)

* OPFLOW: initial implementation of RAJA/HiOp sparse GPU-based solver

WIP - HIOP Sparse solver with GPU model

OPFLOW: Started work on support for HIOP sparse solver interface for GPUs.

Added a copy of hiop sparse solver interface.

OPFLOW: Added model skeleton for GPU sparse version (copying from pbpolrajahiop)

Fixed build

Did some copy paste to add a test for HIOPSPARSE. This test is not actually
functional yet.

Started updating the hiopsparse model and solver code.

More work on updating the solver and model

Added scalar and vector unit tests for model to be used with HIOP sparse solver on GPU

Apply cmake lint

Fix unit tests.

Set the size of array when using Umpire memset.

Code formatting

Some minor changes to get PBPOLRAJAHIOPSPARSE model code to compile

Separate BUS/LINE/GEN/.../Param structs into reusable module

Minor edit

Rename files

Fix typo

Use BUS/LINE/GEN/.../Param structs in Raja HiOp Sparse model (compiles)

Updating HIOP sparse solver GPU API

Completed bounds kernels

Completed scalar and vector functions

Jacobian and Hessian for sparse model (CPU --> GPU copy)

Use correct array lengths in Eq. Jacobian

Fix bug in Jacobian.

Fix unused variable/parameter errors

OPFLOW: rework solution callback for RAJA/HIOP GPU-based solver

Formatting changes

* Add unit test for RAJA/HiOp Sparse GPU model (9-bus only)

* Apply pre-commmit fixes

* Add test for 200-bus case

* Apply pre-commmit fixes

---------

Co-authored-by: Abhyankar, Shrirang G <[email protected]>

* Upgrade ascent build system and use `[email protected]` on CI platforms (#20)

* Boilerplate scripts to install modules on Ascent via submodule Spack

* Fix '--nnodes'-->'-nodes' on Ascent

* Improve Ascent env.sh

* [email protected] on Ascent

* Apply pre-commmit fixes

* Relax constraints on exago dependencies on Ascent and build ~python

* concretizer: reuse was causing several packages to be duplicated in the environment. Require clean concretizations on  Ascent.

* Minor module update on Ascent

* Add LAPACK_LIBRARIES to Ascent base script. CMAKE was picking up python's openblas otherwise.

* Error with unzip.

* Apply pre-commmit fixes

* Add working build on ascent.

* Add working gcc11.2.0 spack spec.

* Add Ascent Spack pipeline. [ascent-rebuild]

* Update gcc version to 11.2.0 in base.sh [skip-ci]

* Fix stages of Ascent pipeline [ascent-rebuild]

* Add working ascent spack build.

* Add hiop@develop force rebuild to PNNL CI [ascent-rebuild] [newell-rebuild] [deception-rebuild] [incline-rebuild].

* Update Ascent spack built tcl modules

* Only test ascent on tcl module update [ci-skip]

* Update base.sh to disable python on ascent [skip ci]

* Remove LAPACK_LIBRARIES spec [ascent-test]

* Update ascent.gitlab-ci.yml to fix needs/dependencies [ascent-test]

* Update deception spack built tcl modules - [deception-test]

* Try again with Python, but have Spack build it instead of using the external module [ascent-rebuild]

* Force python rebuild on ascent and use [email protected] on incline [ascent-rebuild] [newell-rebuild] [incline-rebuild]

* Pin [email protected] on all CI platforms [decetpion-rebuild] [ascent-rebuild] [newell-rebuild] [incline-rebuild]

* Fix false positive/negative in Ascent pipelines [deception-rebuild] [ascent-test]

* Update incline spack built tcl modules - [incline-test]

* Update newell spack built tcl modules - [newell-test]

* Fix HiOp spec on Ascent [ascent-rebuild].

* Update deception spack built tcl modules - [deception-test]

* Update CPU Spack build with issue for each failing build [ci skip]

* Update Ascent spack built tcl modules [ascent-test]

* Add 1.0.0 dep into CHANGELOG.

* Add ascent-skip to CI to get tests passing [ascent-test]

---------

Co-authored-by: nkoukpaizan <[email protected]>
Co-authored-by: Cameron Rutherford <[email protected]>
Co-authored-by: cameronrutherford <[email protected]>
Co-authored-by: spack-auto-module <[email protected]>

* Add Spack CPU build with `exago+hiop+raja~ipopt ^hiop+raja~sparse` (#41)

* Add CPU build with hiop+sparse and exago~ipopt+hiop+raja

* Update .github/workflows/spack_cpu_build.yaml

* `+mpi` to `+raja` CPU build

* Add HIOPRAJASPARSE model if sparse and raja enabled

* Fix other HIOPRAJASPARSE ifdef

* pflow functionality tests fully mpi aware

* add logging rank variable

* Apply pre-commmit fixes

* Deleted unused header file.

* Brought SCOPFLOW test driver in line with PFLOW driver.

* Applied additional changes to selfcheck.cpp file for PFLOW, SOPFLOW and SCOPFLOW
to adapt tests for running on multiple MPI ranks.

* Apply pre-commmit fixes

* Apply pre-commmit fixes

* Updated third party libraries

* Set more default values in selfcheck.cpp to get rid of uninitialized variables
errors in Valgrind and modified a few test values so that tests pass.

* Apply pre-commmit fixes

* Fixed up some preprocessor glitches that got introduced in the rebase.

* Modified versions on pybind11 and spack to match develop.

* Fix remaining issues in merge request.

* Apply pre-commmit fixes

* Fixed preprocessor directives to match develop branch.

* Modified constructor of FunctionalityTestContext to get rid of a bunch of code
checking MPI calls.

* Apply pre-commmit fixes

* Remove logging ranks variable.

---------

Co-authored-by: Jaelyn Litzinger <[email protected]>
Co-authored-by: Bruce J Palmer <[email protected]>
Co-authored-by: Nicholson Koukpaizan <[email protected]>
Co-authored-by: Cameron Rutherford <[email protected]>
Co-authored-by: Bill <[email protected]>
Co-authored-by: Abhyankar, Shrirang G <[email protected]>
Co-authored-by: nkoukpaizan <[email protected]>
Co-authored-by: cameronrutherford <[email protected]>
Co-authored-by: spack-auto-module <[email protected]>
Co-authored-by: Bruce J Palmer <[email protected]>
@nkoukpaizan nkoukpaizan deleted the nicholson/update-summit-modules branch September 24, 2024 16:35
Development

Successfully merging this pull request may close these issues.

exago@develop+python%[email protected] fails in xSDK CI
3 participants