Commit
Merge remote-tracking branch 'origin' into 16-setup-benchmarking-infrastructure
Showing 12 changed files with 377 additions and 22 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,30 +4,85 @@ | |
|
||
## Installation | ||
|
||
### Setup python virtual environment | ||
|
||
|
||
### Development installation | ||
|
||
```bash | ||
export GRIDTOOLS_JL_PATH="..." | ||
export GT4PY_PATH="..." | ||
# create python virtual environment | |
# make sure to use a python version that is compatible with GT4Py | ||
python -m venv .venv | ||
# activate virtual env | ||
# this command has to be run every time GridTools.jl is used | |
source .venv/bin/activate | ||
# clone gt4py | ||
git clone --branch fix_python_interp_path_in_cmake git@github.com:tehrengruber/gt4py.git | |
#git clone git@github.com:GridTools/gt4py.git $GT4PY_PATH | |
pip install -r $GT4PY_PATH/requirements-dev.txt | ||
pip install -e $GT4PY_PATH | ||
# | ||
``` | ||
### Development Installation | ||
|
||
As of August 2024, the recommended Python version for development is **3.10.14**. | ||
|
||
**Important Note:** The Python virtual environment must be created in the directory specified by `GRIDTOOLS_JL_PATH/.venv`. Creating the environment in any other location will result in errors. | ||
|
||
#### Steps to Set Up the Development Environment | ||
|
||
1. **Set Environment Variables:** | ||
Set the environment variables for `GRIDTOOLS_JL_PATH` and `GT4PY_PATH`. Replace `...` with the appropriate paths on your system. | ||
|
||
```bash | ||
export GRIDTOOLS_JL_PATH="..." | ||
export GT4PY_PATH="..." | ||
``` | ||
|
||
2. **Create a Python Virtual Environment:** | ||
Navigate to the `GRIDTOOLS_JL_PATH` directory and create a Python virtual environment named `.venv`. Ensure you are using a compatible Python version (i.e. 3.10.14). | ||
|
||
```bash | ||
cd $GRIDTOOLS_JL_PATH | ||
python3.10 -m venv .venv | ||
``` | ||
|
||
3. **Activate the Virtual Environment:** | ||
Activate the virtual environment. You need to run this command every time you work with GridTools.jl. | ||
|
||
```bash | ||
source .venv/bin/activate | ||
``` | ||
|
||
4. **Clone the GT4Py Repository:** | ||
Clone the GT4Py repository. You can use the specific branch mentioned or the main repository as needed. | ||
|
||
```bash | ||
git clone --branch fix_python_interp_path_in_cmake git@github.com:tehrengruber/gt4py.git | |
# Alternatively, you can clone the main repository: | |
# git clone git@github.com:GridTools/gt4py.git $GT4PY_PATH | |
``` | ||
|
||
5. **Install Required Packages:** | ||
Install the development requirements and the GT4Py package in editable mode. | ||
|
||
```bash | ||
pip install -r $GT4PY_PATH/requirements-dev.txt | ||
pip install -e $GT4PY_PATH | ||
``` | ||
|
||
6. **Build PyCall:** | ||
With the virtual environment activated, run Julia from the `GridTools.jl` folder with the command `julia --project=.` and then build using the following commands: | |
|
||
```julia | ||
using Pkg | ||
Pkg.build() | ||
``` | ||
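The location requirement stated in the steps above can be checked before building PyCall. The following is a minimal sketch, assuming the environment was activated with `source .venv/bin/activate`, which sets `VIRTUAL_ENV`. The function name `venv_is_expected` is hypothetical and not part of GridTools.jl.

```bash
#!/bin/bash
# Hypothetical helper: succeed only if the active virtual environment
# is the one located at <gridtools_path>/.venv.
# The comparison is purely textual; symlinked paths are not resolved.
venv_is_expected() {
    local gridtools_path="$1"   # e.g. "$GRIDTOOLS_JL_PATH"
    local active_venv="$2"      # e.g. "$VIRTUAL_ENV" (set by activate)
    [ -n "$active_venv" ] && [ "$active_venv" = "${gridtools_path%/}/.venv" ]
}

# Usage: warn before running `julia --project=.` and `Pkg.build()`.
if ! venv_is_expected "${GRIDTOOLS_JL_PATH:-}" "${VIRTUAL_ENV:-}"; then
    echo "warning: expected the virtual environment at \$GRIDTOOLS_JL_PATH/.venv" >&2
fi
```

Running the check in the same shell that later starts Julia also helps catch the `PyObject_Vectorcall` situation described under Troubleshooting, where PyCall was built in one environment and used in another.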
|
||
## Troubleshooting | ||
|
||
### Common Build Errors | ||
|
||
__undefined symbol: PyObject_Vectorcall__ | ||
- Make sure to run everything in the same environment that you built `PyCall` with. A common reason for this error is that PyCall was built in a virtual environment and that environment was not activated when executing stencils. | |
|
||
__CMake Error: Could NOT find Boost__ | ||
- GridTools.jl requires the Boost library version 1.65.1 or higher. If Boost is not installed, you can install it via your system's package manager. For example, on Ubuntu, use: | ||
```bash | ||
sudo apt-get install libboost-all-dev | ||
``` | ||
Make sure the installed version meets the minimum required version of 1.65.1. If CMake still cannot find Boost after installation, you may need to manually specify the Boost installation path in the CMake command using the `-DBOOST_ROOT=/path/to/boost` option, where `/path/to/boost` is the directory where Boost is installed. | ||
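The minimum-version requirement can be compared mechanically. This is a sketch only: it assumes GNU `sort` with the `-V` (version sort) option is available, and the installed version string is supplied by hand rather than detected. The name `version_ge` and the sample value `1.74.0` are illustrative.

```bash
#!/bin/bash
# Succeed if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical usage: the installed version would come from your system,
# e.g. the package manager; here it is passed in by hand.
installed_boost="1.74.0"
if version_ge "$installed_boost" "1.65.1"; then
    echo "Boost $installed_boost satisfies the 1.65.1 minimum"
else
    echo "Boost $installed_boost is too old"
fi
```

A version sort is used instead of a plain string comparison so that a release such as `1.100.0` is ordered after `1.65.1`.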
|
||
__Supporting GPU Backend with CUDA__ | ||
|
||
- To enable GPU acceleration and utilize the GPU backend features of this project, it is essential to have the NVIDIA CUDA Toolkit installed. CUDA provides the necessary compiler (nvcc) and libraries for developing and running applications that leverage NVIDIA GPUs. | ||
|
||
Make sure to run everything in the same environment that you have built `PyCall` with. A common reason is that you built PyCall in a virtual environment and then didn't load it when executing stencils. | |
- If the `LD_LIBRARY_PATH` environment variable is set in your current environment, it is recommended to unset it. This avoids conflicts between the paths managed by CUDA.jl and those already present on the system. | ||
```julia | ||
julia> using CUDA | ||
┌ Warning: CUDA runtime library `...` was loaded from a system path, `/usr/local/cuda/lib64/...`. | ||
│ | ||
│ This may cause errors. Ensure that you have not set the LD_LIBRARY_PATH | ||
│ environment variable, or that it does not contain paths to CUDA libraries. | ||
``` |
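Before starting Julia, the shell environment can be inspected for the situation that the warning above describes. The sketch below is a simple heuristic that looks for the substring "cuda" in a colon-separated path list. It is not what CUDA.jl itself does, and the function name `path_list_mentions_cuda` is hypothetical.

```bash
#!/bin/bash
# Succeed if the given colon-separated path list contains the substring
# "cuda" in any letter case. Heuristic only; it does not inspect the
# libraries that actually live in those directories.
path_list_mentions_cuda() {
    case "$1" in
        *[Cc][Uu][Dd][Aa]*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_list_mentions_cuda "${LD_LIBRARY_PATH:-}"; then
    echo "LD_LIBRARY_PATH mentions CUDA paths; consider: unset LD_LIBRARY_PATH"
fi
```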
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,89 @@ | ||
stages: | ||
- build_base_stage0_image | ||
- build_base_stage1_image | ||
- build_base_stage2_image | ||
- build_image | ||
- ci_jobs | ||
|
||
variables: | ||
GPU_ENABLED: true | ||
CUDA_DRIVER_VERSION: "470.57.02" | ||
PROJECT_NAME: gridtools_jl | ||
PERSIST_IMAGE_NAME: $CSCS_REGISTRY_PATH/pasc_kilos/${CONTAINER_RUNNER}/${PROJECT_NAME}_image:$CI_COMMIT_SHORT_SHA | ||
CPU_ARCH: "x86_64_v3" # use a generic architecture here instead of linux-sles15-haswell, such that it can build on zen2 | ||
|
||
include: | ||
- remote: 'https://gitlab.com/cscs-ci/recipes/-/raw/master/templates/v2/.ci-ext.yml' | ||
|
||
.gt-container-builder: | ||
extends: .container-builder | ||
timeout: 2h | ||
before_script: | ||
- DOCKER_TAG=`eval cat $WATCH_FILECHANGES | sha256sum | head -c 16` | ||
- | | ||
if [[ "$CI_COMMIT_MESSAGE" =~ "Trigger container rebuild $ENV_VAR_NAME" ]]; then | ||
echo "Rebuild triggered." | ||
export CSCS_REBUILD_POLICY="always" | ||
fi | ||
- export PERSIST_IMAGE_NAME=$PERSIST_IMAGE_NAME:$DOCKER_TAG | ||
- echo "$ENV_VAR_NAME=$PERSIST_IMAGE_NAME" > build.env | ||
artifacts: | ||
reports: | ||
dotenv: build.env | ||
variables: | ||
# the variables below MUST be set to a sane value. They are mentioned here, to see | ||
# which variables should be set. | ||
DOCKERFILE: ci/docker/Dockerfile.base # overwrite with the real path of the Dockerfile | ||
PERSIST_IMAGE_NAME: $CSCS_REGISTRY_PATH/base/my_base_image # Important: No version-tag | ||
WATCH_FILECHANGES: 'ci/docker/Dockerfile.base "path/to/another/file with whitespaces.txt"' | ||
ENV_VAR_NAME: BASE_IMAGE | ||
|
||
build_base_stage0_image_job: | ||
stage: build_base_stage0_image | ||
extends: .gt-container-builder | ||
variables: | ||
DOCKERFILE: docker/base/Dockerfile | ||
DOCKER_BUILD_ARGS: '["INSTALL_CUDA_DRIVER=$GPU_ENABLED", "CUDA_DRIVER_VERSION=$CUDA_DRIVER_VERSION", "CPU_ARCH=$CPU_ARCH"]' | ||
PERSIST_IMAGE_NAME: $CSCS_REGISTRY_PATH/gridtools/${CONTAINER_RUNNER}/gridtools_jl_base_image | ||
WATCH_FILECHANGES: 'docker/base/Dockerfile' | ||
ENV_VAR_NAME: BASE_IMAGE_STAGE0 | ||
|
||
build_base_stage1_image_job: | ||
stage: build_base_stage1_image | ||
extends: .gt-container-builder | ||
variables: | ||
DOCKERFILE: docker/base_spack_deps/Dockerfile | ||
DOCKER_BUILD_ARGS: '["BASE_IMAGE=$BASE_IMAGE_STAGE0", "PROJECT_NAME=$PROJECT_NAME", "SPACK_ENV_FILE=spack-${CONTAINER_RUNNER}.yaml"]' | ||
PERSIST_IMAGE_NAME: $CSCS_REGISTRY_PATH/gridtools/${CONTAINER_RUNNER}/${PROJECT_NAME}_base_stage1_image | ||
WATCH_FILECHANGES: 'docker/base/Dockerfile docker/base_spack_deps/Dockerfile docker/base_spack_deps/spack-daint-p100.yaml' # TODO: inherit from stage0 | ||
ENV_VAR_NAME: BASE_IMAGE_STAGE1 | ||
|
||
build_base_stage2_image_job: | ||
stage: build_base_stage2_image | ||
extends: .gt-container-builder | ||
variables: | ||
DOCKERFILE: docker/base_deps/Dockerfile | ||
DOCKER_BUILD_ARGS: '["BASE_IMAGE=$BASE_IMAGE_STAGE1", "PROJECT_NAME=$PROJECT_NAME"]' | ||
PERSIST_IMAGE_NAME: $CSCS_REGISTRY_PATH/gridtools/${CONTAINER_RUNNER}/${PROJECT_NAME}_base_stage2_image | ||
WATCH_FILECHANGES: 'docker/base/Dockerfile docker/base_spack_deps/Dockerfile docker/base_spack_deps/spack-daint-p100.yaml docker/base_deps/Dockerfile' # TODO: inherit from stage1 | ||
ENV_VAR_NAME: BASE_IMAGE_STAGE2 | ||
|
||
build_image: | ||
stage: build_image | ||
extends: .container-builder | ||
variables: | ||
DOCKERFILE: docker/image/Dockerfile | ||
DOCKER_BUILD_ARGS: '["BASE_IMAGE=$BASE_IMAGE_STAGE2", "PROJECT_NAME=$PROJECT_NAME"]' | ||
|
||
run_tests: | ||
stage: ci_jobs | ||
image: $PERSIST_IMAGE_NAME | ||
extends: .container-runner-daint | ||
script: | ||
- . /opt/gridtools_jl_env/setup-env.sh | ||
- cd /opt/GridTools | ||
- julia --project=. -e 'using Pkg; Pkg.test()' | ||
variables: | ||
SLURM_JOB_NUM_NODES: 1 | ||
SLURM_NTASKS: 1 | ||
SLURM_TIMELIMIT: "00:30:00" |
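The `before_script` of `.gt-container-builder` tags each image with the first 16 characters of a SHA-256 hash over the watched files, so the tag changes only when one of those files changes. The sketch below illustrates the idea on throwaway files. It passes file names as arguments instead of using `eval cat $WATCH_FILECHANGES`, and the function name `content_tag` is an assumption made for illustration.

```bash
#!/bin/bash
# Derive a 16-character tag from the concatenated contents of the given
# files, mirroring `cat ... | sha256sum | head -c 16` in the job above.
content_tag() {
    cat "$@" | sha256sum | head -c 16
}

# Demo with a throwaway file (a hypothetical stand-in for a Dockerfile).
workdir="$(mktemp -d)"
trap 'rm -rf "$workdir"' EXIT
echo "FROM ubuntu:23.04" > "$workdir/Dockerfile"
tag_before="$(content_tag "$workdir/Dockerfile")"
echo "# rebuild counter 1" >> "$workdir/Dockerfile"
tag_after="$(content_tag "$workdir/Dockerfile")"
echo "before: $tag_before"
echo "after:  $tag_after"
```

This also explains the "rebuild counter" comments at the top of the Dockerfiles: editing the counter changes the hash and therefore the tag. In the same script, the quoted right-hand side of `=~` is matched as a literal substring, so a commit message containing `Trigger container rebuild` followed by the job's `ENV_VAR_NAME` sets `CSCS_REBUILD_POLICY` to `always`.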
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
# just a counter to trigger rebuilds: 3 | ||
FROM ubuntu:23.04 as builder | ||
ARG INSTALL_CUDA_DRIVER=false | ||
ARG CUDA_DRIVER_VERSION | ||
ARG CPU_ARCH | ||
|
||
SHELL ["/bin/bash", "-c"] | ||
|
||
RUN apt-get update \ | ||
&& env DEBIAN_FRONTEND=noninteractive TZ=Europe/Zurich apt-get -yqq install --no-install-recommends build-essential ca-certificates coreutils curl environment-modules file gfortran git git-lfs gpg gpg-agent lsb-release openssh-client python3 python3-distutils python3-venv unzip zip | ||
|
||
RUN apt-get clean | ||
|
||
WORKDIR /opt/gridtools_jl_env | ||
|
||
COPY ./docker/base/install_cuda_driver.sh ./install_cuda_driver.sh | ||
RUN if [ "x$INSTALL_CUDA_DRIVER" == "xtrue" ]; then ./install_cuda_driver.sh $CUDA_DRIVER_VERSION; fi | ||
|
||
RUN git clone --depth 1 -c feature.manyFiles=true https://github.com/spack/spack.git | ||
|
||
# In case the driver is not installed this fixes missing `-lcuda` errors when installing cupy. | ||
#RUN git remote add origin_tehrengruber https://github.com/tehrengruber/spack.git | ||
#RUN git fetch origin_tehrengruber | ||
#RUN git checkout --track origin_tehrengruber/fix_libcuda_not_found | ||
|
||
WORKDIR ./spack/bin | ||
|
||
# careful: this overrides and will be overridden by other configuration to packages:all:require | |
RUN ./spack config add packages:all:require:target=$CPU_ARCH | ||
|
||
RUN ./spack install gcc@11 | ||
|
||
# cleanup | ||
RUN ./spack clean --all | ||
RUN ./spack gc -y | ||
|
||
# strip all the binaries | ||
RUN find -L /opt/gridtools_jl_env/spack/opt -type f -exec readlink -f '{}' \; | \ | ||
xargs file -i | \ | ||
grep 'charset=binary' | \ | ||
grep 'x-executable\|x-archive\|x-sharedlib' | \ | ||
awk -F: '{print $1}' | xargs strip -x || true | ||
|
||
WORKDIR / | ||
|
||
# flatten image | ||
FROM scratch | ||
COPY --from=builder / / |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
#!/bin/bash | ||
CUDA_DRIVER_VERSION=$1 | ||
|
||
echo "Installing CUDA driver version $CUDA_DRIVER_VERSION" | ||
apt-get -yqq install --no-install-recommends kmod wget | ||
wget -q https://us.download.nvidia.com/XFree86/Linux-x86_64/${CUDA_DRIVER_VERSION}/NVIDIA-Linux-x86_64-${CUDA_DRIVER_VERSION}.run | ||
chmod +x NVIDIA-Linux-x86_64-${CUDA_DRIVER_VERSION}.run | ||
./NVIDIA-Linux-x86_64-${CUDA_DRIVER_VERSION}.run -s -q -a \ | ||
--no-nvidia-modprobe \ | ||
--no-abi-note \ | ||
--no-kernel-module \ | ||
--no-distro-scripts \ | ||
--no-opengl-files \ | ||
--no-wine-files \ | ||
--no-kernel-module-source \ | ||
--no-unified-memory \ | ||
--no-drm \ | ||
--no-libglx-indirect \ | ||
--no-install-libglvnd \ | ||
--no-systemd | ||
rm ./NVIDIA-Linux-x86_64-${CUDA_DRIVER_VERSION}.run |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
# rebuild counter 3 # just a counter to increase when we want a new image | ||
ARG BASE_IMAGE=gridtools_jl_spack_deps_image | ||
FROM $BASE_IMAGE as builder | ||
ARG PROJECT_NAME | ||
|
||
WORKDIR /opt/${PROJECT_NAME}_env | ||
|
||
COPY ./docker/base_deps/setup-env.sh ./setup-env.sh | ||
RUN sed -i "s/%PROJECT_NAME%/$PROJECT_NAME/g" setup-env.sh | ||
|
||
WORKDIR /opt/ | ||
COPY ./docker/base_deps/install_gt4py.sh ./install_gt4py.sh | ||
RUN . /opt/${PROJECT_NAME}_env/setup-env.sh; ./install_gt4py.sh | ||
RUN . /opt/${PROJECT_NAME}_env/setup-env.sh; pip cache purge | ||
|
||
WORKDIR /opt/gridtools_jl_deps | ||
COPY ./Project.toml ./Project.toml | ||
RUN mkdir src | ||
COPY ./docker/base_deps/dummy_module.jl ./src/GridTools.jl | ||
RUN . /opt/${PROJECT_NAME}_env/setup-env.sh; julia --project=. -e "using Pkg; Pkg.instantiate(); Pkg.build(); Pkg.precompile()" | ||
RUN rm -rf /opt/gridtools_jl_deps | ||
|
||
# flatten image | ||
FROM scratch | ||
COPY --from=builder / / |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
module GridTools | ||
end |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
#!/bin/bash | ||
git clone --branch fix_python_interp_path_in_cmake https://github.com/tehrengruber/gt4py.git | ||
pip install -r ./gt4py/requirements-dev.txt | ||
pip install ./gt4py |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
#!/bin/bash | ||
# note: occurrences of %PROJECT_NAME% in this file are replaced when copied into the container | ||
export HOME=/root | ||
|
||
. /opt/%PROJECT_NAME%_env/spack/share/spack/setup-env.sh | ||
|
||
# gcc is installed outside the env, so load it first. If gcc is not loaded we might run | |
# into strange errors where the spack version is used in some places and the system-installed | |
# version in others. | |
spack load gcc | ||
|
||
spack env activate %PROJECT_NAME%_env | ||
|
||
# use this complicated way to load packages in case multiple versions are installed | |
# this was needed as two versions of py-pip are installed (one is only a build | |
# dependency). Since we now run `spack gc -y` this is superfluous (build-only | |
# dependencies are removed before we land here), but we keep it for now. | |
#PACKAGES_TO_LOAD=("python" "py-pip" "gcc") | ||
#for PKG_NAME in ${PACKAGES_TO_LOAD[@]}; do | ||
# SHORT_SPEC=$(spack find --explicit --format "{short_spec}" $PKG_NAME) | ||
# SHORT_SPEC=${SHORT_SPEC%/*} # remove hash after `/` character | ||
# spack load $SHORT_SPEC | ||
#done | ||
spack load python py-pip boost julia |
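The `%PROJECT_NAME%` placeholders in the script above are filled in by the `sed` call in `docker/base_deps/Dockerfile`. The sketch below reproduces that substitution on a temporary file so the effect can be seen outside a container build. It assumes GNU `sed` (for `-i` without a suffix argument), and the function name `fill_project_name` is hypothetical.

```bash
#!/bin/bash
# Replace every %PROJECT_NAME% placeholder in a file in place, using the
# same `sed` expression as the Dockerfile. The project name must not
# contain `/` or `&`, which are special in a sed replacement.
fill_project_name() {
    local project_name="$1" file="$2"
    sed -i "s/%PROJECT_NAME%/$project_name/g" "$file"
}

# Demo on a throwaway file holding two lines taken from the script above.
template="$(mktemp)"
printf '%s\n' \
    '. /opt/%PROJECT_NAME%_env/spack/share/spack/setup-env.sh' \
    'spack env activate %PROJECT_NAME%_env' > "$template"
fill_project_name "gridtools_jl" "$template"
cat "$template"
```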
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
# rebuild counter 3 # just a counter to increase when we want a new image | ||
ARG BASE_IMAGE=gridtools_jl_base_image | ||
FROM $BASE_IMAGE as builder | ||
ARG PROJECT_NAME=gridtools_jl | ||
ARG SPACK_ENV_FILE=spack-daint-p100.yaml | ||
|
||
# TODO(tehrengruber): Copy spack environment to clean image. Then we don't need to run `spack gc` | ||
# and `spack clean` anymore. See https://spack.readthedocs.io/en/latest/containers.html for | ||
# more information. | ||
|
||
WORKDIR /opt/${PROJECT_NAME}_env/spack/bin | ||
|
||
COPY ./docker/base_spack_deps/${SPACK_ENV_FILE} ./spack_env_${PROJECT_NAME}.yaml | ||
RUN ./spack env create ${PROJECT_NAME}_env spack_env_${PROJECT_NAME}.yaml | ||
# remove all compilers such that everything is built with the compiler we installed | ||
RUN ./spack compiler remove -a gcc | ||
RUN ./spack -e ${PROJECT_NAME}_env compiler find $(./spack location --install-dir gcc@11) | ||
# using --fresh ensures the concretization does not care about the build cache (untested and not | ||
# used right now as we don't use a build cache yet) | ||
RUN ./spack -e ${PROJECT_NAME}_env concretize --fresh | ||
COPY ./docker/base_spack_deps/run_until_succeed.sh ./run_until_succeed.sh | ||
RUN ./run_until_succeed.sh ./spack -e ${PROJECT_NAME}_env install | ||
|
||
# cleanup | ||
RUN ./spack -e ${PROJECT_NAME}_env clean --all | ||
RUN ./spack -e ${PROJECT_NAME}_env gc -y | ||
|
||
# strip all the binaries | ||
RUN find -L /opt/${PROJECT_NAME}_env/spack/opt -type f -exec readlink -f '{}' \; | \ | ||
xargs file -i | \ | ||
grep 'charset=binary' | \ | ||
grep 'x-executable\|x-archive\|x-sharedlib' | \ | ||
awk -F: '{print $1}' | xargs strip -x || true | ||
|
||
WORKDIR / | ||
|
||
# flatten image | ||
FROM scratch | ||
COPY --from=builder / / |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
#!/bin/bash | ||
|
||
# Set the maximum number of attempts | ||
max_attempts=10 | ||
attempt=0 | ||
|
||
# Check if a command is provided | ||
if [ $# -eq 0 ]; then | ||
echo "Usage: $0 MY_BASH_COMMAND ARGS..." | ||
exit 1 | ||
fi | ||
|
||
# Loop until the command succeeds or the maximum attempts are reached | ||
while ! "$@"; do | ||
attempt=$((attempt + 1)) | ||
if [ $attempt -ge $max_attempts ]; then | ||
echo "Command failed after $max_attempts attempts." | ||
exit 1 | ||
fi | ||
echo "Attempt $attempt/$max_attempts failed. Retrying..." | ||
done | ||
|
||
echo "Command succeeded on attempt $((attempt + 1))." |
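The retry loop can be tried locally without Spack. The following sketch reimplements the same idea as a function and runs it against a deliberately flaky command. The names `retry` and `flaky`, the counter file, and the choice to pass the attempt limit as the first argument are assumptions made for illustration and are not part of the script above.

```bash
#!/bin/bash
# Run a command until it succeeds, giving up after max_attempts failures.
retry() {
    local max_attempts="$1"; shift
    local attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "Command failed after $max_attempts attempts." >&2
            return 1
        fi
        echo "Attempt $attempt/$max_attempts failed. Retrying..." >&2
        attempt=$((attempt + 1))
    done
    echo "Command succeeded on attempt $attempt."
}

# Hypothetical flaky command: fails until it has been called three times.
counter_file="$(mktemp)"
echo 0 > "$counter_file"
flaky() {
    local calls=$(( $(cat "$counter_file") + 1 ))
    echo "$calls" > "$counter_file"
    [ "$calls" -ge 3 ]
}

retry 10 flaky
```

Because the command is passed as `"$@"`, arguments containing spaces are preserved, which is also how the Dockerfile invokes `./run_until_succeed.sh ./spack -e ${PROJECT_NAME}_env install`.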