Generic HPC Install Script #329

Merged
merged 71 commits into from
Oct 23, 2024
71 commits
aa0c077
Copy slurm_init.sh to slurm_init_longleaf.sh
TimothyWillard Sep 13, 2024
22cb130
Restore slurm_init.sh
TimothyWillard Sep 13, 2024
90e29b2
Merge branch 'copy-file' into GH-191/longleaf-batch-submission
TimothyWillard Sep 13, 2024
c1d54ef
Added UNC Longleaf Specific Init/Prerun Scripts
TimothyWillard Sep 13, 2024
e238d66
Draft implementation of HPC install script
TimothyWillard Sep 16, 2024
2436640
Minor changes to `hpc_install.sh`
TimothyWillard Sep 16, 2024
bba583a
Added slurm `--partition` flag to inference script
TimothyWillard Sep 16, 2024
2c8c952
Initial pass at HPC install on rockfish
TimothyWillard Oct 2, 2024
194d2e1
Remove longleaf specific slurm scripts
TimothyWillard Oct 2, 2024
f76d68a
Remove `--partion` flag
TimothyWillard Oct 2, 2024
8af3698
Minor updates to `hpc_install.sh`
TimothyWillard Oct 3, 2024
23d9de1
initial tweaks to make flepimop-inference-* runnable
pearsonca Sep 26, 2024
9ac9cf0
further install scripts fixes
pearsonca Sep 26, 2024
d32a58c
fix reinvocation of inference-slot
pearsonca Sep 26, 2024
9f2c085
initial installation for ubuntu re-org
pearsonca Sep 25, 2024
e2bc41f
updates addressing use of installed r scripts
pearsonca Sep 26, 2024
3095ae4
README revs
pearsonca Sep 26, 2024
1b33dc3
add arrow installation
pearsonca Sep 26, 2024
ec0c479
Switch R pkg install to use `build/setup.R`
TimothyWillard Oct 3, 2024
e754364
Add `$WORKDIR` to `hpc_install.R`
TimothyWillard Oct 4, 2024
5e4f398
Add missing flepi path arg to setup.R
TimothyWillard Oct 4, 2024
0d9f813
Force pin arrow version between python and R
TimothyWillard Oct 4, 2024
9ca12ed
Change rockfish default directories
TimothyWillard Oct 4, 2024
f78ee75
Split `hpc_install.sh` into init and install
TimothyWillard Oct 4, 2024
6d69186
Use `devtools::install` in `setup.R`
TimothyWillard Oct 4, 2024
7adbdfa
Unset error exit around R pkg install
TimothyWillard Oct 7, 2024
ae6666f
Add `set +e` an exit to `flepi_init.sh`
TimothyWillard Oct 7, 2024
f3fb1a7
`install_cli` installs to conda bin
TimothyWillard Oct 7, 2024
e443626
Remove init call from install script
TimothyWillard Oct 7, 2024
480bfd7
Remove old version restrictions, add `optparse`
TimothyWillard Oct 8, 2024
385eeb5
Move R deps install into conda environment
TimothyWillard Oct 8, 2024
f0571ae
Readd inference CLI install
TimothyWillard Oct 8, 2024
53f2f49
Update example command to use installed CLI
TimothyWillard Oct 8, 2024
ec7978d
Manually install `covidcast` package
TimothyWillard Oct 8, 2024
2e6dfca
Downgrade r-base dependency to 4.3
TimothyWillard Oct 9, 2024
42259a2
Remove symlinks if exists on reinstall
TimothyWillard Oct 9, 2024
aed8442
Script to generate `environment.yml`
TimothyWillard Oct 10, 2024
a4631db
Add dnachun to channels, add r-sf dependency
TimothyWillard Oct 10, 2024
b0685c5
GitHub action to generate `environment.yml`
TimothyWillard Oct 10, 2024
a9e9f47
Remove unneeded comment
TimothyWillard Oct 10, 2024
9acda2c
Merge main into GH-191/auto-generate-environment.yml
TimothyWillard Oct 10, 2024
ca10838
GitHub action checkout with attached head
TimothyWillard Oct 10, 2024
6ad753d
Update environment.yml
TimothyWillard Oct 10, 2024
27a80f7
Make clear source of `environment.yml` commit
TimothyWillard Oct 10, 2024
bacf6f4
Use premade `environment.yml` in `hpc_install.sh`
TimothyWillard Oct 10, 2024
c613f09
Remove `setup.R` and `install_ubuntu.sh`
TimothyWillard Oct 10, 2024
8ca2f58
Restore `README.md`
TimothyWillard Oct 10, 2024
cbcdfbd
Bug fix to only remove files present
TimothyWillard Oct 10, 2024
3b33ab1
Add spaces for style in `build/hpc_install.sh`
TimothyWillard Oct 10, 2024
a9c3415
Merge main into GH-191/longleaf-batch-submission
TimothyWillard Oct 14, 2024
04c533b
Merge main into GH-191/longleaf-batch-submission
TimothyWillard Oct 18, 2024
039d311
Move everything on longleaf into `/work`
TimothyWillard Oct 18, 2024
37e1405
Rename install script to clarify uses
TimothyWillard Oct 18, 2024
b1c6e46
Change exec of hpc install script
TimothyWillard Oct 18, 2024
384c219
Change longleaf dirs in init script
TimothyWillard Oct 18, 2024
96df05a
Move rockfish userdir
TimothyWillard Oct 18, 2024
cbcfff2
Change default python to 3.11
TimothyWillard Oct 18, 2024
8fa9c82
Merge main into GH-191/longleaf-batch-submission
TimothyWillard Oct 18, 2024
e1811b6
Make flepi path/conda configurable
TimothyWillard Oct 18, 2024
52edb16
Update `environment.yml` via GitHub action
TimothyWillard Oct 18, 2024
22484f6
Formatting of flepi path/conda inputs
TimothyWillard Oct 18, 2024
1f3987b
Use `realpath` to make format file paths
TimothyWillard Oct 18, 2024
8ba4474
Remove python/R name check
TimothyWillard Oct 18, 2024
24988cd
Remove whitespace from `README.md`
TimothyWillard Oct 18, 2024
216ee1f
Add missing slurm module for rockfish
TimothyWillard Oct 21, 2024
139a8b2
Add `--editable` to `gempyor` install
TimothyWillard Oct 21, 2024
b73b5ab
Cleanup error handling
TimothyWillard Oct 21, 2024
70ced08
Update conda env to use `~/.conda`
TimothyWillard Oct 21, 2024
b34f9ab
Remove `--force-reinstall` from pip install
TimothyWillard Oct 22, 2024
8d8042a
Minor typo
TimothyWillard Oct 22, 2024
86b5cd6
Merge main into GH-191/longleaf-batch-submission
TimothyWillard Oct 22, 2024
37 changes: 37 additions & 0 deletions .github/workflows/conda-env.yml
@@ -0,0 +1,37 @@
name: Generate Conda Environment

on:
  workflow_dispatch:
  push:
    paths:
      - build/create_environment_yml.R
      - flepimop/R_packages/*/DESCRIPTION
    branches:
      - main
  pull_request:
    paths:
      - build/create_environment_yml.R
      - flepimop/R_packages/*/DESCRIPTION
    branches:
      - main

jobs:
  generate-environment-yml:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.ref }}
      - uses: r-lib/actions/setup-r@v2
      - name: Generate Environment YAML
        run: Rscript build/create_environment_yml.R
      - name: Check For Environment Change
        run: |
          if [[ -n "$(git status -s -- environment.yml)" ]]; then
            git config --global user.name "${{ github.actor }}"
            git config --global user.email "${{ github.actor }}@users.noreply.github.com"
            git add environment.yml
            git commit -m 'Update `environment.yml` via GitHub action'
            git push origin ${{ github.event.pull_request.head.ref }}
          fi
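Reviewer note: the workflow above reads `github.event.pull_request.head.ref`, which is only populated on `pull_request` events, while the workflow also triggers on `push` and `workflow_dispatch`. The following is a minimal sketch, not part of this PR, of how a step might choose a branch name from the default `GITHUB_HEAD_REF` and `GITHUB_REF_NAME` environment variables instead. The function name `target_branch` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not code from this PR).
# GITHUB_HEAD_REF is only set for pull_request events; GITHUB_REF_NAME is the
# short name of the ref that triggered the run. Prefer the former when present.
target_branch() {
    if [ -n "${GITHUB_HEAD_REF:-}" ]; then
        printf '%s\n' "$GITHUB_HEAD_REF"
    else
        printf '%s\n' "${GITHUB_REF_NAME:-}"
    fi
}

# Simulated pull_request run: prints "feature-branch".
GITHUB_HEAD_REF="feature-branch" GITHUB_REF_NAME="329/merge" target_branch
# Simulated push run: prints "main".
GITHUB_HEAD_REF="" GITHUB_REF_NAME="main" target_branch
```

Whether pushing back to `main` on a `push` event is desirable is a separate policy question; the sketch only shows how the branch name could be resolved for each event type.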
134 changes: 134 additions & 0 deletions batch/hpc_init.sh
@@ -0,0 +1,134 @@
# Generic setup
set -e

# Cluster specific setup
if [[ $1 == "longleaf" ]]; then
    # Setup general purpose user variables needed for Longleaf
    USERO=$( echo $USER | awk '{ print substr($0, 1, 1) }' )
    USERN=$( echo $USER | awk '{ print substr($0, 2, 1) }' )
    WORKDIR=$( realpath "/work/users/$USERO/$USERN/$USER/" )
    USERDIR=$WORKDIR

    # Load required modules
    module purge
    module load gcc/9.1.0
    module load anaconda/2023.03
    module load git
elif [[ $1 == "rockfish" ]]; then
    # Setup general purpose user variables needed for RockFish
    WORKDIR=$( realpath "/scratch4/struelo1/flepimop-code/$USER/" )
    USERDIR=$WORKDIR
    mkdir -vp $WORKDIR

    # Load required modules
    module purge
    module load slurm
    module load gcc/9.3.0
    module load anaconda/2020.07
    module load git/2.42.0
else
    echo "The cluster name '$1' is not recognized, must be one of: 'longleaf', 'rockfish'."
    set +e
    exit 1
fi

# Ensure we have a $FLEPI_PATH
if [ -z "${FLEPI_PATH}" ]; then
    echo -n "An explicit \$FLEPI_PATH was not provided, please set one (or press enter to use '$USERDIR/flepiMoP'): "
    read FLEPI_PATH
    if [ -z "${FLEPI_PATH}" ]; then
        export FLEPI_PATH="$USERDIR/flepiMoP"
    fi
    export FLEPI_PATH=$( realpath "$FLEPI_PATH" )
    echo "Using '$FLEPI_PATH' for \$FLEPI_PATH."
fi

# Conda init
if [ -z "${FLEPI_CONDA}" ]; then
    echo -n "An explicit \$FLEPI_CONDA was not provided, please set one (or press enter to use 'flepimop-env'): "
    read FLEPI_CONDA
    if [ -z "${FLEPI_CONDA}" ]; then
        export FLEPI_CONDA="flepimop-env"
    fi
    echo "Using '$FLEPI_CONDA' for \$FLEPI_CONDA."
fi
conda activate $FLEPI_CONDA

# Check the conda environment is valid
WHICH_PYTHON=$( which python )
WHICH_R=$( which R )
PYTHON_ARROW_VERSION=$( python -c "import pyarrow; print(pyarrow.__version__)" )
R_ARROW_VERSION=$( Rscript -e "cat(as.character(packageVersion('arrow')))" )
COMPATIBLE_ARROW_VERSION=$( echo "$R_ARROW_VERSION" | grep "$PYTHON_ARROW_VERSION" | wc -l )
if [[ "$COMPATIBLE_ARROW_VERSION" -ne 1 ]]; then
    echo "The R version of arrow is '$R_ARROW_VERSION' and the python version is '$PYTHON_ARROW_VERSION'. These may not be compatible versions."
fi

# Make sure the credentials file is where we expect and has the right perms
if [ ! -f "$USERDIR/slack_credentials.sh" ]; then
    echo "You should place sensitive credentials in '$USERDIR/slack_credentials.sh'."
else
    chmod 600 $USERDIR/slack_credentials.sh
    source $USERDIR/slack_credentials.sh
fi

# Set correct env vars
export FLEPI_STOCHASTIC_RUN=false
export FLEPI_RESET_CHIMERICS=TRUE
export TODAY=`date --rfc-3339='date'`

echo -n "Please set a project path (relative to '$WORKDIR'): "
read PROJECT_PATH
export PROJECT_PATH="$WORKDIR/$PROJECT_PATH"
if [ ! -d $PROJECT_PATH ]; then
    echo "> The project path provided, $PROJECT_PATH, is not a directory. Please ensure this is correct."
fi

echo -n "Please set a config path (relative to '$PROJECT_PATH'): "
read CONFIG_PATH
export CONFIG_PATH="$PROJECT_PATH/$CONFIG_PATH"
if [ ! -f $CONFIG_PATH ]; then
    echo "> The config path provided, $CONFIG_PATH, is not a file. Please ensure this is correct."
fi

echo -n "Please set a validation date (today is $TODAY): "
read VALIDATION_DATE

echo -n "Please set a resume location: "
read RESUME_LOCATION

echo -n "Please set a flepi run index: "
read FLEPI_RUN_INDEX

# Done
cat << EOM
> The HPC init script has successfully finished.

If you are testing if this worked, say installing for the first time, you can use the inference example from the \`flepimop_sample\` repository:
\`\`\`bash
cd \$PROJECT_PATH
flepimop-inference-main -c \$CONFIG_PATH -j 1 -n 1 -k 1
\`\`\`
Just make sure to \`rm -r model_output\` after running.

Otherwise make sure this diagnostic info looks correct before continuing:
* Cluster: $1
* User directory: $USERDIR
* Work directory: $WORKDIR
* Flepi conda: $FLEPI_CONDA
* Flepi path: $FLEPI_PATH
* Project path: $PROJECT_PATH
* Python: $WHICH_PYTHON
* R: $WHICH_R
* Python arrow: $PYTHON_ARROW_VERSION
* R arrow: $R_ARROW_VERSION
* Stochastic run: $FLEPI_STOCHASTIC_RUN
* Reset chimerics: $FLEPI_RESET_CHIMERICS
* Today: $TODAY
* Config path: $CONFIG_PATH
* Validation date: $VALIDATION_DATE
* Resume location: $RESUME_LOCATION
* Flepi run index: $FLEPI_RUN_INDEX
EOM

set +e
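Reviewer note: on Longleaf the init script builds the work directory from the first and second characters of `$USER`. The sketch below illustrates that layout in isolation. It is an illustration under stated assumptions rather than code from this PR: the function name `longleaf_workdir` and the username `jdoe` are hypothetical, and the `realpath` call from the script is omitted so the example does not depend on the directory existing.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not code from this PR): print the
# Longleaf work directory for a username, following the
# /work/users/<first letter>/<second letter>/<username> layout used above.
longleaf_workdir() {
    local user="$1"
    local first second
    first=$(printf '%s' "$user" | cut -c1)   # first character of the username
    second=$(printf '%s' "$user" | cut -c2)  # second character of the username
    printf '/work/users/%s/%s/%s\n' "$first" "$second" "$user"
}

# Example with a made-up username: prints /work/users/j/d/jdoe
longleaf_workdir "jdoe"
```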
60 changes: 60 additions & 0 deletions build/create_environment_yml.R
@@ -0,0 +1,60 @@
#!/usr/bin/env Rscript

# Helper functions
split_pkgs <- \(x) unique(unlist(strsplit(gsub("\\s+", "", x), ",")))

# Light argument parsing
args <- commandArgs(trailingOnly = TRUE)
flepi_path <- if (length(args)) args[1L] else getwd()

# Get R package dependencies
rpkgs <- list.files(
  file.path(flepi_path, "flepimop", "R_packages"),
  full.names = TRUE
)
dependencies <- sapply(rpkgs, function(rpkg) {
  description <- read.dcf(file.path(rpkg, "DESCRIPTION"))
  sections <- c("Depends", "Imports")
  contained_sections <- sections %in% colnames(description)
  if (sum(contained_sections) >= 1L) {
    return(split_pkgs(description[, sections[contained_sections]]))
  }
  character()
}, USE.NAMES = FALSE)
dependencies <- sort(unique(unlist(dependencies)))
dependencies <- setdiff(
  dependencies,
  c("arrow", "covidcast", "methods", basename(rpkgs))
)
dependencies <- dependencies[!grepl("^R(\\(.*\\))?$", dependencies)]

# Construct environment.yml file
environment_yml <- file.path(flepi_path, "environment.yml")
new_environment_yml <- c(
  "channels:",
  "- conda-forge",
  "- defaults",
  "- r",
  "- dnachun",
  "dependencies:",
  "- python=3.11",
  "- pip",
  "- r-base>=4.3",
  "- pyarrow=17.0.0",
  "- r-arrow=17.0.0",
  "- r-sf",
  paste0("- r-", dependencies)
)
if (file.exists(environment_yml)) {
  old_environment_yml <- readLines(environment_yml)
} else {
  old_environment_yml <- character()
}
old_environment_yml <- old_environment_yml[!grepl("^#", old_environment_yml)]
if (!identical(new_environment_yml, old_environment_yml)) {
  new_environment_yml <- c(
    paste0("# ", format(Sys.time(), "%a %b %d %X %Y %Z")),
    new_environment_yml
  )
  writeLines(new_environment_yml, environment_yml)
}
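Reviewer note: the script above only rewrites `environment.yml` when the generated content differs from the existing file once comment lines, including the timestamp header, are ignored. That keeps the timestamp from causing a spurious change on every run. The sketch below shows the same comparison in shell, as an illustration rather than a port of the R code. The function name `env_yml_changed` and the temporary file names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not code from this PR): succeed when the
# candidate content differs from the existing file after dropping '#' lines.
env_yml_changed() {
    local existing="$1" candidate="$2" old new
    # A missing file always counts as changed, as in the R script.
    if [ ! -f "$existing" ]; then
        return 0
    fi
    old=$(grep -v '^#' "$existing" || true)  # ignore the timestamp comment
    new=$(cat "$candidate")
    [ "$old" != "$new" ]
}

# Example: the files differ only by the timestamp header, so nothing changed.
tmp=$(mktemp -d)
printf '%s\n' '# Fri Oct 18 18:29:25 2024 UTC' 'channels:' '- conda-forge' > "$tmp/environment.yml"
printf '%s\n' 'channels:' '- conda-forge' > "$tmp/candidate.yml"
if env_yml_changed "$tmp/environment.yml" "$tmp/candidate.yml"; then
    echo "changed"
else
    echo "unchanged"
fi
rm -r "$tmp"
```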
106 changes: 106 additions & 0 deletions build/hpc_install_or_update.sh
@@ -0,0 +1,106 @@
#!/usr/bin/env bash

# Generic setup
set -e

# Cluster specific setup
if [[ $1 == "longleaf" ]]; then
    # Setup general purpose user variables needed for Longleaf
    USERO=$( echo $USER | awk '{ print substr($0, 1, 1) }' )
    USERN=$( echo $USER | awk '{ print substr($0, 2, 1) }' )
    WORKDIR=$( realpath "/work/users/$USERO/$USERN/$USER/" )
    USERDIR=$WORKDIR

    # Load required modules
    module purge
    module load gcc/9.1.0
    module load anaconda/2023.03
    module load git
elif [[ $1 == "rockfish" ]]; then
    # Setup general purpose user variables needed for RockFish
    WORKDIR=$( realpath "/scratch4/struelo1/flepimop-code/$USER/" )
    USERDIR=$WORKDIR
    mkdir -vp $WORKDIR

    # Load required modules
    module purge
    module load gcc/9.3.0
    module load anaconda/2020.07
    module load git/2.42.0
else
    echo "The cluster name '$1' is not recognized, must be one of: 'longleaf', 'rockfish'."
    set +e
    exit 1
fi

# Ensure we have a $FLEPI_PATH
if [ -z "${FLEPI_PATH}" ]; then
    echo -n "An explicit \$FLEPI_PATH was not provided, please set one (or press enter to use '$USERDIR/flepiMoP'): "
    read FLEPI_PATH
    if [ -z "${FLEPI_PATH}" ]; then
        export FLEPI_PATH="$USERDIR/flepiMoP"
    fi
    export FLEPI_PATH=$( realpath "$FLEPI_PATH" )
    echo "Using '$FLEPI_PATH' for \$FLEPI_PATH."
fi

# Test that flepiMoP is located there
if [ ! -d "$FLEPI_PATH" ]; then
    while true; do
        read -p "Did not find flepiMoP at $FLEPI_PATH, do you want to clone the repo? (y/n) " resp
        case "$resp" in
            [yY])
                echo "Cloning on your behalf."
                git clone git@github.com:HopkinsIDD/flepiMoP.git $FLEPI_PATH
                break
                ;;
            [nN])
                echo "Then you need to set a \$FLEPI_PATH before running, cannot proceed with install."
                set +e
                exit 1
                ;;
            *)
                echo "Invalid input. Please enter 'y' or 'n'. "
                ;;
        esac
    done
fi

# Setup the conda environment
if [ -z "${FLEPI_CONDA}" ]; then
    echo -n "An explicit \$FLEPI_CONDA was not provided, please set one (or press enter to use 'flepimop-env'): "
    read FLEPI_CONDA
    if [ -z "${FLEPI_CONDA}" ]; then
        export FLEPI_CONDA="flepimop-env"
    fi
    echo "Using '$FLEPI_CONDA' for \$FLEPI_CONDA."
fi
FLEPI_CONDA_ENV_MATCHES=$( conda info --envs | awk '{print $1}' | grep -x "$FLEPI_CONDA" | wc -l )
if [ "$FLEPI_CONDA_ENV_MATCHES" -eq 0 ]; then
    conda env create --name $FLEPI_CONDA --file $FLEPI_PATH/environment.yml
fi

# Load the conda environment
conda activate $FLEPI_CONDA
[ -e "$CONDA_PREFIX/conda-meta/pinned" ] && rm $CONDA_PREFIX/conda-meta/pinned
cat << EOF > $CONDA_PREFIX/conda-meta/pinned
r-arrow==17.0.0
arrow==17.0.0
EOF

# Install the gempyor package from local
pip install --editable $FLEPI_PATH/flepimop/gempyor_pkg

# Install the local R packages
R -e "install.packages('covidcast', repos='https://cloud.r-project.org')"
RETURNTO=$( pwd )
cd $FLEPI_PATH/flepimop/R_packages/
for d in $( ls ); do
    R CMD INSTALL $d
done
cd $RETURNTO
R -e "library(inference); inference::install_cli()"

# Done
echo "> Done installing/updating flepiMoP."
set +e
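Reviewer note: the install script decides whether to create the conda environment by taking the first column of `conda info --envs` and looking for an exact match on the environment name. The sketch below isolates that check so it can be exercised without conda installed. It is an illustration, not code from this PR: the function name `conda_env_listed` is hypothetical, the listing is passed on stdin, the sample listing is made up, and `grep -F` is a swapped-in choice so the name is matched as a fixed string rather than as a regular expression.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not code from this PR).
# $1: environment name; stdin: the output of `conda info --envs`.
# Succeeds only if some line's first column equals the name exactly.
conda_env_listed() {
    awk '{ print $1 }' | grep -Fx -- "$1" > /dev/null
}

# Made-up listing in the general shape printed by `conda info --envs`.
sample_listing='# conda environments:
#
base                  *  /opt/conda
flepimop-env             /home/user/.conda/envs/flepimop-env'

if printf '%s\n' "$sample_listing" | conda_env_listed "flepimop-env"; then
    echo "flepimop-env exists, skipping creation"
fi
if ! printf '%s\n' "$sample_listing" | conda_env_listed "flepimop"; then
    echo "flepimop not found, would create it"
fi
```

The exact-line match is what stops a shorter name such as `flepimop` from being treated as present just because `flepimop-env` is.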
39 changes: 39 additions & 0 deletions environment.yml
@@ -0,0 +1,39 @@
# Fri Oct 18 18:29:25 2024 UTC
channels:
- conda-forge
- defaults
- r
- dnachun
dependencies:
- python=3.11
- pip
- r-base>=4.3
- pyarrow=17.0.0
- r-arrow=17.0.0
- r-sf
- r-data.table
- r-doParallel
- r-dplyr
- r-foreach
- r-ggplot2
- r-ggraph
- r-httr
- r-jsonlite
- r-lubridate
- r-magrittr
- r-MMWRweek
- r-optparse
- r-purrr
- r-readr
- r-reticulate
- r-rlang
- r-stringr
- r-tibble
- r-tidygraph
- r-tidyr
- r-tidyselect
- r-tidyverse
- r-truncnorm
- r-vroom
- r-xts
- r-yaml
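Reviewer note: this file pins `pyarrow` and `r-arrow` to the same version, and the init script warns when the installed Python and R arrow versions disagree. The sketch below checks the pins directly in a flat `environment.yml` of the shape generated here. It is an illustration under stated assumptions, not part of this PR: the function names `pinned_version` and `arrow_pins_match` are hypothetical, and the `sed` pattern only handles single-`=` pins written one per line.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (assumptions, not code from this PR).

# Print the version pinned for package $2 in environment file $1,
# for lines of the form "- <package>=<version>".
pinned_version() {
    sed -n "s/^[[:space:]]*- $2=\(.*\)$/\1/p" "$1"
}

# Succeed only if both arrow pins are present and identical.
arrow_pins_match() {
    local py r
    py=$(pinned_version "$1" "pyarrow")
    r=$(pinned_version "$1" "r-arrow")
    [ -n "$py" ] && [ "$py" = "$r" ]
}

# Example using a small file in the same format as the one above.
tmp=$(mktemp -d)
printf '%s\n' 'dependencies:' '- python=3.11' '- pyarrow=17.0.0' '- r-arrow=17.0.0' > "$tmp/environment.yml"
if arrow_pins_match "$tmp/environment.yml"; then
    echo "arrow pins match: $(pinned_version "$tmp/environment.yml" pyarrow)"
else
    echo "arrow pins differ"
fi
rm -r "$tmp"
```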
4 changes: 2 additions & 2 deletions flepimop/R_packages/flepiconfig/DESCRIPTION
@@ -2,8 +2,8 @@ Package: flepiconfig
 Title: Config creation helper for flepiMoP
 Version: 3.0.0
 Imports:
-    tidyverse (>= 1.3.1),
-    readr (>= 2.0.0),
+    tidyverse,
+    readr,
     lubridate,
     magrittr,
     yaml,