diff --git a/config/arc_cluster/README.md b/config/ARC/README.md
similarity index 52%
rename from config/arc_cluster/README.md
rename to config/ARC/README.md
index d24f0a17..294fbbc2 100644
--- a/config/arc_cluster/README.md
+++ b/config/ARC/README.md
@@ -1,70 +1,50 @@
 # Running open-gira on ARC
 
-As open-gira is built with snakemake, its use is remarkably similar from a
-laptop to a cluster. However there are a few differences. They are discussed
-here.
+As open-gira is built on `snakemake`, running it on a cluster is much like
+running it on a laptop. There are, however, a few differences, notably the
+use of a `profile` (discussed here).
 
 ## Python environment
 
-### Initialising our shared conda installation
+### Micromamba
 
-There is a conda install which users may share. This means we don't need to
-create many duplicate environments unnecessarily (snakemake will check to see
-if an equivalent environment has already been created).
+I recommend installing
+[micromamba](https://mamba.readthedocs.io/en/latest/installation/micromamba-installation.html#install-script)
+into your own user space as a package manager for Python packages (and more).
 
-Prior to using the conda install for the first time you must initialise it for
-your shell with the following command:
-```
-/data/ouce-gri-jba/anaconda/condabin/conda init
-```
+### Creating an execution environment
 
-Your `~/.bashrc` should then contain someting like this:
+To create an environment on ARC containing the necessary software to run the workflows:
 ```
-# >>> conda initialize >>>
-# !! Contents within this block are managed by 'conda init' !!
-__conda_setup="$('/data/ouce-gri-jba/anaconda/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
-if [ $? -eq 0 ]; then
-    eval "$__conda_setup"
-else
-    if [ -f "/data/ouce-gri-jba/anaconda/etc/profile.d/conda.sh" ]; then
-        . "/data/ouce-gri-jba/anaconda/etc/profile.d/conda.sh"
-    else
-        export PATH="/data/ouce-gri-jba/anaconda/bin:$PATH"
-    fi
-fi
-unset __conda_setup
-# <<< conda initialize <<<
+micromamba create -f environment.yml -y
 ```
 
-### Enabling an environment with snakemake
-
-We use snakemake to create jobs for us. We could use the ARC provided snakemake
-executable from the module load system, but their version is quite old
-(6.10.0). N.B. Versions <7.0.0 may cause the following problem:
-https://github.com/snakemake/snakemake/issues/1392
-
-Instead use a shared conda environment we have created which contains
-snakemake:
+To activate it:
 ```
-conda activate snakemake-7.12.1
+micromamba activate open-gira
 ```
 
 Your prompt should then change to something like:
 ```
-(snakemake-7.12.1) [cenv0899@arc-login01 ~]$
+(open-gira) [cenv0899@arc-login01 ~]$
 ```
 
-## Osmium
+## Exactextract
+
+One `open-gira` dependency, `exactextract`, is not available via the conda
+(micromamba) ecosystem. To install it, see
+[here](https://github.com/isciences/exactextract#compiling) and place the
+compiled binary on your `PATH`.
 
-open-gira jobs which filter Open Street Map datasets may require the use of a
-tool called osmium. This has been compiled on the cluster (with
-`/data/ouce-gri-jba/osmium/build_osmium.sh`). To run osmium, place a symlink
-somewhere on your `$PATH`, pointing to the wrapper script. For example:
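+For example, to put the compiled binary on your `PATH` (a minimal sketch;
+`build/exactextract` as the build output and `~/bin` as the destination are
+illustrative choices, not requirements):
+```
+# copy the compiled binary to a directory on your PATH
+mkdir -p ~/bin
+cp build/exactextract ~/bin/
+# confirm the shell can now find it
+which exactextract
+```
+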
+To build (and run) `exactextract` on ARC you will need to use the `module`
+program to load two of its dependencies:
 ```
-mkdir -p ~/bin
-ln -s /data/ouce-gri-jba/osmium/run_osmium.sh ~/bin/osmium
+module load GEOS/3.10.3-GCC-11.3.0
+module load GDAL/3.5.0-foss-2022a
 ```
+I suggest placing these lines in your `~/.bashrc` file so they run automatically on login.
 
 ## Session persistence
 
 To persist a terminal over time (and despite dropped SSH connections) consider
 using `tmux`.
@@ -81,29 +61,24 @@ Here's a [friendly guide](https://www.hamvocke.com/blog/a-quick-and-easy-guide-t
 `tmux attach-session -t <session name>` to reattach to a session.
 
-## Allocate resources
+## Invoke workflow
 
-Allocate some nodes for use:
+The general pattern for doing work with `open-gira` on ARC is to activate the
+environment (see above) and request a target file:
 ```
-salloc --ntasks-per-node=<n> --nodes=<n> --partition=<partition> --time=01:00:00 --mem=8000
+snakemake --profile config/ARC <target_file>
 ```
 
-## Invoke pipeline
-
-Having allocated resources with `salloc` (see above), you can then invoke
-snakemake to dispatch jobs and satisfy your target rule. From the open-gira
-repository call the command you wish to run, using the cluster specific
-profile. For more details on the cluster execution, see the config.yaml file
-in the profile directory. The general pattern is:
-```
-snakemake --profile config/arc_cluster
-```
+`snakemake` will then identify what work is required, issue job requests to
+`SLURM` and monitor the filesystem for completed results.
 
 To test the pipeline with a short job, try the following:
 ```
-snakemake --profile config/arc_cluster results/exposure/tanzania-mini_filter-road/hazard-aqueduct-river/img
+snakemake --profile config/ARC results/exposure/tanzania-mini_filter-road/hazard-aqueduct-river/img
 ```
 
+Resource allocation is defined per rule, with defaults in `config/ARC/config.yaml`.
+
 ## Interpreting errors
 
 Each submitted job will have its `stdout` logged to file. This is very useful
@@ -148,4 +123,5 @@ Traceback (most recent call last):
 FileNotFoundError: [Errno 2] No such file or directory: 'osmium'
 ```
 
-In particular, here, the `FileNotFoundError` says that the job runner couldn't find `osmium`, which is needed to run this rule.
+In particular, here, the `FileNotFoundError` says that the job runner couldn't
+find `osmium`, which is needed to run this rule.
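+
+To investigate further, the standard `SLURM` accounting tools can help. A
+couple of illustrative commands (plain `SLURM`, nothing `open-gira` specific;
+the angle-bracketed parts are placeholders):
+```
+# list your recent jobs with their states and run times
+sacct -u $USER --format=JobID,JobName%30,State,Elapsed
+# follow a rule's log as it is written (path pattern set in config/ARC/config.yaml)
+tail -f logs/<rule>/<rule>-<wildcards>-<jobid>.out
+```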
diff --git a/config/arc_cluster/config.yaml b/config/ARC/config.yaml
similarity index 53%
rename from config/arc_cluster/config.yaml
rename to config/ARC/config.yaml
index 43cd5804..4a900910 100644
--- a/config/arc_cluster/config.yaml
+++ b/config/ARC/config.yaml
@@ -8,28 +8,21 @@ cluster:
   --job-name=smk-{rule}-{wildcards}
   --output=logs/{rule}/{rule}-{wildcards}-%j.out
   --export=ALL
+  --cluster=htc
+  --mail-type=BEGIN,END,FAIL
+  --mail-user=fred.thomas@eci.ox.ac.uk
   --parsable
 default-resources:
   - qos=standard # {basic, standard, priority} only have credits for standard
   - partition=short # {short, medium, long, devel, interactive}
   - mem_mb=16000
-  - time="08:00:00" # maximum time for a single job
-restart-times: 1 # if a job fails, retry it once
-max-jobs-per-second: 16 # maximum jobs to _submit_ per second
+  - time="01:00:00" # maximum time for a single job
+max-jobs-per-second: 1
 max-status-checks-per-second: 1
-local-cores: 1
+restart-times: 1 # if a job fails, retry it once
 latency-wait: 15 # seconds to wait for files to appear before failing
-jobs: 64 # max simultaneous jobs
+jobs: 4 # max simultaneous jobs
 keep-going: True # do not stop workflow if job(s) fail
 rerun-incomplete: True
 printshellcmds: True
 scheduler: greedy
-cluster-status: status-sacct.sh # script to poll for job status
-use-conda: True # activate conda env prior to running any given rule
-# the following path is where an anaconda install (with mamba) is located, envs
-# installed here by snakemake or otherwise should be reusable across the group
-conda-prefix: /data/ouce-gri-jba/anaconda/envs
-# use `mamba` to create envs
-# micromamba support coming soon? see:
-# https://github.com/snakemake/snakemake/pull/1889
-conda-frontend: mamba
diff --git a/config/arc_cluster/status-sacct.sh b/config/ARC/status-sacct.sh
similarity index 100%
rename from config/arc_cluster/status-sacct.sh
rename to config/ARC/status-sacct.sh