
Triton agony aunt


So you want to use Triton ...

Well, the first thing to do is to set up Triton. Thankfully, there already exists a great resource from the Aalto Scientific Computing team.

Interfacing with Triton

One can either work solely through the command line, or integrate seamlessly with VSCode over SSH.
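For SSH access, a minimal entry in your ~/.ssh/config makes life easier for both the terminal and the VSCode Remote-SSH extension. Below is a sketch; the triton.aalto.fi hostname and the username are assumptions you should adjust to your own account.

# ~/.ssh/config
Host triton
    HostName triton.aalto.fi
    User my_aalto_username

After this, ssh triton (or picking "triton" from VSCode's Remote-SSH host list) drops you onto a login node.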

Conda environments

It is naturally best practice to use environments when dealing with as complex a beast as Triton. This goes for both Python and R, and in both cases one must specify where packages are installed. A comprehensive tutorial can be found here. If using R, packages can be listed as

# environment.yml
name: my_project
channels:
 - conda-forge
dependencies:
 - r-base
 - r-brms
 - r-tidyverse
 - r-devtools

and then an environment can be created by running

module load miniconda
conda env create --file environment.yml
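For a Python project, the same workflow applies; a minimal sketch of an equivalent environment.yml might look like the following (the package choices here are purely illustrative).

# environment.yml
name: my_project
channels:
 - conda-forge
dependencies:
 - python=3.11
 - numpy
 - pandas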

These environments are then activated in each individual shell script by adding

module load miniconda
source activate my_project

A general project template

All of the main work should be executed from one's /scratch/work/ directory, since it offers far more storage space than one's home folder. A nice template structure for an individual project might look like the following (with .conda_envs and .conda_pkgs initialised automatically by following this tutorial):

/scratch/work/
├── .conda_envs
├── .conda_pkgs
└── my_project
    ├── data
    │   └── experiment_data.csv
    ├── img
    │   └── experiment.pdf
    ├── R
    │   └── experiment.R
    ├── out
    │   └── experiment.out
    ├── shell
    │   └── experiment.sh
    └── environment.yml

wherein our experiment.R file runs an experiment, perhaps something of the form

set.seed(300416)

# define experiment function 
experiment <- function (mu) {
  y <- rnorm(100, mu, 1)
  res <- data.frame(y = y, mu = mu)
  return (res)
}

# define different values
mus <- c(0, 1, 10)

# perform experiment
res <- parallel::mcMap(
  f = experiment,
  mu = mus,
  # use the number of CPUs defined in the shell script
  mc.cores = as.integer(Sys.getenv("SLURM_CPUS_PER_TASK"))
)

# concatenate experiment results
df <- do.call("rbind", res)

# write the table
setwd("/scratch/work/<my-user-name>/<project-root>")
csv_name <- "./data/experiment_data.csv"
ff <- file(csv_name, open="w")
write.table(df, file = ff, sep = ",", row.names = FALSE)
close(ff)

This script is then called by the following shell file:

#!/bin/bash
#SBATCH --time=00:10:00
#SBATCH --cpus-per-task=3
#SBATCH --mem=1G
#SBATCH --output=./out/experiment.out

# Set the number of OpenMP-threads to 1,
# as we're using parallel for parallelization
export OMP_NUM_THREADS=1

# Load the version of R you want to use
module load miniconda
source activate my_project

# Run R scripts
srun Rscript ./R/experiment.R
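The job itself is submitted from the project root (so that the relative ./out, ./data, and ./R paths resolve correctly) with sbatch:

cd /scratch/work/<my-user-name>/<project-root>
sbatch ./shell/experiment.sh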

Saving data

Note in the above example that, for a parallelised process, we first concatenate the data from all parallel runs before writing to a single csv. This helps avoid several parallel workers trying to write to the same file at once.
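Downstream scripts can then simply read this single csv back in. For instance, a small plotting script producing the img/experiment.pdf from the template might look like this (a minimal sketch; the ggplot2 calls are purely illustrative):

library(ggplot2)

df <- read.csv("./data/experiment_data.csv")

# one histogram of y per value of mu
p <- ggplot(df, aes(x = y)) +
  geom_histogram(bins = 30) +
  facet_wrap(~ mu)

ggsave("./img/experiment.pdf", p)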

Cookiecutter template

To set up a research project utilising Triton, you can use this cookiecutter template to initialise a directory with (roughly) the structure proposed above. In a nutshell, cookiecutter creates projects from templates (a template is itself called a cookiecutter) and aims to reduce the amount of boilerplate code you need to write.

You can install cookiecutter with pip:

pip install cookiecutter

and then run

cookiecutter gh:LeeviLindgren/cookiecutter-R-triton

in the folder in which you wish to create the new project. Cookiecutter will ask you to fill in some details, such as your name and the project name, and will then create a new directory with the name you provided, containing a README.md file that should help get you started. For instance, it shows how to run an example job on Triton (or any other cluster using Slurm).

Pre-Configured Stan Image

Given the multitude of installation approaches and optimisations available for Stan and its interfaces, we've put together a Docker image with the latest versions of cmdstanr and rstan installed and maximally optimised.

Triton's interface to containers (e.g., Docker) is Singularity, which can also run Docker containers with no extra configuration. The image and its R packages are stored in the /scratch/cs/bayes_ave folder, so you don't need to build or store the image in your own storage directory.

Using the Image

Using the Docker image requires minimal changes to an existing analysis script. As an example, take the following script which executes an R analysis without the image:

#!/bin/bash
#SBATCH --time=00:10:00
#SBATCH --cpus-per-task=3
#SBATCH --mem=1G
#SBATCH --output=./out/experiment.out

# Set the number of OpenMP-threads to 1,
# as we're using parallel for parallelization
export OMP_NUM_THREADS=1

# Run R scripts
srun Rscript ./R/experiment.R

To run this analysis using the image, we simply replace the srun command with a call to the Singularity container:

#!/bin/bash
#SBATCH --time=00:10:00
#SBATCH --cpus-per-task=3
#SBATCH --mem=1G
#SBATCH --output=./out/experiment.out

# Set the number of OpenMP-threads to 1,
# as we're using parallel for parallelization
export OMP_NUM_THREADS=1

# Run R scripts
singularity run -B /scratch,/m,/l,/share /scratch/cs/bayes_ave/stan-triton.sif Rscript ./R/experiment.R

The command:

singularity run -B /scratch,/m,/l,/share /scratch/cs/bayes_ave/stan-triton.sif

will execute the Rscript command within the container. You can replace the Rscript ... command with whichever command you want to run in the container. The -B /scratch,/m,/l,/share option mounts Triton's filesystems within the image so that you can access your work directory, and so that cmdstanr can access the CmdStan installation in /scratch/cs/bayes_ave.
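If you instead want an interactive session inside the container (for example, to install R packages as described below), Singularity's shell subcommand works with the same bind mounts:

singularity shell -B /scratch,/m,/l,/share /scratch/cs/bayes_ave/stan-triton.sif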

R Packages

Because the compiler toolchain and system libraries differ between the image and the Triton cluster itself, mixing R packages built under Triton with those built under the image can result in errors when loading them. Instead, make sure that you only install packages using the image, and do not mount your own package library. The image is configured to install packages within your /scratch directory, but you need to create this folder first, otherwise R will not recognise it.

To do so, run:

mkdir -p /scratch/work/${USER}/stan-triton/R/library

Then you can install and use R packages as usual.
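For example, you could start an interactive R session inside the image on a login node and install a package in the ordinary way (the package name here is just an illustration; installs should then end up in the /scratch library created above):

singularity run -B /scratch,/m,/l,/share /scratch/cs/bayes_ave/stan-triton.sif R

# then, at the R prompt inside the container:
install.packages("posterior")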

Installing rstan

Pre-built packages for the latest versions of rstan and StanHeaders are provided for easy use with the image. To install them, run:

# Need to make sure dependencies are already available:
install.packages(c("RcppParallel","RcppEigen","inline","gridExtra","loo","V8","BH"))

# Install binaries
install.packages("/scratch/cs/bayes_ave/rstan/StanHeaders_2.32.1.9000_R_x86_64-pc-linux-gnu.tar.gz", repos = NULL)
install.packages("/scratch/cs/bayes_ave/rstan/rstan_2.32.1.9000_R_x86_64-pc-linux-gnu.tar.gz", repos = NULL)

OpenCL Acceleration

To use OpenCL/GPU acceleration within the image, you need to both request a GPU partition in your analysis script and launch the image with GPU support. Using the same example script from above, this is done by adding #SBATCH --gres=gpu:1 to the configuration header, and adding --nv to the singularity call:

#!/bin/bash
#SBATCH --time=00:10:00
#SBATCH --cpus-per-task=3
#SBATCH --mem=1G
#SBATCH --gres=gpu:1
#SBATCH --output=./out/experiment.out

# Set the number of OpenMP-threads to 1,
# as we're using parallel for parallelization
export OMP_NUM_THREADS=1

# Run R scripts
singularity run --nv -B /scratch,/m,/l,/share /scratch/cs/bayes_ave/stan-triton.sif Rscript ./R/experiment.R

The --nv option is critical here; otherwise Triton will not provide GPU access to the container.

For a quick tutorial on GPU acceleration in cmdstanr, see this article: https://mc-stan.org/cmdstanr/articles/opencl.html
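On the R side, OpenCL is switched on per model through cmdstanr's cpp_options, and the device is selected with the opencl_ids argument to $sample(). Below is a minimal sketch following that vignette; the model file name and data list are placeholders.

library(cmdstanr)

# compile the model with OpenCL support enabled
mod <- cmdstan_model("model.stan", cpp_options = list(stan_opencl = TRUE))

# sample on the first OpenCL platform and device
fit <- mod$sample(data = data_list, opencl_ids = c(0, 0))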

Exposing Triton environment variables and saving slurm output

It is sometimes useful to access the job ID, array ID, or the number of cores requested by a Triton job from within your R script. Well, you're in luck: these three can be retrieved painlessly with

job_id <- Sys.getenv("SLURM_ARRAY_JOB_ID", 0)
array_id <- Sys.getenv("SLURM_ARRAY_TASK_ID", 0)
ncores <- as.integer(Sys.getenv("SLURM_JOB_CPUS_PER_NODE", 2))
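For instance, in an array job each task can use its array ID to pick out one configuration from a grid of parameter values. Here is a sketch reusing the mus grid and the experiment() function from the script above, and assuming the array indices start at 0.

# experiment() and mus as defined in the experiment script above
mus <- c(0, 1, 10)

# SLURM_ARRAY_TASK_ID arrives as a string, so coerce it
array_id <- as.integer(Sys.getenv("SLURM_ARRAY_TASK_ID", 0))
mu <- mus[array_id + 1]

res <- experiment(mu)

# each task writes its own file, so there are no clashes between tasks
write.csv(res, sprintf("./data/experiment_data_%d.csv", array_id), row.names = FALSE)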

Once we have run the Triton job from the root project directory, Slurm will automatically spit out all output into a slurm-%A_%a.out file (%A here being the job ID and %a the array task ID) and dump this file in your root directory. So as not to clog up your root directory with such slurm*.out files, create a specific directory into which you wish these files to be dumped, say slurm_output, and add the following to the header of the bash script executing the job

#SBATCH --output=slurm_output/%A_%a.out
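Putting this together, the header of a bash script launching an array job over the three mu values above might look like this (a sketch; remember to create the slurm_output directory beforehand):

#!/bin/bash
#SBATCH --time=00:10:00
#SBATCH --mem=1G
#SBATCH --array=0-2
#SBATCH --output=slurm_output/%A_%a.out

module load miniconda
source activate my_project

srun Rscript ./R/experiment.R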