Slurm Configuration
Breedbase uses the slurm system for running analysis and other jobs that may benefit from more computing power than the local virtual machine can provide.
The slurm system used in Breedbase is based on the Debian packages slurm-llnl and libslurm-perl. The latter provides the Slurm.pm object, which is used for querying the status of Slurm jobs.
By default, slurm runs jobs on localhost. The config file is included in the breedbase_dockerfile repo. It assumes a host name of localhost, so if the host name is different, this needs to be reflected in several locations in the /etc/slurm-llnl/slurm.conf file.
Importantly, the parameter SelectType needs to be set to select/cons_res, and SelectTypeParameters needs to be set to CR_Core. With the default settings, slurm will not run more than one job per node. The number of cores needs to be set at the end of the config file. Setting more cores than are available on the machine will render slurm non-functional.
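The settings described above can be sanity-checked before restarting slurm. The following is a minimal sketch, not part of Breedbase: it assumes the core count is declared with a CPUs= field on a NodeName line, that the node is the machine the check runs on, and that the parameter is spelled SelectTypeParameters, as slurm.conf expects. The function names and the sample text are hypothetical.

```python
import os
import re


def parse_slurm_conf(text):
    """Collect global Key=Value settings and the CPUs declared on NodeName lines."""
    settings = {}
    node_cpus = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        # slurm.conf keys are case-insensitive, so normalize them to lower case
        pairs = {k.lower(): v for k, v in re.findall(r"(\w+)=(\S+)", line)}
        if "nodename" in pairs:
            # Node definition, e.g. "NodeName=localhost CPUs=4 State=UNKNOWN"
            node_cpus[pairs["nodename"]] = int(pairs.get("cpus", 1))
        else:
            settings.update(pairs)
    return settings, node_cpus


def check_slurm_conf(text, available_cores=None):
    """Return a list of human-readable problems; an empty list means no issue was found."""
    if available_cores is None:
        available_cores = os.cpu_count() or 1
    settings, node_cpus = parse_slurm_conf(text)
    problems = []
    if settings.get("selecttype", "").lower() != "select/cons_res":
        problems.append("SelectType should be select/cons_res")
    if settings.get("selecttypeparameters", "").lower() != "cr_core":
        problems.append("SelectTypeParameters should be CR_Core")
    for node, cpus in node_cpus.items():
        if cpus > available_cores:
            problems.append(
                f"{node}: CPUs={cpus} exceeds the {available_cores} cores available"
            )
    return problems


# Illustrative fragment only; the real file in breedbase_dockerfile may differ.
SAMPLE = """
SelectType=select/cons_res
SelectTypeParameters=CR_Core
NodeName=localhost CPUs=4 State=UNKNOWN
PartitionName=debug Nodes=localhost Default=YES State=UP
"""

if __name__ == "__main__":
    for problem in check_slurm_conf(SAMPLE):
        print(problem)
```

To check a real installation, pass the contents of /etc/slurm-llnl/slurm.conf to check_slurm_conf on the machine that runs the jobs, so that os.cpu_count() reflects the cores actually available.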
To configure Breedbase to run jobs on another host, modify the sgn_local.conf file as follows:
backend RemoteSlurm
cluster_host [email protected]
The cluster host has to be specially configured to be able to run jobs:
- The cluster host must be accessed as the same user that the website runs as, which by default is www-data.
- The cluster and the virtual machine need to mount the same cluster_shared_tmp_dir (set in sgn_local.conf).
- The user needs to be able to log in through ssh using SSH keys, without a password prompt. (For example, for www-data, the keys need to be set up in /var/www/.ssh/id_rsa on the web server and in /var/www/.ssh/authorized_keys on the cluster host.)
- The environment should be set up in /var/www/.ssh/environment on the cluster host (for example, the $PATH and $PERL5LIB variables). Critically, check_slurm_job.pl, a script that checks the status of the slurm jobs (in the cxgn-corelibs/bin directory), needs to be in the $PATH so it can be executed by the slurm system.
- The cluster host needs a copy of the full Breedbase system installed. In the future, hopefully this can be achieved using a docker container, but not yet.
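
Several of the requirements above can be verified from the web server with a few ssh commands. The sketch below is an assumption-laden illustration, not a Breedbase tool: it reads simple "key value" lines from sgn_local.conf, then builds ssh commands that test key-based login, the remote user name, the presence of check_slurm_job.pl on the remote $PATH, and the existence of the shared temp directory. The host name, directory, and function names are hypothetical. It does not prove that both machines mount the same directory, and ~/.ssh/environment is normally only read when the cluster's sshd has PermitUserEnvironment enabled.

```python
import shlex
import subprocess


def parse_sgn_conf(text):
    """Read simple 'key value' lines; comments and <section> blocks are skipped."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("<"):
            continue
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            conf[parts[0]] = parts[1].strip()
    return conf


def build_remote_checks(conf, identity="/var/www/.ssh/id_rsa"):
    """Return {description: argv} for each prerequisite that ssh can test."""
    host = conf["cluster_host"]
    shared = conf["cluster_shared_tmp_dir"]
    # BatchMode=yes makes ssh fail instead of prompting for a password
    ssh = ["ssh", "-i", identity, "-o", "BatchMode=yes", host]
    return {
        "key-based login works": ssh + ["true"],
        "remote user name": ssh + ["id -un"],
        "check_slurm_job.pl is on PATH": ssh + ["command -v check_slurm_job.pl"],
        "shared tmp dir exists": ssh + ["test -d " + shlex.quote(shared)],
    }


def run_checks(checks):
    """Run each command and report whether it exited with status 0."""
    results = {}
    for description, argv in checks.items():
        done = subprocess.run(argv, capture_output=True, text=True, timeout=30)
        results[description] = done.returncode == 0
    return results


# Hypothetical values for illustration; use the ones from your own sgn_local.conf.
SAMPLE_CONF = """
backend RemoteSlurm
cluster_host www-data@cluster.example.org
cluster_shared_tmp_dir /export/shared/tmp
"""
```

Run the checks as the web server user (www-data by default), for example via sudo -u www-data, so that the same keys and account are exercised as when Breedbase submits a job.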