Tutorial tweaks
 - typos
 - make the cluster step testable in jupytext
 - configure installer to use strict channel priorities

Co-authored-by: Anika John <[email protected]>
Co-authored-by: Lara Fuhrmann <[email protected]>
3 people committed Nov 2, 2022
1 parent 650edce commit 1df85e9
Showing 5 changed files with 41 additions and 7 deletions.
2 changes: 2 additions & 0 deletions README.md
@@ -31,6 +31,8 @@ general:
Also see [snakemake's documentation](https://snakemake.readthedocs.io/en/stable/executing/cli.html) to learn more about the command-line options available when executing the workflow.
Tutorials introducing usage of V-pipe are available in the [docs/](docs/README.md) subdirectory.
### Using quick install script
To deploy V-pipe, use the [installation script](utils/README.md#quick-installer) with the following parameters:
4 changes: 2 additions & 2 deletions docs/README.md
@@ -14,9 +14,9 @@ cd tutorial/work/
# do something
cd ../..
```
- Of course you don't necessarily need to do that. You can simply remainin the working directory.
+ Of course you don't necessarily need to do that. You can simply remain in the working directory.

- When editing files like `config.yaml`, you can use your favorite editor (`vim`, `emacs`, `nano`, [butterflies](https://xkcd.com/378/), etc.). By default our tutorial use heredocs to make it easier to copy-paste the blocks into bash:
+ When editing files like `config.yaml`, you can use your favorite editor (`vim`, `emacs`, `nano`, [butterflies](https://xkcd.com/378/), etc.). By default our tutorials use a [_heredoc_](https://en.wikipedia.org/wiki/Here_document) to make it easier to copy-paste the blocks into bash:

```bash
cat > config.yaml <<EOF
...
EOF
```
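As a complete, runnable sketch of this heredoc pattern, the following writes a minimal `config.yaml` in one shot; the `virus_base_config` key and its value are illustrative assumptions, not V-pipe's exact schema:

```shell
# everything between <<EOF and the closing EOF is written to config.yaml verbatim
cat > config.yaml <<EOF
general:
  virus_base_config: sars-cov-2
EOF

# show the file that was just written
cat config.yaml
```

Because the whole block (redirect, body, and terminator) is plain shell, it can be pasted into a terminal as one unit, which is exactly why the tutorials prefer it over "open your editor" instructions.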
2 changes: 1 addition & 1 deletion docs/tutorial_hiv.md
@@ -7,7 +7,7 @@ jupyter:
extension: .md
format_name: markdown
format_version: '1.3'
- jupytext_version: 1.13.1
+ jupytext_version: 1.14.0
kernelspec:
display_name: Python 3
language: python
39 changes: 35 additions & 4 deletions docs/tutorial_sarscov2.md
@@ -7,7 +7,7 @@ jupyter:
extension: .md
format_name: markdown
format_version: '1.3'
- jupytext_version: 1.13.1
+ jupytext_version: 1.14.0
kernelspec:
display_name: Python 3
language: python
@@ -218,11 +218,39 @@ The opensource platform SLURM by SchedMD is one of the popular systems you might
The most user-friendly way to submit jobs to the cluster is to use a special _snakemake profile_.
[smk-simple-slurm](https://github.com/jdblischak/smk-simple-slurm) is a profile that, in our experience, works well with SLURM (for other platforms, see the suggestions in [the snakemake-profiles documentation](https://github.com/snakemake-profiles/doc)).
```bash
cd tutorial/
# download the profile
git clone https://github.com/jdblischak/smk-simple-slurm.git
# edit simple/config.yaml and either comment out the partition and qos or adapt to your local HPC
cat > smk-simple-slurm/simple/config.yaml <<EOT
cluster:
mkdir -p logs/{rule} &&
sbatch
--cpus-per-task={threads}
--mem={resources.mem_mb}
--job-name=smk-{rule}-{wildcards}
--output=logs/{rule}/{rule}-{wildcards}-%j.out
#--partition={resources.partition}
#--qos={resources.qos}
default-resources:
#- partition=<name-of-default-partition>
#- qos=<name-of-quality-of-service>
- mem_mb=1000
restart-times: 3
max-jobs-per-second: 10
max-status-checks-per-second: 1
local-cores: 1
latency-wait: 60
jobs: 500
keep-going: True
rerun-incomplete: True
printshellcmds: True
scheduler: greedy
use-conda: True
EOT
cd work/
- ./vpipe --dry-run --profile ../smk-simple-slurm --jobs 100
+ ./vpipe --dry-run --profile ../smk-simple-slurm/simple/ --jobs 100
cd ../..
```
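To sanity-check what the profile's `cluster:` template expands to, you can substitute sample values by hand. The rule name, thread count, and wildcards below are illustrative, not taken from a real run:

```shell
# mimic the placeholder substitution snakemake performs on the cluster template
rule="hmm_align"; threads=4; mem_mb=1000; wildcards="sample1"

# the profile's `mkdir -p logs/{rule}` step: the log directory must exist
# before submission, or SLURM cannot write the job's output file
mkdir -p "logs/${rule}"

# the sbatch command line that would be generated for this rule
echo "sbatch --cpus-per-task=${threads} --mem=${mem_mb} --job-name=smk-${rule}-${wildcards} --output=logs/${rule}/${rule}-${wildcards}-%j.out"
```

Here `%j` is expanded by SLURM itself (to the job ID) at submission time, which is why it stays literal in the template.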

@@ -236,9 +264,12 @@ In addition, Snakemake has [parameters for conda](https://snakemake.readthedocs.io/en/stable/snakefiles/deployment.html) that can be useful:
- using `--conda-create-envs-only` downloads the dependencies and creates their environments without running the pipeline itself. This is very useful if the compute nodes of your cluster are not allowed internet access.
- using `--conda-prefix=`_{DIR}_ stores the conda environments of dependencies in a common directory (making it possible to share and re-use them between multiple instances of V-pipe).

```bash
cd tutorial/work/
# First download all bioconda dependencies ahead of time
./vpipe --conda-prefix ../snake-envs --cores 1 --conda-create-envs-only
# And then run on the cluster, the compute node will not need to download anything
./vpipe --dry-run --conda-prefix ../snake-envs --profile ../smk-simple-slurm/simple/ --jobs 100
cd ../..
```

1 change: 1 addition & 0 deletions utils/quick_install.sh
@@ -178,6 +178,7 @@ sh ${MINICONDA} -b -p miniconda3
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda config --set channel_priority strict
# NOTE conda-forge *HAS TO* be higher than bioconda
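For reference, these four `conda config` calls leave the user's `.condarc` equivalent to the fragment below. Each `--add channels` prepends to the list, which is why conda-forge, added last, ends up on top, satisfying the NOTE above:

```yaml
channel_priority: strict
channels:
  - conda-forge
  - bioconda
  - defaults
```

With `channel_priority: strict`, conda only ever takes a package from the highest-priority channel that provides it, instead of mixing channels by version number.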

VPIPEENV=
