Update cambridge.md #573

Merged 8 commits into from Oct 30, 2023

54 changes: 51 additions & 3 deletions docs/cambridge.md
To use, run the pipeline with `-profile cambridge`. This will download and launch the pipeline with a setup suitable for the Cambridge HPC cluster. Using this profile, either a Docker image containing all of the required software will be downloaded and converted to a Singularity image, or a Singularity image will be downloaded directly, before execution of the pipeline.

### Install Nextflow

The latest version of Nextflow is not installed by default on the Cambridge HPC cluster CSD3. You will need to install it into a directory you have write access to.
Follow [these instructions](https://www.nextflow.io/docs/latest/getstarted.html#) from the Nextflow documentation.
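Behind the scenes, the profile tells Nextflow to run tasks through Singularity. A minimal, illustrative configuration fragment doing the equivalent would look like the following (this is a sketch, not the actual contents of the `cambridge` profile):

```
// illustrative Nextflow config fragment; the real cambridge profile may differ
singularity {
    enabled    = true
    autoMounts = true
}
```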

```
# move to desired directory on HPC
cd /home/<username>/path/to/dir
# get the newest version
wget -qO- https://get.nextflow.io | bash
```
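The installer drops a `nextflow` launcher script into the current directory. To call it from anywhere, make it executable and prepend its directory to `PATH` (the path below is a placeholder; substitute your actual install directory):

```
# make the launcher executable and put it on PATH
chmod +x nextflow
export PATH="/home/<username>/path/to/dir:$PATH"
# confirm the install worked
nextflow -version
```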

### Obtain an updated Java version

The Java version installed by default on the HPC is too old for Nextflow, so install a newer JDK into your home directory.

```
cd ~/
wget https://download.oracle.com/java/20/latest/jdk-20_linux-x64_bin.tar.gz
tar xvfz jdk-20_linux-x64_bin.tar.gz
# add these lines to .bashrc
export JAVA_HOME=/home/<username>/jdk-20.0.1
export PATH=/home/<username>/jdk-20.0.1/bin:$PATH

# once the above is done, "java --version" should return:
java --version
java 20.0.1 2023-04-18
```
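One way to persist those two environment lines is to append them to `~/.bashrc` and reload it (the JDK path is a placeholder for your own install location):

```
# append the Java environment settings to ~/.bashrc and reload the shell config
cat >> ~/.bashrc <<'EOF'
export JAVA_HOME=/home/<username>/jdk-20.0.1
export PATH=/home/<username>/jdk-20.0.1/bin:$PATH
EOF
source ~/.bashrc
```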

### Set up Singularity

Singularity allows the use of containers. You also need to make a directory to store the Singularity cache.

```
module load singularity
# make a directory for the cache
mkdir -p /home/<username>/rds/hpc-work/path/to/cache/dir
```

### Run Nextflow

Here is an example with the nf-core pipeline sarek ([read documentation here](https://nf-co.re/sarek/3.3.2)).
The user supplies the project name, the partition, and the cache directory for Singularity.

```
# create a working directory in rds/hpc-work
mkdir /home/<username>/rds/hpc-work/dir/to/test
cd /home/<username>/rds/hpc-work/dir/to/test

# point Nextflow at the Singularity cache directory
export NXF_SINGULARITY_CACHEDIR="/home/<username>/rds/hpc-work/path/to/cache/dir"

# Launch the nf-core pipeline for a test database
# with the Cambridge profile
nextflow run nf-core/sarek -profile test,cambridge --partition "cclake" --project "NAME-SL3-CPU" --outdir nf-sarek-test
```

All of the intermediate files required to run the pipeline are stored in the `work/` directory. It is recommended to delete this directory after the pipeline has finished successfully, because it can get quite large, and all of the main output files are saved in the directory given by `--outdir` anyway.
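Cleaning up can be done with a plain `rm`, or with Nextflow's own `clean` subcommand, which also tidies the run metadata it keeps:

```
# remove intermediate files after a successful run
rm -rf work/
# alternatively, let Nextflow delete the work files of previous runs itself
nextflow clean -f
```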