Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RoseTTAFold-All-Atom #220

Open
wants to merge 131 commits into
base: dev
Choose a base branch
from

Conversation

jscgh
Copy link

@jscgh jscgh commented Nov 20, 2024

Adds RoseTTAFold-All-Atom as a module per #197.

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

nbtm-sh and others added 30 commits July 29, 2024 13:32
…d-labels-to-gpu-processes

Add labels to GPU processes
…d-labels-to-gpu-processes

Add GPU Compute label
…d-labels-to-gpu-processes

Remove GPU Compute label from CPU pipeline
	modified:   conf/dbs.config
	modified:   modules/local/run_alphafold2.nf
Added variable links in dbs.config and run_alphafold2.nf
…d-dbs-variables-to-msa-pipeline

Make pipeline work on UNSW Katana
@jscgh jscgh changed the title [WIP] Add RoseTTAFold-All-Atom Add RoseTTAFold-All-Atom Nov 27, 2024
@jscgh jscgh marked this pull request as ready for review November 27, 2024 04:22
Copy link
Member

@JoseEspinosa JoseEspinosa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just left some minor suggestions but otherwise LGTM Awesome job @jscgh and @nbtm-sh! 🚀

@@ -53,7 +55,7 @@ nextflow run nf-core/proteinfold \
--outdir <OUTDIR>
```

The pipeline takes care of downloading the databases and parameters required by AlphaFold2, Colabfold or ESMFold. In case you have already downloaded the required files, you can skip this step by providing the path to the databases using the corresponding parameter [`--alphafold2_db`], [`--colabfold_db`] or [`--esmfold_db`]. Please refer to the [usage documentation](https://nf-co.re/proteinfold/usage) to check the directory structure you need to provide for each of the databases.
The pipeline takes care of downloading the databases and parameters required by AlphaFold2, Colabfold or ESMFold. In case you have already downloaded the required files, you can skip this step by providing the path to the databases using the corresponding parameter [`--alphafold2_db`], [`--colabfold_db`], [`--esmfold_db`] or ['--rosettafold_all_atom_db']. Please refer to the [usage documentation](https://nf-co.re/proteinfold/usage) to check the directory structure you need to provide for each of the databases.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The pipeline takes care of downloading the databases and parameters required by AlphaFold2, Colabfold or ESMFold. In case you have already downloaded the required files, you can skip this step by providing the path to the databases using the corresponding parameter [`--alphafold2_db`], [`--colabfold_db`], [`--esmfold_db`] or ['--rosettafold_all_atom_db']. Please refer to the [usage documentation](https://nf-co.re/proteinfold/usage) to check the directory structure you need to provide for each of the databases.
The pipeline takes care of downloading the databases and parameters required by AlphaFold2, Colabfold, ESMFold or RoseTTAFold-All-Atom. In case you have already downloaded the required files, you can skip this step by providing the path to the databases using the corresponding parameter [`--alphafold2_db`], [`--colabfold_db`], [`--esmfold_db`] or ['--rosettafold_all_atom_db']. Please refer to the [usage documentation](https://nf-co.re/proteinfold/usage) to check the directory structure you must provide for each database.

ext.prefix = File name prefix for output files.
----------------------------------------------------------------------------------------
*/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't we missing here the declaration to set the path to save the downloaded DBs and parameters? e.g. for af2:

withName: 'GUNZIP|COMBINE_UNIPROT|DOWNLOAD_PDBMMCIF|ARIA2_PDB_SEQRES' {
publishDir = [
path: {"${params.outdir}/DBs/alphafold2/${params.alphafold2_mode}"},
mode: 'symlink',
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
]

Comment on lines +32 to +36
process {
withName: 'RUN_ROSETTAFOLD_ALL_ATOM' {
container = '/srv/scratch/sbf-pipelines/proteinfold/singularity/rosettafold_all_atom.sif'
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
process {
withName: 'RUN_ROSETTAFOLD_ALL_ATOM' {
container = '/srv/scratch/sbf-pipelines/proteinfold/singularity/rosettafold_all_atom.sif'
}
}
process {
withName: 'RUN_ROSETTAFOLD_ALL_ATOM' {
container = 'biocontainers/gawk:5.1.0'
}

@@ -0,0 +1,41 @@
Bootstrap: docker
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the rest of modes, we are just providing the Dockerfiles and not the singularity definitions as we are not making them public. For consistency, we would need to create all the singularity definitions or delete this one. I would do the latter but we can discuss it if you think otherwise.


LABEL Author="[email protected]" \
title="nfcore/proteinfold_rosettafold_all_atom" \
Version="0.9.0" \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Version="0.9.0" \
Version="1.2.0dev" \

@@ -214,6 +229,7 @@ profiles {
apptainer {
apptainer.enabled = true
apptainer.autoMounts = true
if (params.use_gpu) { apptainer.runOptions = '--nv' }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will make the Nextflow language server happy

Suggested change
if (params.use_gpu) { apptainer.runOptions = '--nv' }
params.use_gpu ? '--nv' : apptainer.runOptions

@@ -80,7 +80,6 @@
},
"full_dbs": {
"type": "boolean",
"default": false,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is set to false by default isn't it?

full_dbs = false // true full_dbs, false reduced_dbs

Suggested change
"default": false,
"default": false,

@@ -675,5 +688,42 @@
{
"$ref": "#/$defs/generic_options"
}
]
],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, maybe create a rosettafold_all_all_atom_dbs_and_parameters_path_options and a rosettafold_all_all_atom_dbs_and_parameters_link_options definitions, actually for modifying the schema you can use nf-core pipelines schema build, see here, which provides a GUI in case it would be handy for you 😄

//
// MODULE: Loaded from modules/local/
//
include { RUN_ROSETTAFOLD_ALL_ATOM } from '../modules/local/run_rosettafold_all_atom'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
include { RUN_ROSETTAFOLD_ALL_ATOM } from '../modules/local/run_rosettafold_all_atom'
include { RUN_ROSETTAFOLD_ALL_ATOM } from '../modules/local/run_rosettafold_all_atom'

Comment on lines +61 to +62
ch_pdb = ch_pdb.mix(RUN_ROSETTAFOLD_ALL_ATOM.out.pdb)
ch_versions = ch_versions.mix(RUN_ROSETTAFOLD_ALL_ATOM.out.versions)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ch_pdb = ch_pdb.mix(RUN_ROSETTAFOLD_ALL_ATOM.out.pdb)
ch_versions = ch_versions.mix(RUN_ROSETTAFOLD_ALL_ATOM.out.versions)
ch_pdb = ch_pdb.mix(RUN_ROSETTAFOLD_ALL_ATOM.out.pdb)
ch_versions = ch_versions.mix(RUN_ROSETTAFOLD_ALL_ATOM.out.versions)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants