Added a test profile based on public data #56

Merged: 5 commits, Aug 19, 2024
15 changes: 9 additions & 6 deletions .github/workflows/test.yml
@@ -1,11 +1,11 @@
# Adopted from https://github.com/nf-core/modules/blob/master/.github/workflows/test.yml

name: Lint and -stub on Linux/Docker
name: CI tests
on:
push:
branches: [main]
branches:
- dev
pull_request:
branches: [main]

# Cancel if a newer run is started
concurrency:
@@ -30,7 +30,7 @@ jobs:
- name: Run pre-commit
run: pre-commit run --all-files

stub-test:
test:
runs-on: ubuntu-latest
name: Run stub test with docker
env:
@@ -44,17 +44,20 @@ jobs:
with:
version: "23.04.4"

- name: Disk space cleanup
uses: jlumbroso/free-disk-space@54081f138730dfa15788a46383842cd2f914a1be # v1.3.1

- name: Run stub-test
run: |
nextflow run \
main.nf \
-profile local,docker \
-profile docker \
-stub \
-params-file tests/stub/params.json

confirm-pass:
runs-on: ubuntu-latest
needs: [pre-commit, stub-test]
needs: [pre-commit, test]
if: always()
steps:
- name: All tests ok
9 changes: 7 additions & 2 deletions CHANGELOG.md
@@ -3,7 +3,7 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## 0.4.0 - [07-Aug-2024]
## 0.4.0+dev - [19-Aug-2024]

### `Added`

@@ -24,6 +24,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
15. Reduced `BRAKER3` threads to 8 [#55](https://github.com/PlantandFoodResearch/pangene/issues/55)
16. Now the final annotations are stored in the `annotations` folder [#53](https://github.com/PlantandFoodResearch/pangene/issues/53)
17. Added `-gff` flag to `REPEATMASKER` to save the gff file [#54](https://github.com/PlantandFoodResearch/pangene/issues/54)
18. Now a single `fasta` file can be directly specified for `protein_evidence`
19. `eggnogmapper_db_dir` is not a required parameter anymore
20. `eggnogmapper_tax_scope` is now set to 1 (root div) by default
21. Added a `test` profile based on public data

### `Fixed`

@@ -46,7 +50,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
6. Removed dependency on <https://github.com/kherronism/nf-modules.git> for `BRAKER3` and `REPEATMASKER` modules which are now installed from <https://github.com/GallVp/nxf-components.git>
7. Removed dependency on <https://github.com/PlantandFoodResearch/nxf-modules.git>
8. Now the final annotations are not stored in the `final` folder
9. Now BRAKER3 outputs are not saved by default [#53](https://github.com/PlantandFoodResearch/pangene/issues/53)
9. Now BRAKER3 outputs are not saved by default [#53](https://github.com/PlantandFoodResearch/pangene/issues/53) and are saved under the `etc` folder when enabled
10. Removed `local` profile. Local executor is the default when no executor is specified. Therefore, the `local` profile was not needed.

## 0.3.3 - [18-Jun-2024]

31 changes: 0 additions & 31 deletions conf/base.config
@@ -1,34 +1,3 @@
profiles {
pfr {
process {
executor = 'slurm'
}

apptainer {
envWhitelist = 'APPTAINER_BINDPATH,APPTAINER_BIND'
cacheDir = "/workspace/pangene/singularity"
}
}

local {
process {
executor = 'local'
}
}

apptainer {
apptainer.enabled = true
apptainer.autoMounts= true
apptainer.registry = 'quay.io'
}

docker {
docker.enabled = true
docker.runOptions = '-u $(id -u):$(id -g) --platform=linux/amd64'
docker.registry = 'quay.io'
}
}

process {

cpus = { check_max( 1 * task.attempt, 'cpus' ) }
4 changes: 2 additions & 2 deletions conf/modules.config
@@ -146,7 +146,7 @@ process { // SUBWORKFLOW: FASTA_BRAKER3
].flatten().unique(false).join(' ').trim()
ext.prefix = { "${meta.id}" }
publishDir = [
path: { "${params.outdir}/braker/" },
path: { "${params.outdir}/etc/braker/" },
mode: "copy",
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
enabled: params.braker_save_outputs
@@ -335,7 +335,7 @@ process { // Universal

withName: SAVE_MARKED_GFF3 {
publishDir = [
path: { "${params.outdir}/splicing_marked" },
path: { "${params.outdir}/etc/splicing_marked" },
mode: "copy",
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
]
7 changes: 7 additions & 0 deletions conf/test.config
@@ -0,0 +1,7 @@
params {
input = "${projectDir}/tests/minimal/assemblysheet.csv"
protein_evidence = 'https://raw.githubusercontent.com/Gaius-Augustus/BRAKER/f58479fe5bb13a9e51c3ca09cb9e137cab3b8471/example/proteins.fa'

braker_extra_args = '--gm_max_intergenic 10000 --skipOptimize' // Added for faster test execution! Do not use with actual data!
busco_lineage_datasets = 'eudicots_odb10'
}
6 changes: 3 additions & 3 deletions docs/parameters.md
@@ -7,9 +7,9 @@ A NextFlow pipeline for pan-genome annotation
| Parameter | Description | Type | Default | Required | Hidden |
| ------------------------- | -------------------------------------------------------------------------------------------------------- | --------- | --------- | -------- | ------ |
| `input` | Target assemblies listed in a CSV sheet | `string` | | True | |
| `protein_evidence` | Protein evidence provided as fasta files listed in a text sheet | `string` | | True | |
| `eggnogmapper_db_dir` | Eggnogmapper database directory | `string` | | True | |
| `eggnogmapper_tax_scope` | Eggnogmapper taxonomy scope. Eukaryota: 2759, Viridiplantae: 33090, Archaea: 2157, Bacteria: 2, root: 1 | `integer` | | True | |
| `protein_evidence` | Protein evidence provided as a fasta file or multiple fasta files listed in a plain txt file | `string` | | True | |
| `eggnogmapper_db_dir` | Eggnogmapper database directory | `string` | | | |
| `eggnogmapper_tax_scope` | Eggnogmapper taxonomy scope. Eukaryota: 2759, Viridiplantae: 33090, Archaea: 2157, Bacteria: 2, root: 1 | `integer` | 1 | | |
| `rna_evidence` | FASTQ/BAM samples listed in a CSV sheet | `string` | | | |
| `liftoff_annotations` | Reference annotations listed in a CSV sheet | `string` | | | |
| `orthofinder_annotations` | Additional annotations for orthology listed in a CSV sheet | `string` | | | |
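Reviewer note: the taxonomy scopes named in the `eggnogmapper_tax_scope` description can be written out as a small lookup. This is a Python sketch for orientation only, not pipeline code. The names `TAX_SCOPES`, `DEFAULT_TAX_SCOPE` and `resolve_tax_scope` are hypothetical; the numeric IDs and the default of 1 are taken from the table rows in this hunk.

```python
# Taxonomy scope IDs exactly as listed in the parameter description.
TAX_SCOPES = {
    "root": 1,
    "Bacteria": 2,
    "Archaea": 2157,
    "Eukaryota": 2759,
    "Viridiplantae": 33090,
}

# Default introduced by this PR (root).
DEFAULT_TAX_SCOPE = TAX_SCOPES["root"]


def resolve_tax_scope(name=None):
    """Return the numeric scope for a clade name, or the default when none is given.

    Hypothetical helper for illustration; the pipeline itself takes the integer directly.
    """
    if name is None:
        return DEFAULT_TAX_SCOPE
    return TAX_SCOPES[name]
```

With this sketch, omitting the name falls back to the root scope, which mirrors the parameter no longer being required.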
6 changes: 4 additions & 2 deletions local_pangene
@@ -14,8 +14,10 @@ F_BOLD="\033[1m"

nextflow run \
main.nf \
-profile local,docker \
-profile docker,test \
-resume \
$stub \
-params-file pangene-test/params.json \
--max_cpus 8 \
--max_memory '32.GB' \
--eggnogmapper_tax_scope 33090 \
--eggnogmapper_db_dir ../dbs/emapperdb/5.0.2
2 changes: 1 addition & 1 deletion modules/local/utils.nf
@@ -4,7 +4,7 @@ def idFromFileName(fileName) {
).replaceFirst(
/\.f(ast)?q$/, ''
).replaceFirst(
/\.f(asta|sa|a|as|aa)?$/, ''
/\.f(asta|sa|a|as|aa|na)?$/, ''
).replaceFirst(
/\.gff(3)?$/, ''
).replaceFirst(
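Reviewer note: the effect of adding `na` to the FASTA alternation can be checked with a rough Python equivalent of the suffix-stripping chain. This sketch covers only the three patterns visible in this hunk; the step before them and any steps after them are not shown in the diff and are left out. The name `id_from_file_name` is a hypothetical stand-in for the Groovy `idFromFileName`.

```python
import re

# Suffix patterns visible in the hunk, applied in order.
# Each one strips at most one match, like Groovy's replaceFirst.
_SUFFIX_PATTERNS = [
    r"\.f(ast)?q$",                # .fq, .fastq
    r"\.f(asta|sa|a|as|aa|na)?$",  # .fasta, .fsa, .fa, .fas, .faa and, new in this PR, .fna
    r"\.gff(3)?$",                 # .gff, .gff3
]


def id_from_file_name(file_name):
    """Strip known sequence/annotation suffixes from a file name (partial sketch)."""
    result = file_name
    for pattern in _SUFFIX_PATTERNS:
        result = re.sub(pattern, "", result, count=1)
    return result
```

Without the `na` alternative, a name such as `genome.fna` would pass through this chain unchanged, so the extension would end up in the derived ID.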
50 changes: 33 additions & 17 deletions nextflow.config
@@ -1,11 +1,9 @@
includeConfig './conf/base.config'

params {
// Input/output options
input = null
protein_evidence = null
eggnogmapper_db_dir = null
eggnogmapper_tax_scope = null
eggnogmapper_tax_scope = 1
rna_evidence = null
liftoff_annotations = null
orthofinder_annotations = null
@@ -21,20 +19,20 @@ params {
skip_fastqc = false
skip_fastp = false
min_trimmed_reads = 10000
extra_fastp_args = ""
extra_fastp_args = null
save_trimmed = false
remove_ribo_rna = false
save_non_ribo_reads = false
ribo_database_manifest = "${projectDir}/assets/rrna-db-defaults.txt"

// RNAseq alignment options
star_max_intron_length = 16000
star_align_extra_args = ""
star_align_extra_args = null
star_save_outputs = false
save_cat_bam = false

// Annotation options
braker_extra_args = ""
braker_extra_args = null
braker_save_outputs = false
liftoff_coverage = 0.9
liftoff_identity = 0.9
@@ -59,15 +57,26 @@ params {
validationS3PathCheck = true
}

manifest {
name = 'pangene'
author = """Usman Rashid, Jason Shiller"""
homePage = 'https://github.com/PlantandFoodResearch/pangene'
description = """A NextFlow pipeline for pan-genome annotation"""
mainScript = 'main.nf'
nextflowVersion = '!>=23.04.4'
version = '0.4.0'
doi = ''
includeConfig './conf/base.config'

profiles {
apptainer {
apptainer.enabled = true
apptainer.autoMounts = true
apptainer.registry = 'quay.io'
}

docker {
docker.enabled = true
docker.runOptions = '-u $(id -u):$(id -g) --platform=linux/amd64'
docker.registry = 'quay.io'
}

test { includeConfig 'conf/test.config' }
}

plugins {
id '[email protected]'
}

def trace_timestamp = new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss')
@@ -84,8 +93,15 @@ trace {
file = "${params.outdir}/pipeline_info/execution_trace_${trace_timestamp}.txt"
}

plugins {
id '[email protected]'
manifest {
name = 'pangene'
author = """Usman Rashid, Jason Shiller"""
homePage = 'https://github.com/PlantandFoodResearch/pangene'
description = """A NextFlow pipeline for pan-genome annotation"""
mainScript = 'main.nf'
nextflowVersion = '!>=23.04.4'
version = '0.4.0+dev'
doi = ''
}

includeConfig './conf/modules.config'
9 changes: 5 additions & 4 deletions nextflow_schema.json
@@ -10,7 +10,7 @@
"type": "object",
"fa_icon": "fas fa-terminal",
"description": "",
"required": ["input", "protein_evidence", "eggnogmapper_db_dir", "eggnogmapper_tax_scope", "outdir"],
"required": ["input", "protein_evidence", "outdir"],
"properties": {
"input": {
"type": "string",
@@ -23,9 +23,9 @@
},
"protein_evidence": {
"type": "string",
"description": "Protein evidence provided as fasta files listed in a text sheet",
"description": "Protein evidence provided as a fasta file or multiple fasta files listed in a plain txt file",
"format": "file-path",
"mimetype": "text/txt",
"pattern": "^\\S+\\.(txt|fa|faa|fna|fsa|fas|fasta)(\\.gz)?$",
"fa_icon": "far fa-file-alt"
},
"eggnogmapper_db_dir": {
@@ -36,7 +36,8 @@
"eggnogmapper_tax_scope": {
"type": "integer",
"description": "Eggnogmapper taxonomy scopre. Eukaryota: 2759, Viridiplantae: 33090, Archaea: 2157, Bacteria: 2, root: 1",
"minimum": 0
"minimum": 1,
"default": 1
},
"rna_evidence": {
"type": "string",
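Reviewer note: the new `pattern` for `protein_evidence` can be exercised outside the pipeline. The Python sketch below uses the regular expression from the schema after JSON unescaping (`\\S` becomes `\S`, `\\.` becomes `\.`). The function name `is_valid_protein_evidence` is hypothetical, and how the validation plugin applies the pattern internally is not shown in this diff.

```python
import re

# Regular expression from the schema's "pattern" field, with JSON escapes resolved.
PROTEIN_EVIDENCE_PATTERN = re.compile(
    r"^\S+\.(txt|fa|faa|fna|fsa|fas|fasta)(\.gz)?$"
)


def is_valid_protein_evidence(path):
    """Return True when the path has no whitespace and ends in an accepted extension."""
    return PROTEIN_EVIDENCE_PATTERN.fullmatch(path) is not None
```

Under this pattern a single FASTA file, a gzipped FASTA file, or a `.txt` list are all accepted, which matches the updated parameter description.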
4 changes: 3 additions & 1 deletion subworkflows/local/gff_eggnogmapper.nf
@@ -24,7 +24,9 @@ workflow GFF_EGGNOGMAPPER {
ch_versions = ch_versions.mix(GFF2FASTA_FOR_EGGNOGMAPPER.out.versions.first())


ch_eggnogmapper_inputs = ch_gffread_fasta
ch_eggnogmapper_inputs = ! db_folder
? Channel.empty()
: ch_gffread_fasta
| combine(Channel.fromPath(db_folder))

EGGNOGMAPPER(
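Reviewer note: the guard on `db_folder` can be read as "no database, no inputs". The Python analogy below models channels as plain lists and is not Nextflow code. `eggnogmapper_inputs` is a hypothetical name, and the `(meta, fasta)` tuple shape is an assumption about what `ch_gffread_fasta` carries.

```python
from itertools import product


def eggnogmapper_inputs(gffread_fasta, db_folder):
    """Pair every (meta, fasta) item with the database folder.

    gffread_fasta: list of (meta, fasta) tuples (assumed channel shape).
    db_folder: path string, or None/empty when no database is configured.
    """
    if not db_folder:
        return []  # stands in for Channel.empty(): the downstream step gets nothing
    # Stands in for `combine`: cross product, flattened into (meta, fasta, db).
    return [
        (meta, fasta, db)
        for (meta, fasta), db in product(gffread_fasta, [db_folder])
    ]
```

Because the result is empty when the folder is unset, a process consuming it would simply not run, which is consistent with `eggnogmapper_db_dir` no longer being required.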
11 changes: 9 additions & 2 deletions subworkflows/local/gff_store.nf
@@ -8,12 +8,15 @@ workflow GFF_STORE {
ch_target_gff // [ meta, gff ]
ch_eggnogmapper_annotations // [ meta, annotations ]
ch_fasta // [ meta, fasta ]
val_describe_gff // val(true|false)

main:
ch_versions = Channel.empty()

// COLLECTFILE: Add eggnogmapper hits to gff
ch_described_gff = ch_target_gff
ch_described_gff = ! val_describe_gff
? Channel.empty()
: ch_target_gff
| join(ch_eggnogmapper_annotations)
| map { meta, gff, annotations ->
def tx_annotations = annotations.readLines()
@@ -109,7 +112,11 @@ workflow GFF_STORE {
}

// MODULE: GT_GFF3 as FINAL_GFF_CHECK
FINAL_GFF_CHECK ( ch_described_gff )
ch_final_check_input = val_describe_gff
? ch_described_gff
: ch_target_gff

FINAL_GFF_CHECK ( ch_final_check_input )

ch_final_gff = FINAL_GFF_CHECK.out.gt_gff3
ch_versions = ch_versions.mix(FINAL_GFF_CHECK.out.versions.first())
7 changes: 6 additions & 1 deletion subworkflows/local/purge_nohit_models.nf
@@ -60,6 +60,11 @@ workflow PURGE_NOHIT_MODELS {
ch_versions = ch_versions.mix(AGAT_SPFILTERFEATUREFROMKILLLIST.out.versions.first())

emit:
purged_gff = ch_target_purged_gff.mix(val_purge_nohits ? Channel.empty() : ch_target_gff)
purged_gff = ch_target_purged_gff
| mix(
val_purge_nohits
? Channel.empty()
: ch_target_gff
)
versions = ch_versions // [ versions.yml ]
}
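Reviewer note: the reflowed `emit` expression behaves the same as the one-liner it replaces. A Python analogy, with channels modelled as lists, is sketched below. `purged_gff` here is a hypothetical function, and the sketch assumes the purged channel is empty whenever purging is disabled, which this hunk does not show. Nextflow's `mix` does not guarantee ordering, whereas list concatenation does.

```python
def purged_gff(target_purged_gff, target_gff, purge_nohits):
    """Return purged items plus, when purging is off, the untouched originals.

    target_purged_gff: items produced by the purge step (assumed empty when disabled).
    target_gff: the original items.
    purge_nohits: boolean flag mirroring val_purge_nohits.
    """
    # Mirrors: val_purge_nohits ? Channel.empty() : ch_target_gff
    passthrough = [] if purge_nohits else list(target_gff)
    # Mirrors: ch_target_purged_gff | mix(...)
    return list(target_purged_gff) + passthrough
```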
2 changes: 2 additions & 0 deletions tests/minimal/assemblysheet.csv
@@ -0,0 +1,2 @@
tag,fasta,is_masked
a_thaliana,https://raw.githubusercontent.com/Gaius-Augustus/BRAKER/f58479fe5bb13a9e51c3ca09cb9e137cab3b8471/example/genome.fa,yes
6 changes: 6 additions & 0 deletions tests/minimal/params.json
@@ -0,0 +1,6 @@
{
"input": "tests/minimal/assemblysheet.csv",
"protein_evidence": "https://raw.githubusercontent.com/Gaius-Augustus/BRAKER/f58479fe5bb13a9e51c3ca09cb9e137cab3b8471/example/proteins.fa",
"braker_extra_args": "--gm_max_intergenic 10000 --skipOptimize",
"busco_lineage_datasets": "eudicots_odb10"
}