Fixed TSEBRA failure issue #122

Merged
merged 3 commits into from
Dec 5, 2024
16 changes: 8 additions & 8 deletions .github/workflows/branch.yml
@@ -1,15 +1,15 @@
name: nf-core branch protection
# This workflow is triggered on PRs to master branch on the repository
# It fails when someone tries to make a PR against the nf-core `master` branch instead of `dev`
# This workflow is triggered on PRs to main branch on the repository
# It fails when someone tries to make a PR against the Plant-Food-Research-Open `main` branch instead of `dev`
on:
pull_request_target:
branches: [master]
branches: [main]

jobs:
test:
runs-on: ubuntu-latest
steps:
# PRs to the nf-core repo master branch are only ok if coming from the nf-core repo `dev` or any `patch` branches
# PRs to the nf-core repo main branch are only ok if coming from the nf-core repo `dev` or any `patch` branches
- name: Check PRs
if: github.repository == 'Plant-Food-Research-Open/genepal'
run: |
@@ -22,7 +22,7 @@ jobs:
uses: mshick/add-pr-comment@b8f338c590a895d50bcbfa6c5859251edc8952fc # v2
with:
message: |
## This PR is against the `master` branch :x:
## This PR is against the `main` branch :x:

* Do not close this PR
* Click _Edit_ and change the `base` to `dev`
@@ -32,9 +32,9 @@ jobs:

Hi @${{ github.event.pull_request.user.login }},

It looks like this pull-request is has been made against the [${{github.event.pull_request.head.repo.full_name }}](https://github.com/${{github.event.pull_request.head.repo.full_name }}) `master` branch.
The `master` branch on nf-core repositories should always contain code from the latest release.
Because of this, PRs to `master` are only allowed if they come from the [${{github.event.pull_request.head.repo.full_name }}](https://github.com/${{github.event.pull_request.head.repo.full_name }}) `dev` branch.
It looks like this pull-request has been made against the [${{github.event.pull_request.head.repo.full_name }}](https://github.com/${{github.event.pull_request.head.repo.full_name }}) `main` branch.
The `main` branch should always contain code from the latest release.
Because of this, PRs to `main` are only allowed if they come from the [${{github.event.pull_request.head.repo.full_name }}](https://github.com/${{github.event.pull_request.head.repo.full_name }}) `dev` branch.

You do not need to close this PR, you can change the target branch to `dev` by clicking the _"Edit"_ button at the top of this page.
Note that even after this, the test will continue to show as failing until you push a new commit.
6 changes: 3 additions & 3 deletions .github/workflows/download_pipeline.yml
@@ -2,7 +2,7 @@ name: Test successful pipeline download with 'nf-core pipelines download'

# Run the workflow when:
# - dispatched manually
# - when a PR is opened or reopened to master branch
# - when a PR is opened or reopened to main branch
# - the head branch of the pull request is updated, i.e. if fixes for a release are pushed last minute to dev.
on:
workflow_dispatch:
@@ -17,10 +17,10 @@ on:
- edited
- synchronize
branches:
- master
- main
pull_request_target:
branches:
- master
- main

env:
NXF_ANSI_LOG: false
17 changes: 16 additions & 1 deletion CHANGELOG.md
@@ -3,11 +3,26 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## v0.6.0 - [4-Dec-2024]
## v0.6.0 - [6-Dec-2024]

### `Added`

1. Added cDNA and CDS outputs to <OUTPUT_DIR>/annotations/<SAMPLE> directory [#118](https://github.com/Plant-Food-Research-Open/genepal/issues/118)
2. Added parameter `add_attrs_to_proteins_cds_fastas`

### `Fixed`

1. Fixed an issue where TSEBRA failed because LIFTOFF lifted non-protein coding genes [#121](https://github.com/Plant-Food-Research-Open/genepal/issues/121)
2. Switched branch name from `master` to `main` in the GHA CIs

### `Dependencies`

1. Nextflow!>=24.04.2
2. [email protected]

### `Deprecated`

1. Removed parameter `add_attrs_to_proteins_fasta`

## v0.5.0 - [21-Nov-2024]

2 changes: 1 addition & 1 deletion README.md
@@ -35,7 +35,7 @@
- Merge multi-reference liftoffs
- Remove liftoff transcripts marked by _valid_ORF=False_
- Remove liftoff genes with any intron shorter than 10 bp
- Remove rRNA and tRNA from liftoff
- Remove rRNA, tRNA and other non-protein coding models from liftoff
- Optionally, allow or remove iso-forms
- Remove BRAKER models from Liftoff loci
- Merge Liftoff and BRAKER models
4 changes: 2 additions & 2 deletions conf/modules.config
@@ -199,7 +199,7 @@ process { // SUBWORKFLOW: FASTA_LIFTOFF
}

withName: '.*:FASTA_LIFTOFF:GFFREAD_BEFORE_LIFTOFF' {
ext.args = '--no-pseudo --keep-genes'
ext.args = '--no-pseudo --keep-genes -C'
}

withName: '.*:FASTA_LIFTOFF:MERGE_LIFTOFF_ANNOTATIONS' {
@@ -212,7 +212,7 @@ process { // SUBWORKFLOW: FASTA_LIFTOFF

withName: '.*:FASTA_LIFTOFF:GFFREAD_AFTER_LIFTOFF' {
ext.prefix = { "${meta.id}.liftoff" }
ext.args = '--keep-genes'
ext.args = '--no-pseudo --keep-genes -C'
}

withName: '.*:FASTA_LIFTOFF:GFF_TSEBRA_SPFILTERFEATUREFROMKILLLIST:AGAT_CONVERTSPGFF2GTF' {
Original file line number Diff line number Diff line change
@@ -1,30 +1,30 @@
include { GUNZIP as GUNZIP_FASTA } from '../../modules/nf-core/gunzip/main'
include { GUNZIP as GUNZIP_GFF } from '../../modules/nf-core/gunzip/main'
include { GFFREAD as GFFREAD_BEFORE_LIFTOFF } from '../../modules/nf-core/gffread/main'
include { LIFTOFF } from '../../modules/nf-core/liftoff/main'
include { AGAT_SPMERGEANNOTATIONS as MERGE_LIFTOFF_ANNOTATIONS } from '../../modules/nf-core/agat/spmergeannotations/main'
include { AGAT_SPFLAGSHORTINTRONS } from '../../modules/gallvp/agat/spflagshortintrons/main'
include { AGAT_SPFILTERFEATUREFROMKILLLIST } from '../../modules/nf-core/agat/spfilterfeaturefromkilllist/main'
include { GFFREAD as GFFREAD_AFTER_LIFTOFF } from '../../modules/nf-core/gffread/main'
include { GFF_TSEBRA_SPFILTERFEATUREFROMKILLLIST } from '../../subworkflows/local/gff_tsebra_spfilterfeaturefromkilllist'
include { GUNZIP as GUNZIP_FASTA } from '../../../modules/nf-core/gunzip/main'
include { GUNZIP as GUNZIP_GFF } from '../../../modules/nf-core/gunzip/main'
include { GFFREAD as GFFREAD_BEFORE_LIFTOFF } from '../../../modules/nf-core/gffread/main'
include { LIFTOFF } from '../../../modules/nf-core/liftoff/main'
include { AGAT_SPMERGEANNOTATIONS as MERGE_LIFTOFF_ANNOTATIONS } from '../../../modules/nf-core/agat/spmergeannotations/main'
include { AGAT_SPFLAGSHORTINTRONS } from '../../../modules/gallvp/agat/spflagshortintrons/main'
include { AGAT_SPFILTERFEATUREFROMKILLLIST } from '../../../modules/nf-core/agat/spfilterfeaturefromkilllist/main'
include { GFFREAD as GFFREAD_AFTER_LIFTOFF } from '../../../modules/nf-core/gffread/main'
include { GFF_TSEBRA_SPFILTERFEATUREFROMKILLLIST } from '../../../subworkflows/local/gff_tsebra_spfilterfeaturefromkilllist'

workflow FASTA_LIFTOFF {
take:
target_assemby // Channel: [ meta, fasta ]
xref_fasta // Channel: [ meta2, fasta ]
xref_gff // Channel: [ meta2, gff3 ]
target_assembly // Channel: [ meta, fasta ]
xref_fasta // Channel: [ meta2, fasta(.gz)? ]
xref_gff // Channel: [ meta2, gff3(.gz)? ]
val_filter_liftoff_by_hints // val(true|false)
braker_hints // [ meta, gff ]
tsebra_config // Channel: [ cfg ]
allow_isoforms // val(true|false)
val_allow_isoforms // val(true|false)


main:
ch_versions = Channel.empty()

// MODULE: GUNZIP as GUNZIP_FASTA
ch_xref_fasta_branch = xref_fasta
| branch { meta, file ->
| branch { _meta, file ->
gz: "$file".endsWith(".gz")
rest: !"$file".endsWith(".gz")
}
@@ -40,7 +40,7 @@ workflow FASTA_LIFTOFF {

// MODULE: GUNZIP as GUNZIP_GFF
ch_xref_gff_branch = xref_gff
| branch { meta, file ->
| branch { _meta, file ->
gz: "$file".endsWith(".gz")
rest: !"$file".endsWith(".gz")
}
@@ -61,7 +61,7 @@ workflow FASTA_LIFTOFF {
ch_versions = ch_versions.mix(GFFREAD_BEFORE_LIFTOFF.out.versions.first())

// MODULE: LIFTOFF
ch_liftoff_inputs = target_assemby
ch_liftoff_inputs = target_assembly
| combine(
ch_xref_gunzip_fasta
| join(
@@ -72,7 +72,7 @@ workflow FASTA_LIFTOFF {
[
[
id: "${meta.id}.from.${ref_meta.id}",
target_assemby: meta.id
target_assembly: meta.id
],
target_fa,
ref_fa,
@@ -81,21 +81,21 @@ workflow FASTA_LIFTOFF {
}

LIFTOFF(
ch_liftoff_inputs.map { meta, target_fa, ref_fa, ref_gff -> [ meta, target_fa ] },
ch_liftoff_inputs.map { meta, target_fa, ref_fa, ref_gff -> ref_fa },
ch_liftoff_inputs.map { meta, target_fa, ref_fa, ref_gff -> ref_gff },
ch_liftoff_inputs.map { meta, target_fa, _ref_fa, _ref_gff -> [ meta, target_fa ] },
ch_liftoff_inputs.map { _meta, _target_fa, ref_fa, _ref_gff -> ref_fa },
ch_liftoff_inputs.map { _meta, _target_fa, _ref_fa, ref_gff -> ref_gff },
[]
)

ch_liftoff_gff3 = LIFTOFF.out.polished_gff3
| map { meta, gff -> [ [ id: meta.target_assemby ], gff ] }
| map { meta, gff -> [ [ id: meta.target_assembly ], gff ] }
| groupTuple

ch_versions = ch_versions.mix(LIFTOFF.out.versions.first())

// MODULE: AGAT_SPMERGEANNOTATIONS as MERGE_LIFTOFF_ANNOTATIONS
ch_merge_inputs = ch_liftoff_gff3
| branch { meta, list_polished ->
| branch { _meta, list_polished ->
one: list_polished.size() == 1
many: list_polished.size() > 1
}
@@ -119,23 +119,29 @@ workflow FASTA_LIFTOFF {
ch_flagged_gff = AGAT_SPFLAGSHORTINTRONS.out.gff
ch_versions = ch_versions.mix(AGAT_SPFLAGSHORTINTRONS.out.versions.first())

// COLLECTFILE: Kill list for valid_ORF=False transcripts
// tRNA, rRNA
// gene with any intron marked as 'pseudo=' by AGAT/SPFLAGSHORTINTRONS
// collectFile: Kill list for valid_ORF=False transcripts,
// non-standard features such as tRNA and rRNA, and genes with
// any intron marked as 'pseudo=' by AGAT/SPFLAGSHORTINTRONS
ch_kill_list = ch_flagged_gff
| map { meta, gff ->

def tx_from_gff = gff.readLines()
.findAll { it ->
// Comment and directive lines can't be added to the kill list
if ( it.startsWith('#') ) { return false }

def cols = it.split('\t')
def feat = cols[2]

if ( feat in [ 'tRNA', 'rRNA' ] ) { return true }
if ( feat !in [ 'transcript', 'mRNA', 'gene' ] ) { return false }
// Add to kill list anything other than standard features
if ( feat !in [ 'gene', 'transcript', 'mRNA', 'exon', 'CDS', 'five_prime_UTR', 'three_prime_UTR' ] ) { return true }

// Ignore [ 'exon', 'CDS', 'five_prime_UTR', 'three_prime_UTR' ]
if ( feat !in [ 'gene', 'transcript', 'mRNA' ] ) { return false }

def attrs = cols[8]

// Add [ 'gene', 'transcript', 'mRNA' ] with 'valid_ORF=False' or 'pseudo=' attributes to kill list
( attrs.contains('valid_ORF=False') || attrs.contains('pseudo=') )
}
.collect {
@@ -160,8 +166,8 @@ workflow FASTA_LIFTOFF {


AGAT_SPFILTERFEATUREFROMKILLLIST(
ch_agat_kill_inputs.map { meta, gff, kill -> [ meta, gff ] },
ch_agat_kill_inputs.map { meta, gff, kill -> kill },
ch_agat_kill_inputs.map { meta, gff, _kill -> [ meta, gff ] },
ch_agat_kill_inputs.map { _meta, _gff, kill -> kill },
[] // default config
)

@@ -179,7 +185,7 @@ workflow FASTA_LIFTOFF {
val_filter_liftoff_by_hints ? ch_attr_trimmed_gff : Channel.empty(),
braker_hints,
tsebra_config,
allow_isoforms,
val_allow_isoforms,
'liftoff'
)

105 changes: 105 additions & 0 deletions subworkflows/local/fasta_liftoff/tests/main.nf.test
@@ -0,0 +1,105 @@
nextflow_workflow {

name "Test Subworkflow FASTA_LIFTOFF"
script "../main.nf"
workflow "FASTA_LIFTOFF"
config './nextflow.config'

tag "subworkflows"
tag "subworkflows_gallvp"
tag "subworkflows/fasta_liftoff"
tag "subworkflows/gff_tsebra_spfilterfeaturefromkilllist"

tag "gunzip"
tag "gffread"
tag "liftoff"
tag "agat"
tag "agat/spmergeannotations"
tag "agat/spflagshortintrons"
tag "agat/spfilterfeaturefromkilllist"

setup {
run('GUNZIP', alias: 'GUNZIP_GENOME_FASTA') {
script "../../../../modules/nf-core/gunzip"

process {
"""
input[0] = [
[ id:'test' ],
file(params.modules_testdata_base_path + 'genomics/eukaryotes/actinidia_chinensis/genome/chr1/genome.fasta.gz', checkIfExists: true)
]
"""
}
}

run('GUNZIP', alias: 'GUNZIP_BRAKER_HINTS') {
script "../../../../modules/nf-core/gunzip"

process {
"""
input[0] = [
[ id:'test' ],
file(params.modules_testdata_base_path + 'genomics/eukaryotes/actinidia_chinensis/genome/chr1/genome.hints.gff.gz', checkIfExists: true)
]
"""
}
}
}


test("liftoff - GCF_019202715 - to - actinidia_chinensis") {

when {
workflow {
"""
input[0] = GUNZIP_GENOME_FASTA.out.gunzip

input[1] = Channel.of([
[ id:'ref' ],
file ( "${baseDir}/subworkflows/local/fasta_liftoff/tests/testdata/GCF_019202715.1.fna.gz", checkIfExists: true )
])
input[2] = Channel.of([
[ id:'ref' ],
file ( "${baseDir}/subworkflows/local/fasta_liftoff/tests/testdata/GCF_019202715.1.gff.gz", checkIfExists: true )
])

input[3] = true // val_filter_liftoff_by_hints

input[4] = GUNZIP_BRAKER_HINTS.out.gunzip

input[5] = Channel.of ( file("${baseDir}/assets/tsebra-template.cfg", checkIfExists: true) )
| map { cfg ->
def enforce_full_intron_support = true
def param_intron_support = enforce_full_intron_support ? '1.0' : '0.0'

def param_e1 = params.allow_isoforms ? '0.1' : '0.0'
def param_e2 = params.allow_isoforms ? '0.5' : '0.0'
def param_e3 = params.allow_isoforms ? '0.05' : '0.0'
def param_e4 = params.allow_isoforms ? '0.2' : '0.0'

[
'tsebra-config.cfg',
cfg
.text
.replace('PARAM_INTRON_SUPPORT', param_intron_support)
.replace('PARAM_E1', param_e1)
.replace('PARAM_E2', param_e2)
.replace('PARAM_E3', param_e3)
.replace('PARAM_E4', param_e4)
]
}
| collectFile

input[6] = false // val_allow_isoforms
"""
}
}

then {
assertAll(
{ assert workflow.success},
{ assert snapshot(workflow.out).match()}
)
}
}
}
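
The revised kill-list filter in the FASTA_LIFTOFF hunk above can be exercised outside Nextflow as a plain Groovy closure. The following is a minimal sketch, not the PR's code: the names `shouldKill`, `STANDARD_FEATURES` and `PARENT_FEATURES` are hypothetical, the nine-column guard is an added assumption, and the GFF3 rows are made up for illustration.

```groovy
// Sketch only: `shouldKill`, `STANDARD_FEATURES` and `PARENT_FEATURES` are
// hypothetical names; the PR inlines this logic in a findAll closure.
def STANDARD_FEATURES = ['gene', 'transcript', 'mRNA', 'exon', 'CDS', 'five_prime_UTR', 'three_prime_UTR']
def PARENT_FEATURES   = ['gene', 'transcript', 'mRNA']

def shouldKill = { String line ->
    // Comment and directive lines are never added to the kill list
    if (line.startsWith('#')) { return false }

    def cols = line.split('\t')
    // Assumption (not in the PR): skip lines without all nine GFF3 columns
    if (cols.size() < 9) { return false }

    def feat = cols[2]

    // Anything other than the standard feature types goes on the kill list
    if (!(feat in STANDARD_FEATURES)) { return true }

    // exon, CDS and UTR rows are not evaluated on their own
    if (!(feat in PARENT_FEATURES)) { return false }

    // gene, transcript and mRNA rows are listed only when flagged
    def attrs = cols[8]
    attrs.contains('valid_ORF=False') || attrs.contains('pseudo=')
}

// Made-up GFF3 rows, for illustration only
def rows = [
    '##gff-version 3',
    ['chr1', 'Liftoff', 'gene',    '1',    '900',  '.', '+', '.', 'ID=g1'].join('\t'),
    ['chr1', 'Liftoff', 'mRNA',    '1',    '900',  '.', '+', '.', 'ID=t1;Parent=g1;valid_ORF=False'].join('\t'),
    ['chr1', 'Liftoff', 'exon',    '1',    '300',  '.', '+', '.', 'ID=e1;Parent=t1'].join('\t'),
    ['chr1', 'Liftoff', 'lnc_RNA', '1000', '1500', '.', '+', '.', 'ID=l1'].join('\t'),
    ['chr1', 'Liftoff', 'gene',    '2000', '2900', '.', '+', '.', 'ID=g2;pseudo=example'].join('\t')
]

def killed = rows.findAll(shouldKill)
killed.each { println it.split('\t')[8] }
```

With these rows, the flagged mRNA, the `lnc_RNA` feature and the `pseudo=` gene are selected, while the header line, the unflagged gene and the exon are left alone. Compared with the earlier `tRNA`/`rRNA` check, the allow-list approach means a feature type nobody anticipated is still caught.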