Commit

Fixed an issue where TSEBRA failed because LIFTOFF lifted non-protein coding genes
GallVp committed Dec 4, 2024
1 parent ee702d7 commit d91ad05
Showing 18 changed files with 263 additions and 53 deletions.
16 changes: 8 additions & 8 deletions .github/workflows/branch.yml
@@ -1,15 +1,15 @@
name: nf-core branch protection
# This workflow is triggered on PRs to master branch on the repository
# It fails when someone tries to make a PR against the nf-core `master` branch instead of `dev`
# This workflow is triggered on PRs to main branch on the repository
# It fails when someone tries to make a PR against the Plant-Food-Research-Open `main` branch instead of `dev`
on:
pull_request_target:
branches: [master]
branches: [main]

jobs:
test:
runs-on: ubuntu-latest
steps:
# PRs to the nf-core repo master branch are only ok if coming from the nf-core repo `dev` or any `patch` branches
# PRs to the nf-core repo main branch are only ok if coming from the nf-core repo `dev` or any `patch` branches
- name: Check PRs
if: github.repository == 'Plant-Food-Research-Open/genepal'
run: |
@@ -22,7 +22,7 @@ jobs:
uses: mshick/add-pr-comment@b8f338c590a895d50bcbfa6c5859251edc8952fc # v2
with:
message: |
## This PR is against the `master` branch :x:
## This PR is against the `main` branch :x:
* Do not close this PR
* Click _Edit_ and change the `base` to `dev`
@@ -32,9 +32,9 @@ jobs:
Hi @${{ github.event.pull_request.user.login }},
It looks like this pull-request is has been made against the [${{github.event.pull_request.head.repo.full_name }}](https://github.com/${{github.event.pull_request.head.repo.full_name }}) `master` branch.
The `master` branch on nf-core repositories should always contain code from the latest release.
Because of this, PRs to `master` are only allowed if they come from the [${{github.event.pull_request.head.repo.full_name }}](https://github.com/${{github.event.pull_request.head.repo.full_name }}) `dev` branch.
It looks like this pull-request is has been made against the [${{github.event.pull_request.head.repo.full_name }}](https://github.com/${{github.event.pull_request.head.repo.full_name }}) `main` branch.
The `main` branch should always contain code from the latest release.
Because of this, PRs to `main` are only allowed if they come from the [${{github.event.pull_request.head.repo.full_name }}](https://github.com/${{github.event.pull_request.head.repo.full_name }}) `dev` branch.
You do not need to close this PR, you can change the target branch to `dev` by clicking the _"Edit"_ button at the top of this page.
Note that even after this, the test will continue to show as failing until you push a new commit.
6 changes: 3 additions & 3 deletions .github/workflows/download_pipeline.yml
@@ -2,7 +2,7 @@ name: Test successful pipeline download with 'nf-core pipelines download'

# Run the workflow when:
# - dispatched manually
# - when a PR is opened or reopened to master branch
# - when a PR is opened or reopened to main branch
# - the head branch of the pull request is updated, i.e. if fixes for a release are pushed last minute to dev.
on:
workflow_dispatch:
@@ -17,10 +17,10 @@ on:
- edited
- synchronize
branches:
- master
- main
pull_request_target:
branches:
- master
- main

env:
NXF_ANSI_LOG: false
2 changes: 1 addition & 1 deletion .nf-core.yml
@@ -30,5 +30,5 @@ template:
outdir: .
skip_features:
- igenomes
version: 0.5.0
version: 0.6.0
update: null
16 changes: 16 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,22 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## v0.6.0 - [04-Dec-2024]

### `Added`

### `Fixed`

1. Fixed an issue where TSEBRA failed because LIFTOFF lifted non-protein coding genes [#121](https://github.com/Plant-Food-Research-Open/genepal/issues/121)
2. Switched branch name from `master` to `main` in the GHA CIs

### `Dependencies`

1. Nextflow!>=24.04.2
2. [email protected]

### `Deprecated`

## v0.5.0 - [21-Nov-2024]

### `Added`
2 changes: 1 addition & 1 deletion CITATION.cff
@@ -31,7 +31,7 @@ authors:
- family-names: "Thomson"
given-names: "Susan"
title: "genepal: A Nextflow pipeline for genome and pan-genome annotation"
version: 0.5.0
version: 0.6.0
date-released: 2024-11-21
url: "https://github.com/Plant-Food-Research-Open/genepal"
doi: 10.5281/zenodo.14195006
2 changes: 1 addition & 1 deletion README.md
@@ -35,7 +35,7 @@
- Merge multi-reference liftoffs
- Remove liftoff transcripts marked by _valid_ORF=False_
- Remove liftoff genes with any intron shorter than 10 bp
- Remove rRNA and tRNA from liftoff
- Remove rRNA, tRNA and other non-protein coding models from liftoff
- Optionally, allow or remove iso-forms
- Remove BRAKER models from Liftoff loci
- Merge Liftoff and BRAKER models
2 changes: 1 addition & 1 deletion assets/multiqc_config.yml
@@ -1,7 +1,7 @@
report_comment: >
This report has been generated by the <a href="https://github.com/plant-food-research-open/genepal" target="_blank">plant-food-research-open/genepal</a>
analysis pipeline. For information about how to interpret these results, please see the
<a href="https://github.com/plant-food-research-open/genepal/blob/0.5.0/docs/usage.md" target="_blank">documentation</a>.
<a href="https://github.com/plant-food-research-open/genepal/blob/0.6.0/docs/usage.md" target="_blank">documentation</a>.
report_section_order:
"plant-food-research-open-genepal-methods-description":
4 changes: 2 additions & 2 deletions conf/modules.config
@@ -199,7 +199,7 @@ process { // SUBWORKFLOW: FASTA_LIFTOFF
}

withName: '.*:FASTA_LIFTOFF:GFFREAD_BEFORE_LIFTOFF' {
ext.args = '--no-pseudo --keep-genes'
ext.args = '--no-pseudo --keep-genes -C'
}

withName: '.*:FASTA_LIFTOFF:MERGE_LIFTOFF_ANNOTATIONS' {
@@ -212,7 +212,7 @@ process { // SUBWORKFLOW: FASTA_LIFTOFF

withName: '.*:FASTA_LIFTOFF:GFFREAD_AFTER_LIFTOFF' {
ext.prefix = { "${meta.id}.liftoff" }
ext.args = '--keep-genes'
ext.args = '--no-pseudo --keep-genes -C'
}

withName: '.*:FASTA_LIFTOFF:GFF_TSEBRA_SPFILTERFEATUREFROMKILLLIST:AGAT_CONVERTSPGFF2GTF' {
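Both gffread steps now receive `--no-pseudo --keep-genes -C`. According to gffread's usage text, `-C` discards mRNAs that have no CDS features. The Python sketch below is only a toy model of that idea, not gffread's implementation; the function name `coding_only` and the sample transcript IDs are hypothetical.

```python
from typing import Dict, List


def coding_only(transcripts: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Keep only transcripts that own at least one CDS child feature.

    This mimics the intent of gffread's -C option on a toy data structure
    (transcript ID -> list of child feature types). It is an illustration,
    not a reimplementation of gffread.
    """
    return {tx_id: feats for tx_id, feats in transcripts.items() if "CDS" in feats}


# Hypothetical lifted annotation: one coding transcript, two non-coding ones.
toy_annotation = {
    "mRNA1": ["exon", "CDS"],
    "lnc_RNA1": ["exon"],
    "tRNA1": ["exon"],
}
kept = coding_only(toy_annotation)
```

Under these assumptions only the coding transcript survives, which is the kind of input TSEBRA is expected to receive after this change.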
2 changes: 1 addition & 1 deletion nextflow.config
@@ -261,7 +261,7 @@ manifest {
description = """A Nextflow pipeline for consensus, phased and pan-genome annotation."""
mainScript = 'main.nf'
nextflowVersion = '!>=24.04.2'
version = '0.5.0'
version = '0.6.0'
doi = 'https://doi.org/10.5281/zenodo.14195006'
}

Original file line number Diff line number Diff line change
@@ -1,30 +1,30 @@
include { GUNZIP as GUNZIP_FASTA } from '../../modules/nf-core/gunzip/main'
include { GUNZIP as GUNZIP_GFF } from '../../modules/nf-core/gunzip/main'
include { GFFREAD as GFFREAD_BEFORE_LIFTOFF } from '../../modules/nf-core/gffread/main'
include { LIFTOFF } from '../../modules/nf-core/liftoff/main'
include { AGAT_SPMERGEANNOTATIONS as MERGE_LIFTOFF_ANNOTATIONS } from '../../modules/nf-core/agat/spmergeannotations/main'
include { AGAT_SPFLAGSHORTINTRONS } from '../../modules/gallvp/agat/spflagshortintrons/main'
include { AGAT_SPFILTERFEATUREFROMKILLLIST } from '../../modules/nf-core/agat/spfilterfeaturefromkilllist/main'
include { GFFREAD as GFFREAD_AFTER_LIFTOFF } from '../../modules/nf-core/gffread/main'
include { GFF_TSEBRA_SPFILTERFEATUREFROMKILLLIST } from '../../subworkflows/local/gff_tsebra_spfilterfeaturefromkilllist'
include { GUNZIP as GUNZIP_FASTA } from '../../../modules/nf-core/gunzip/main'
include { GUNZIP as GUNZIP_GFF } from '../../../modules/nf-core/gunzip/main'
include { GFFREAD as GFFREAD_BEFORE_LIFTOFF } from '../../../modules/nf-core/gffread/main'
include { LIFTOFF } from '../../../modules/nf-core/liftoff/main'
include { AGAT_SPMERGEANNOTATIONS as MERGE_LIFTOFF_ANNOTATIONS } from '../../../modules/nf-core/agat/spmergeannotations/main'
include { AGAT_SPFLAGSHORTINTRONS } from '../../../modules/gallvp/agat/spflagshortintrons/main'
include { AGAT_SPFILTERFEATUREFROMKILLLIST } from '../../../modules/nf-core/agat/spfilterfeaturefromkilllist/main'
include { GFFREAD as GFFREAD_AFTER_LIFTOFF } from '../../../modules/nf-core/gffread/main'
include { GFF_TSEBRA_SPFILTERFEATUREFROMKILLLIST } from '../../../subworkflows/local/gff_tsebra_spfilterfeaturefromkilllist'

workflow FASTA_LIFTOFF {
take:
target_assemby // Channel: [ meta, fasta ]
xref_fasta // Channel: [ meta2, fasta ]
xref_gff // Channel: [ meta2, gff3 ]
target_assembly // Channel: [ meta, fasta ]
xref_fasta // Channel: [ meta2, fasta(.gz)? ]
xref_gff // Channel: [ meta2, gff3(.gz)? ]
val_filter_liftoff_by_hints // val(true|false)
braker_hints // [ meta, gff ]
tsebra_config // Channel: [ cfg ]
allow_isoforms // val(true|false)
val_allow_isoforms // val(true|false)


main:
ch_versions = Channel.empty()

// MODULE: GUNZIP as GUNZIP_FASTA
ch_xref_fasta_branch = xref_fasta
| branch { meta, file ->
| branch { _meta, file ->
gz: "$file".endsWith(".gz")
rest: !"$file".endsWith(".gz")
}
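The `branch` call above routes each `[ meta, file ]` tuple by file suffix so that only gzipped inputs are sent to GUNZIP. A minimal Python sketch of the same routing, assuming plain tuples in place of Nextflow channel items (the function name `branch_by_gz` and the sample paths are hypothetical):

```python
from typing import Any, List, Tuple

Item = Tuple[Any, str]


def branch_by_gz(items: List[Item]) -> Tuple[List[Item], List[Item]]:
    """Split (meta, file) tuples into gzipped and uncompressed groups."""
    gz: List[Item] = []
    rest: List[Item] = []
    for meta, file in items:
        # Same test as the Groovy closure: "$file".endsWith(".gz")
        (gz if str(file).endswith(".gz") else rest).append((meta, file))
    return gz, rest


gz_items, rest_items = branch_by_gz([
    ({"id": "ref1"}, "ref1.fa.gz"),
    ({"id": "ref2"}, "ref2.fa"),
])
```

The metadata is carried along untouched, which is why the closure parameter can be renamed to `_meta` in the diff without changing behaviour.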
@@ -40,7 +40,7 @@ workflow FASTA_LIFTOFF {

// MODULE: GUNZIP as GUNZIP_GFF
ch_xref_gff_branch = xref_gff
| branch { meta, file ->
| branch { _meta, file ->
gz: "$file".endsWith(".gz")
rest: !"$file".endsWith(".gz")
}
@@ -61,7 +61,7 @@ workflow FASTA_LIFTOFF {
ch_versions = ch_versions.mix(GFFREAD_BEFORE_LIFTOFF.out.versions.first())

// MODULE: LIFTOFF
ch_liftoff_inputs = target_assemby
ch_liftoff_inputs = target_assembly
| combine(
ch_xref_gunzip_fasta
| join(
@@ -72,7 +72,7 @@ workflow FASTA_LIFTOFF {
[
[
id: "${meta.id}.from.${ref_meta.id}",
target_assemby: meta.id
target_assembly: meta.id
],
target_fa,
ref_fa,
@@ -81,21 +81,21 @@ workflow FASTA_LIFTOFF {
}

LIFTOFF(
ch_liftoff_inputs.map { meta, target_fa, ref_fa, ref_gff -> [ meta, target_fa ] },
ch_liftoff_inputs.map { meta, target_fa, ref_fa, ref_gff -> ref_fa },
ch_liftoff_inputs.map { meta, target_fa, ref_fa, ref_gff -> ref_gff },
ch_liftoff_inputs.map { meta, target_fa, _ref_fa, _ref_gff -> [ meta, target_fa ] },
ch_liftoff_inputs.map { _meta, _target_fa, ref_fa, _ref_gff -> ref_fa },
ch_liftoff_inputs.map { _meta, _target_fa, _ref_fa, ref_gff -> ref_gff },
[]
)

ch_liftoff_gff3 = LIFTOFF.out.polished_gff3
| map { meta, gff -> [ [ id: meta.target_assemby ], gff ] }
| map { meta, gff -> [ [ id: meta.target_assembly ], gff ] }
| groupTuple

ch_versions = ch_versions.mix(LIFTOFF.out.versions.first())

// MODULE: AGAT_SPMERGEANNOTATIONS as MERGE_LIFTOFF_ANNOTATIONS
ch_merge_inputs = ch_liftoff_gff3
| branch { meta, list_polished ->
| branch { _meta, list_polished ->
one: list_polished.size() == 1
many: list_polished.size() > 1
}
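The liftoff outputs are re-keyed by `meta.target_assembly`, grouped, and then split into targets with a single liftoff and targets with several (only the latter need merging). A rough Python sketch of that grouping, assuming dictionaries and lists stand in for channels; the function names and sample file names are hypothetical:

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_by_target(outputs: List[Tuple[dict, str]]) -> Dict[str, List[str]]:
    """Collect liftoff GFFs per target assembly (analogue of map + groupTuple)."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for meta, gff in outputs:
        grouped[meta["target_assembly"]].append(gff)
    return dict(grouped)


def split_one_many(grouped: Dict[str, List[str]]):
    """Separate targets with exactly one liftoff from those with several."""
    one = {k: v for k, v in grouped.items() if len(v) == 1}
    many = {k: v for k, v in grouped.items() if len(v) > 1}
    return one, many


grouped = group_by_target([
    ({"id": "t1.from.refA", "target_assembly": "t1"}, "t1.refA.gff3"),
    ({"id": "t1.from.refB", "target_assembly": "t1"}, "t1.refB.gff3"),
    ({"id": "t2.from.refA", "target_assembly": "t2"}, "t2.refA.gff3"),
])
one, many = split_one_many(grouped)
```

This also shows why the `target_assemby` typo had to be fixed in both places at once: the key written into the meta map and the key read back after LIFTOFF must match.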
@@ -119,23 +119,29 @@ workflow FASTA_LIFTOFF {
ch_flagged_gff = AGAT_SPFLAGSHORTINTRONS.out.gff
ch_versions = ch_versions.mix(AGAT_SPFLAGSHORTINTRONS.out.versions.first())

// COLLECTFILE: Kill list for valid_ORF=False transcripts
// tRNA, rRNA
// gene with any intron marked as 'pseudo=' by AGAT/SPFLAGSHORTINTRONS
// collectFile: Kill list for valid_ORF=False transcripts
// tRNA, rRNA, gene with any intron marked as
// 'pseudo=' by AGAT/SPFLAGSHORTINTRONS
ch_kill_list = ch_flagged_gff
| map { meta, gff ->

def tx_from_gff = gff.readLines()
.findAll { it ->
// Can't add to kill list
if ( it.startsWith('#') ) { return false }

def cols = it.split('\t')
def feat = cols[2]

if ( feat in [ 'tRNA', 'rRNA' ] ) { return true }
if ( feat !in [ 'transcript', 'mRNA', 'gene' ] ) { return false }
// Add to kill list anything other than standard features
if ( feat !in [ 'gene', 'transcript', 'mRNA', 'exon', 'CDS', 'five_prime_UTR', 'three_prime_UTR' ] ) { return true }

// Ignore [ 'exon', 'CDS', 'five_prime_UTR', 'three_prime_UTR' ]
if ( feat !in [ 'gene', 'transcript', 'mRNA' ] ) { return false }

def attrs = cols[8]

// Add [ 'gene', 'transcript', 'mRNA' ] with 'valid_ORF=False' or 'pseudo=' attributes to kill list
( attrs.contains('valid_ORF=False') || attrs.contains('pseudo=') )
}
.collect {
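The revised `findAll` closure decides, line by line, which GFF records go on the kill list. The Python port below restates the new logic as a sketch for reasoning about edge cases; it is not part of the pipeline, and the function name `belongs_on_kill_list` and the sample lines are hypothetical.

```python
STANDARD_FEATURES = {
    "gene", "transcript", "mRNA", "exon", "CDS",
    "five_prime_UTR", "three_prime_UTR",
}
PARENT_FEATURES = {"gene", "transcript", "mRNA"}


def belongs_on_kill_list(gff_line: str) -> bool:
    """Return True if this GFF3 line should be added to the kill list."""
    # Comment and directive lines can never be killed.
    if gff_line.startswith("#"):
        return False

    cols = gff_line.split("\t")
    feat = cols[2]

    # Anything other than a standard protein-coding feature is killed
    # (tRNA, rRNA, lnc_RNA, and so on).
    if feat not in STANDARD_FEATURES:
        return True

    # exon, CDS and UTR lines are ignored; only their parents are inspected.
    if feat not in PARENT_FEATURES:
        return False

    attrs = cols[8]
    return "valid_ORF=False" in attrs or "pseudo=" in attrs


def make_line(feature: str, attrs: str) -> str:
    """Build a nine-column GFF3 line for testing (hypothetical coordinates)."""
    return "\t".join(["chr1", "Liftoff", feature, "1", "100", ".", "+", ".", attrs])
```

Compared with the previous version, which killed only `tRNA` and `rRNA` by type, any feature type outside the standard set is now removed before TSEBRA runs.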
@@ -160,8 +166,8 @@ workflow FASTA_LIFTOFF {


AGAT_SPFILTERFEATUREFROMKILLLIST(
ch_agat_kill_inputs.map { meta, gff, kill -> [ meta, gff ] },
ch_agat_kill_inputs.map { meta, gff, kill -> kill },
ch_agat_kill_inputs.map { meta, gff, _kill -> [ meta, gff ] },
ch_agat_kill_inputs.map { _meta, _gff, kill -> kill },
[] // default config
)

@@ -179,7 +185,7 @@ workflow FASTA_LIFTOFF {
val_filter_liftoff_by_hints ? ch_attr_trimmed_gff : Channel.empty(),
braker_hints,
tsebra_config,
allow_isoforms,
val_allow_isoforms,
'liftoff'
)
