diff --git a/CHANGELOG.md b/CHANGELOG.md index d75e52e..82c0129 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -13,6 +13,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 1. Now using `${meta.id}_trim` as prefix for `FASTQC` files 2. Added `monochromeLogs` parameter to suppress warnings +3. Updated citations to include DOIs ### `Dependencies` diff --git a/CITATIONS.md b/CITATIONS.md index e40e80b..a8a71a7 100644 --- a/CITATIONS.md +++ b/CITATIONS.md @@ -10,95 +10,97 @@ ## Pipeline tools -- py_fasta_validator, [MIT](https://github.com/linsalrob/py_fasta_validator/blob/master/LICENSE) +- AGAT, [GPL v3](https://github.com/NBISweden/AGAT/blob/master/LICENSE) - > Edwards, R.A. 2019. fasta_validate: a fast and efficient fasta validator written in pure C. doi: + > Jacques Dainat, Darío Hereñú, Dr. K. D. Murray, Ed Davis, Ivan Ugrin, Kathryn Crouch, LucileSol, Nuno Agostinho, pascal-git, Zachary Zollman, & tayyrov. (2024). NBISweden/AGAT: AGAT-v1.4.1 (v1.4.1). Zenodo. doi: 10.5281/zenodo.13799920 -- GenomeTools, [ISC](http://genometools.org/license.html) +- AUGUSTUS, [Artistic license-1.0](https://github.com/Gaius-Augustus/Augustus/blob/master/src/LICENSE.TXT) - > Gremme G, Steinbiss S, Kurtz S. 2013. "GenomeTools: A Comprehensive Software Library for Efficient Processing of Structured Genome Annotations," in IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 10, no. 3, pp. 645-656, May 2013, doi: + > Sommerfeld, D., Lingner, T., Stanke, M., Morgenstern, B., & Richter, H. (2009). AUGUSTUS at MediGRID: Adaption of a bioinformatics application to grid computing for efficient genome analysis. Future Gener. Comput. Syst., 25, 337-345. 
doi: 10.1016/j.future.2008.05.010 -- SAMTOOLS, [MIT/Expat](https://github.com/samtools/samtools/blob/develop/LICENSE) +- BRAKER3, [Artistic license-1.0](https://github.com/Gaius-Augustus/BRAKER/blob/master/LICENSE.TXT) - > Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, Whitwham A, Keane T, McCarthy SA, Davies RM, Li H. 2021. Twelve years of SAMtools and BCFtools, GigaScience, Volume 10, Issue 2, February 2021, giab008, + > Gabriel, L., Bruna, T., Hoff, K. J., Ebel, M., Lomsadze, A., Borodovsky, M., Stanke, M. (2023). BRAKER3: Fully Automated Genome Annotation Using RNA-Seq and Protein Evidence with GeneMark-ETP, AUGUSTUS and TSEBRA. bioRxiV, doi: 10.1101/2023.06.10.544449 - BUSCO, [MIT](https://gitlab.com/ezlab/busco/-/blob/master/LICENSE) - > Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. 2021. BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes, Molecular Biology and Evolution, Volume 38, Issue 10, October 2021, Pages 4647–4654, + > Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. 2021. BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes, Molecular Biology and Evolution, Volume 38, Issue 10, October 2021, Pages 4647–4654, doi: 10.1093/molbev/msab199 -- GffRead, [MIT](https://github.com/gpertea/gffread/blob/master/LICENSE) +- EDTA, [GPL v3](https://github.com/oushujun/EDTA/blob/master/LICENSE) - > Pertea G, Pertea M. GFF Utilities: GffRead and GffCompare. F1000Res. 2020 Apr 28;9:ISCB Comm J-304. doi: . PMID: 32489650; PMCID: PMC7222033. + > Ou S., Su W., Liao Y., Chougule K., Agda J. R. A., Hellinga A. J., Lugo C. S. B., Elliott T. A., Ware D., Peterson T., Jiang N., Hirsch C. N. and Hufford M. B. (2019). Benchmarking Transposable Element Annotation Methods for Creation of a Streamlined, Comprehensive Pipeline. Genome Biol. 
20(1): 275. doi: 10.1186/s13059-019-1905-y -- SEQKIT, [MIT](https://github.com/shenwei356/seqkit/blob/master/LICENSE) +- EggNOG-mapper, [GPL v3](https://github.com/eggnogdb/eggnog-mapper/blob/master/LICENSE.txt) - > Shen W, Le S, Li Y, Hu F. 2016. SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation. PLoS ONE 11(10): e0163962. + > eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Carlos P. Cantalapiedra, Ana Hernandez-Plaza, Ivica Letunic, Peer Bork, Jaime Huerta-Cepas. 2021. Molecular Biology and Evolution, msab293, doi: 10.1093/molbev/msab293 -- FASTP, [MIT](https://github.com/OpenGene/fastp/blob/master/LICENSE) +- FASTQC, [GPL v3](https://github.com/s-andrews/FastQC/blob/master/LICENSE.txt) - > Chen S, Zhou Y, Chen Y, Gu J. 2018. fastp: an ultra-fast all-in-one FASTQ preprocessor, Bioinformatics, Volume 34, Issue 17, 01 September 2018, Pages i884–i890, + > Andrews, S. (2010). Babraham Bioinformatics - FastQC a quality control tool for high throughput sequence data. Babraham.ac.uk. url: https://www.bioinformatics.babraham.ac.uk/projects/fastqc -- FASTQC, [GPL v3](https://github.com/s-andrews/FastQC/blob/master/LICENSE.txt) +- FASTP, [MIT](https://github.com/OpenGene/fastp/blob/master/LICENSE) - > + > Chen S, Zhou Y, Chen Y, Gu J. 2018. fastp: an ultra-fast all-in-one FASTQ preprocessor, Bioinformatics, Volume 34, Issue 17, 01 September 2018, Pages i884–i890, doi: 10.1093/bioinformatics/bty560 -- AGAT, [GPL v3](https://github.com/NBISweden/AGAT/blob/master/LICENSE) +- GeneMark-ETP, [Attribution-NonCommercial-ShareAlike 4.0 International](https://github.com/gatech-genemark/GeneMark-ETP/blob/main/License-Creative-Commons-Attribution-NonCommercial-ShareAlike-4.0-International.txt) - > Jacques Dainat, Darío Hereñú, Dr. K. D. Murray, Ed Davis, Ivan Ugrin, Kathryn Crouch, LucileSol, Nuno Agostinho, pascal-git, Zachary Zollman, & tayyrov. (2024). NBISweden/AGAT: AGAT-v1.4.1 (v1.4.1). 
Zenodo. https://doi.org/10.5281/zenodo.13799920 + > Brůna T, Lomsadze A, Borodovsky M. GeneMark-ETP significantly improves the accuracy of automatic annotation of large eukaryotic genomes. Genome Res. 2024 Jun 25;34(5):757-768. doi: 10.1101/gr.278373.123. PMID: 38866548; PMCID: PMC11216313. -- BRAKER, [Artistic license-1.0](https://github.com/Gaius-Augustus/BRAKER/blob/master/LICENSE.TXT) +- GenomeTools, [ISC](http://genometools.org/license.html) - > Stanke, M., Diekhans, M., Baertsch, R. and Haussler, D. (2008). Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics, doi: 10.1093/bioinformatics/btn013. + > Gremme G, Steinbiss S, Kurtz S. 2013. "GenomeTools: A Comprehensive Software Library for Efficient Processing of Structured Genome Annotations," in IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 10, no. 3, pp. 645-656, May 2013, doi: 10.1109/TCBB.2013.68 - > Stanke. M., Schöffmann, O., Morgenstern, B. and Waack, S. (2006). Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sources. BMC Bioinformatics 7, 62. +- GffCompare, [MIT](https://github.com/gpertea/gffcompare/blob/master/LICENSE) - > Gabriel, L., Bruna, T., Hoff, K. J., Ebel, M., Lomsadze, A., Borodovsky, M., Stanke, M. (2023). BRAKER3: Fully Automated Genome Annotation Using RNA-Seq and Protein Evidence with GeneMark-ETP, AUGUSTUS and TSEBRA. bioRxiV, doi: 10.1101/2023.06.10.544449. + > Pertea G, Pertea M. GFF Utilities: GffRead and GffCompare. F1000Res. 2020 Apr 28;9:ISCB Comm J-304. doi: 10.12688/f1000research.23297.2. PMID: 32489650; PMCID: PMC7222033. - > Bruna, T., Lomsadze, A., Borodovsky, M. (2023). GeneMark-ETP: Automatic Gene Finding in Eukaryotic Genomes in Consistence with Extrinsic Data. bioRxiv, doi: 10.1101/2023.01.13.524024. +- GffRead, [MIT](https://github.com/gpertea/gffread/blob/master/LICENSE) - > Kovaka, S., Zimin, A. V., Pertea, G. M., Razaghi, R., Salzberg, S. 
L., & Pertea, M. (2019). Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Genome biology, 20(1):1-13. + > Pertea G, Pertea M. GFF Utilities: GffRead and GffCompare. F1000Res. 2020 Apr 28;9:ISCB Comm J-304. doi: 10.12688/f1000research.23297.2. PMID: 32489650; PMCID: PMC7222033. -- EDTA, [GPL v3](https://github.com/oushujun/EDTA/blob/master/LICENSE) +- Liftoff, [GPL v3](https://github.com/agshumate/Liftoff/blob/master/LICENSE.md) - > Ou S., Su W., Liao Y., Chougule K., Agda J. R. A., Hellinga A. J., Lugo C. S. B., Elliott T. A., Ware D., Peterson T., Jiang N., Hirsch C. N. and Hufford M. B. (2019). Benchmarking Transposable Element Annotation Methods for Creation of a Streamlined, Comprehensive Pipeline. Genome Biol. 20(1): 275. doi: + > Shumate A, Salzberg SL. Liftoff: accurate mapping of gene annotations. Bioinformatics. 2021 Jul 19;37(12):1639-1643. doi: 10.1093/bioinformatics/btaa1016. PMID: 33320174; PMCID: PMC8289374. -- RepeatMasker, [Open Software License v. 2.1](https://github.com/rmhubley/RepeatMasker/blob/master/LICENSE) +- OrthoFinder, [GPL v3](https://github.com/davidemms/OrthoFinder/blob/master/License.md) - > + > Emms, D.M., Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol 20, 238 (2019). doi: 10.1186/s13059-019-1832-y -- EggNOG-mapper, [GPL v3](https://github.com/eggnogdb/eggnog-mapper/blob/master/LICENSE.txt) +- ProtHint, [License for GeneMark family software ("Product")](https://github.com/gatech-genemark/ProtHint/blob/master/LICENSE) + + > Tomáš Brůna, Alexandre Lomsadze, Mark Borodovsky, GeneMark-EP+: eukaryotic gene prediction with self-training in the space of genes and proteins, NAR Genomics and Bioinformatics, Volume 2, Issue 2, June 2020, lqaa026, doi: 10.1093/nargab/lqaa026 - > eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Carlos P. 
Cantalapiedra, Ana Hernandez-Plaza, Ivica Letunic, Peer Bork, Jaime Huerta-Cepas. 2021. Molecular Biology and Evolution, msab293, +- py_fasta_validator, [MIT](https://github.com/linsalrob/py_fasta_validator/blob/master/LICENSE) - > eggNOG 5.0: a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses. Jaime Huerta-Cepas, Damian Szklarczyk, Davide Heller, Ana Hernández-Plaza, Sofia K Forslund, Helen Cook, Daniel R Mende, Ivica Letunic, Thomas Rattei, Lars J Jensen, Christian von Mering, Peer Bork Nucleic Acids Res. 2019 Jan 8; 47(Database issue): D309–D314. doi: 10.1093/nar/gky1085 + > Edwards, R.A. 2019. fasta_validate: a fast and efficient fasta validator written in pure C. doi: 10.5281/zenodo.2532044 - > Sensitive protein alignments at tree-of-life scale using DIAMOND. Buchfink B, Reuter K, Drost HG. 2021. Nature Methods 18, 366–368 (2021). +- RepeatMasker, [Open Software License v. 2.1](https://github.com/rmhubley/RepeatMasker/blob/master/LICENSE) -- Liftoff, [GPL v3](https://github.com/agshumate/Liftoff/blob/master/LICENSE.md) + > Smit, A., & Hubley, R. (2023). RepeatMasker: a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences. Repeatmasker.org. url: https://www.repeatmasker.org - > Shumate, Alaina, and Steven L. Salzberg. 2020. “Liftoff: Accurate Mapping of Gene Annotations.” Bioinformatics , December. +- RepeatModeler, [Open Software License v. 2.1](https://github.com/Dfam-consortium/RepeatModeler/blob/master/LICENSE) -- OrthoFinder, [GPL v3](https://github.com/davidemms/OrthoFinder/blob/master/License.md) + > Hubley, R. (2023). RepeatModeler: a de novo transposable element (TE) family identification and modeling package. Repeatmasker.org. url: https://www.repeatmasker.org - > Emms, D.M. and Kelly, S. (2019) OrthoFinder: phylogenetic orthology inference for comparative genomics. 
Genome Biology 20:238 +- Samtools, [MIT/Expat](https://github.com/samtools/samtools/blob/develop/LICENSE) - > Emms, D.M. and Kelly, S. (2015) OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biology 16:157 + > Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, Whitwham A, Keane T, McCarthy SA, Davies RM, Li H. 2021. Twelve years of SAMtools and BCFtools, GigaScience, Volume 10, Issue 2, February 2021, giab008, doi: 10.1093/gigascience/giab008 -- RepeatModeler, [Open Software License v. 2.1](https://github.com/Dfam-consortium/RepeatModeler/blob/master/LICENSE) +- SeqKit, [MIT](https://github.com/shenwei356/seqkit/blob/master/LICENSE) - > + > Shen W, Le S, Li Y, Hu F. 2016. SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation. PLoS ONE 11(10): e0163962. doi: 10.1371/journal.pone.0163962 -- sortmerna, [GPL v3](https://github.com/sortmerna/sortmerna/blob/master/LICENSE.txt) +- SortMeRNA, [GPL v3](https://github.com/sortmerna/sortmerna/blob/master/LICENSE.txt) - > Kopylova E., Noé L. and Touzet H., "SortMeRNA: Fast and accurate filtering of ribosomal RNAs in metatranscriptomic data", Bioinformatics (2012), doi: . + > Kopylova E., Noé L. and Touzet H., "SortMeRNA: Fast and accurate filtering of ribosomal RNAs in metatranscriptomic data", Bioinformatics (2012), doi: 10.1093/bioinformatics/bts611 - STAR, [MIT](https://github.com/alexdobin/STAR/blob/master/LICENSE) - > Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013 Jan 1;29(1):15-21. doi: . Epub 2012 Oct 25. PMID: 23104886; PMCID: PMC3530905. + > Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013 Jan 1;29(1):15-21. doi: 10.1093/bioinformatics/bts635. Epub 2012 Oct 25. 
PMID: 23104886; PMCID: PMC3530905. - TSEBRA, [The Artistic License 2.0](https://github.com/Gaius-Augustus/TSEBRA/blob/main/bin/LICENSE.txt) - > Gabriel, L., Hoff, K.J., Brůna, T. et al. TSEBRA: transcript selector for BRAKER. BMC Bioinformatics 22, 566 (2021). + > Gabriel, L., Hoff, K.J., Brůna, T. et al. TSEBRA: transcript selector for BRAKER. BMC Bioinformatics 22, 566 (2021). doi: 10.1186/s12859-021-04482-0 ## Software packaging/containerisation tools diff --git a/subworkflows/local/utils_nfcore_genepal_pipeline/main.nf b/subworkflows/local/utils_nfcore_genepal_pipeline/main.nf index 1c04ef3..52c0325 100644 --- a/subworkflows/local/utils_nfcore_genepal_pipeline/main.nf +++ b/subworkflows/local/utils_nfcore_genepal_pipeline/main.nf @@ -435,33 +435,76 @@ def validateBamMetadata(metas, bams, permAssString) { // Generate methods description for MultiQC // -def toolCitationText() { +def toolCitationText(versions_yml) { + + def v_text = versions_yml.text.toLowerCase() + + def start_text = 'Tools used in the workflow included: ' + def end_text = ' and MultiQC (Ewels et al. 2016).' + def citation_text = [ - 'Tools used in the workflow included:', - 'AGAT (Dainat et al. 2024)', - 'BRAKER (Gabriel et al. 2023)', - 'BUSCO (Manni et al. 2021)', - 'EggNOG-mapper (Carlos et al. 2021)', - 'GffRead (Pertea et al. 2020)', - ].join(', ').trim() + ' and MultiQC (Ewels et al. 2016).' - - return citation_text + false ? '' : 'AGAT (Dainat et al. 2024)', + false ? '' : 'AUGUSTUS (Sommerfeld et al. 2009)', + false ? '' : 'BRAKER3 (Gabriel et al. 2023)', + ( ! v_text.contains('busco:') ) ? '' : 'BUSCO (Manni et al. 2021)', + ( ! v_text.contains('edta:') ) ? '' : 'EDTA (Ou et al. 2019)', + ( ! v_text.contains('eggnog-mapper:') ) ? '' : 'EggNOG-mapper (Cantalapiedra et al. 2021)', + ( ! v_text.contains('fastqc:') ) ? '' : 'FASTQC (Andrews. 2010)', + ( ! v_text.contains('fastp:') ) ? '' : 'FASTP (Chen et al. 2018)', + false ? '' : 'GeneMark-ETP (Brůna et al. 2024)', + false ?
'' : 'GenomeTools (Gremme et al. 2013)', + ( ! v_text.contains('gffcompare:') ) ? '' : 'GffCompare (Pertea & Pertea. 2020)', + false ? '' : 'GffRead (Pertea & Pertea. 2020)', + ( ! v_text.contains('liftoff:') ) ? '' : 'Liftoff (Shumate & Salzberg. 2021)', + ( ! v_text.contains('orthofinder:') ) ? '' : 'OrthoFinder (Emms & Kelly. 2019)', + false ? '' : 'ProtHint (Brůna et al. 2020)', + false ? '' : 'py_fasta_validator (Edwards. 2019)', + ( ! v_text.contains('repeatmasker:') ) ? '' : 'RepeatMasker (Smit & Hubley. 2023)', + ( ! v_text.contains('repeatmodeler:') ) ? '' : 'RepeatModeler (Hubley. 2023)', + false ? '' : 'Samtools (Danecek et al. 2021)', + false ? '' : 'SeqKit (Shen et al. 2016)', + ( ! v_text.contains('sortmerna:') ) ? '' : 'SortMeRNA (Kopylova et al. 2012)', + ( ! v_text.contains('star:') ) ? '' : 'STAR (Dobin et al. 2013)', + false ? '' : 'TSEBRA (Gabriel et al. 2021)', + ].findAll { it != '' }.join(', ').trim() + + return start_text + citation_text + end_text } -def toolBibliographyText() { +def toolBibliographyText(versions_yml) { + + def v_text = versions_yml.text.toLowerCase() + def reference_text = [ - 'Jacques Dainat, Darío Hereñú, Dr. K. D. Murray, Ed Davis, Ivan Ugrin, Kathryn Crouch, LucileSol, Nuno Agostinho, pascal-git, Zachary Zollman, & tayyrov. (2024). NBISweden/AGAT: AGAT-v1.4.1 (v1.4.1). Zenodo. 10.5281/zenodo.13799920', - 'Gabriel, L., Bruna, T., Hoff, K. J., Ebel, M., Lomsadze, A., Borodovsky, M., Stanke, M. (2023). BRAKER3: Fully Automated Genome Annotation Using RNA-Seq and Protein Evidence with GeneMark-ETP, AUGUSTUS and TSEBRA. bioRxiV, doi: 10.1101/2023.06.10.544449.', - 'Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. 2021. 
BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes, Molecular Biology and Evolution, Volume 38, Issue 10, October 2021, Pages 4647–4654, 10.1093/molbev/msab199', - 'eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Carlos P. Cantalapiedra, Ana Hernandez-Plaza, Ivica Letunic, Peer Bork, Jaime Huerta-Cepas. 2021. Molecular Biology and Evolution, msab293, 10.1093/molbev/msab293', - 'Pertea G, Pertea M. GFF Utilities: GffRead and GffCompare. F1000Res. 2020 Apr 28;9:ISCB Comm J-304. doi: 10.12688/f1000research.23297.2. PMID: 32489650; PMCID: PMC7222033.', - 'Ewels, P., Magnusson, M., Lundin, S., & Käller, M. (2016). MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics , 32(19), 3047–3048. doi: 10.1093/bioinformatics/btw354', - ].collect { it -> it != "" ? "
<li>$it</li>" : '' }.join(' ').trim() + false ? '' : 'Jacques Dainat, Darío Hereñú, Dr. K. D. Murray, Ed Davis, Ivan Ugrin, Kathryn Crouch, LucileSol, Nuno Agostinho, pascal-git, Zachary Zollman, & tayyrov. (2024). NBISweden/AGAT: AGAT-v1.4.1 (v1.4.1). Zenodo. 10.5281/zenodo.13799920', + false ? '' : 'Sommerfeld, D., Lingner, T., Stanke, M., Morgenstern, B., & Richter, H. (2009). AUGUSTUS at MediGRID: Adaption of a bioinformatics application to grid computing for efficient genome analysis. Future Gener. Comput. Syst., 25, 337-345. doi: 10.1016/j.future.2008.05.010', + false ? '' : 'Gabriel, L., Bruna, T., Hoff, K. J., Ebel, M., Lomsadze, A., Borodovsky, M., Stanke, M. (2023). BRAKER3: Fully Automated Genome Annotation Using RNA-Seq and Protein Evidence with GeneMark-ETP, AUGUSTUS and TSEBRA. bioRxiV, doi: 10.1101/2023.06.10.544449', + ( ! v_text.contains('busco:') ) ? '' : 'Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. 2021. BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes, Molecular Biology and Evolution, Volume 38, Issue 10, October 2021, Pages 4647–4654, doi: 10.1093/molbev/msab199', + ( ! v_text.contains('edta:') ) ? '' : 'Ou S., Su W., Liao Y., Chougule K., Agda J. R. A., Hellinga A. J., Lugo C. S. B., Elliott T. A., Ware D., Peterson T., Jiang N., Hirsch C. N. and Hufford M. B. (2019). Benchmarking Transposable Element Annotation Methods for Creation of a Streamlined, Comprehensive Pipeline. Genome Biol. 20(1): 275. doi: 10.1186/s13059-019-1905-y', + ( ! v_text.contains('eggnog-mapper:') ) ? '' : 'eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Carlos P. Cantalapiedra, Ana Hernandez-Plaza, Ivica Letunic, Peer Bork, Jaime Huerta-Cepas. 2021. Molecular Biology and Evolution, msab293, doi: 10.1093/molbev/msab293', + ( ! v_text.contains('fastqc:') ) ? '' : 'Andrews, S. (2010).
Babraham Bioinformatics - FastQC a quality control tool for high throughput sequence data. Babraham.ac.uk. url: https://www.bioinformatics.babraham.ac.uk/projects/fastqc', + ( ! v_text.contains('fastp:') ) ? '' : 'Chen S, Zhou Y, Chen Y, Gu J. 2018. fastp: an ultra-fast all-in-one FASTQ preprocessor, Bioinformatics, Volume 34, Issue 17, 01 September 2018, Pages i884–i890, doi: 10.1093/bioinformatics/bty560', + false ? '' : 'Brůna T, Lomsadze A, Borodovsky M. GeneMark-ETP significantly improves the accuracy of automatic annotation of large eukaryotic genomes. Genome Res. 2024 Jun 25;34(5):757-768. doi: 10.1101/gr.278373.123. PMID: 38866548; PMCID: PMC11216313.', + false ? '' : 'Gremme G, Steinbiss S, Kurtz S. 2013. "GenomeTools: A Comprehensive Software Library for Efficient Processing of Structured Genome Annotations," in IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 10, no. 3, pp. 645-656, May 2013, doi: 10.1109/TCBB.2013.68', + ( ! v_text.contains('gffcompare:') ) ? '' : 'Pertea G, Pertea M. GFF Utilities: GffRead and GffCompare. F1000Res. 2020 Apr 28;9:ISCB Comm J-304. doi: 10.12688/f1000research.23297.2. PMID: 32489650; PMCID: PMC7222033.', + ( ! v_text.contains('liftoff:') ) ? '' : 'Shumate A, Salzberg SL. Liftoff: accurate mapping of gene annotations. Bioinformatics. 2021 Jul 19;37(12):1639-1643. doi: 10.1093/bioinformatics/btaa1016. PMID: 33320174; PMCID: PMC8289374.', + ( ! v_text.contains('orthofinder:') ) ? '' : 'Emms, D.M., Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol 20, 238 (2019). doi: 10.1186/s13059-019-1832-y', + false ? '' : 'Tomáš Brůna, Alexandre Lomsadze, Mark Borodovsky, GeneMark-EP+: eukaryotic gene prediction with self-training in the space of genes and proteins, NAR Genomics and Bioinformatics, Volume 2, Issue 2, June 2020, lqaa026, doi: 10.1093/nargab/lqaa026', + false ? '' : 'Edwards, R.A. 2019. fasta_validate: a fast and efficient fasta validator written in pure C. 
doi: 10.5281/zenodo.2532044', + ( ! v_text.contains('repeatmasker:') ) ? '' : 'Smit, A., & Hubley, R. (2023). RepeatMasker: a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences. Repeatmasker.org. url: https://www.repeatmasker.org', + ( ! v_text.contains('repeatmodeler:') ) ? '' : 'Hubley, R. (2023). RepeatModeler: a de novo transposable element (TE) family identification and modeling package. Repeatmasker.org. url: https://www.repeatmasker.org', + false ? '' : 'Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, Whitwham A, Keane T, McCarthy SA, Davies RM, Li H. 2021. Twelve years of SAMtools and BCFtools, GigaScience, Volume 10, Issue 2, February 2021, giab008, doi: 10.1093/gigascience/giab008', + false ? '' : 'Shen W, Le S, Li Y, Hu F. 2016. SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation. PLoS ONE 11(10): e0163962. doi: 10.1371/journal.pone.0163962', + ( ! v_text.contains('sortmerna:') ) ? '' : 'Kopylova E., Noé L. and Touzet H., "SortMeRNA: Fast and accurate filtering of ribosomal RNAs in metatranscriptomic data", Bioinformatics (2012), doi: 10.1093/bioinformatics/bts611', + ( ! v_text.contains('star:') ) ? '' : 'Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013 Jan 1;29(1):15-21. doi: 10.1093/bioinformatics/bts635. Epub 2012 Oct 25. PMID: 23104886; PMCID: PMC3530905.', + false ? '' : 'Gabriel, L., Hoff, K.J., Brůna, T. et al. TSEBRA: transcript selector for BRAKER. BMC Bioinformatics 22, 566 (2021). doi: 10.1186/s12859-021-04482-0', + false ? '' : 'Ewels, P., Magnusson, M., Lundin, S., & Käller, M. (2016). MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics , 32(19), 3047–3048. doi: 10.1093/bioinformatics/btw354', + ].collect { it -> it != '' ? "
<li>$it</li>" : '' }.join(' ').trim() return reference_text } -def methodsDescriptionText(mqc_methods_yaml) { +def methodsDescriptionText(mqc_methods_yaml, versions_yml) { // Convert to a named map so can be used as with familar NXF ${workflow} variable syntax in the MultiQC YML file def meta = [:] meta.workflow = workflow.toMap() @@ -483,8 +526,8 @@ def methodsDescriptionText(mqc_methods_yaml) { meta["tool_citations"] = "" meta["tool_bibliography"] = "" - meta["tool_citations"] = toolCitationText().replaceAll(", \\.", ".").replaceAll("\\. \\.", ".").replaceAll(", \\.", ".") - meta["tool_bibliography"] = toolBibliographyText() + meta["tool_citations"] = toolCitationText(versions_yml).replaceAll(", \\.", ".").replaceAll("\\. \\.", ".").replaceAll(", \\.", ".") + meta["tool_bibliography"] = toolBibliographyText(versions_yml) def methods_text = mqc_methods_yaml.text diff --git a/workflows/genepal.nf b/workflows/genepal.nf index 15042f6..56661c9 100644 --- a/workflows/genepal.nf +++ b/workflows/genepal.nf @@ -282,7 +282,10 @@ workflow GENEPAL { ch_workflow_summary = Channel.value( paramsSummaryMultiqc ( paramsSummaryMap(workflow, parameters_schema: "nextflow_schema.json") ) ) | collectFile(name: 'workflow_summary_mqc.yaml') - ch_methods_description = Channel.value( methodsDescriptionText ( file("$projectDir/assets/methods_description_template.yml", checkIfExists: true) ) ) + ch_methods_description = ch_versions_yml + | map { versions_yml -> + methodsDescriptionText ( file("$projectDir/assets/methods_description_template.yml", checkIfExists: true), versions_yml ) + } | collectFile(name: 'methods_description_mqc.yaml', sort: true) ch_multiqc_extra_files = Channel.empty()