feat!: replace all bbtool wrappers with one bbtool wrapper for all subcommands (#1322)

<!-- Ensure that the PR title follows conventional commit style (<type>:
<description>)-->
<!-- Possible types are here:
https://github.com/commitizen/conventional-commit-types/blob/master/index.json
-->

### Description

I added a generic bbtool wrapper.
It is based on #2

BBTools has many bash scripts. This wrapper should make it possible to run
all of them.
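
To illustrate the idea, here is a minimal sketch of how such a generic wrapper could turn rule keywords into BBTools `key=value` arguments, following the rules stated in the new `meta.yaml` (`input` and its aliases become `in`, exactly two files for `in`/`out`/`outm`/`outu` become `key1`/`key2`, `flag` is dropped, longer lists are comma-joined). The function names `to_key_value_args` and `build_command` are hypothetical and this is not the wrapper's actual code. How exactly two values for a non-paired key are handled is an assumption (comma-joined here).

```python
# Hypothetical sketch, not the wrapper shipped in this commit.
PAIRED_KEYS = {"in", "out", "outm", "outu"}  # keys that split into key1/key2
INPUT_ALIASES = {"input", "sample", "reads"}  # all translated to 'in'
SKIPPED_KEYS = {"flag", "command", "extra"}  # never passed as key=value


def to_key_value_args(keywords):
    """Translate a keyword -> value(s) mapping into BBTools key=value strings."""
    args = []
    for key, value in keywords.items():
        if key in SKIPPED_KEYS:
            continue
        if key in INPUT_ALIASES:
            key = "in"
        # Accept a single scalar as well as a list/tuple of values.
        if isinstance(value, (str, int, float)):
            values = [str(value)]
        else:
            values = [str(v) for v in value]
        if key in PAIRED_KEYS and len(values) == 2:
            args.append(f"{key}1={values[0]}")
            args.append(f"{key}2={values[1]}")
        else:
            # Assumption: any other multi-value entry is comma-joined.
            args.append(f"{key}={','.join(values)}")
    return args


def build_command(command, inputs, outputs, params):
    """Assemble the full command line from the three keyword mappings."""
    parts = [command]
    for mapping in (inputs, outputs, params):
        parts.extend(to_key_value_args(mapping))
    extra = params.get("extra", "")
    if extra:
        parts.append(extra)
    return " ".join(parts)


example = build_command(
    "bbduk.sh",
    {"input": ["a.1.fastq", "a.2.fastq"], "flag": "done.txt"},
    {"out": ["trimmed.1.fastq", "trimmed.2.fastq"]},
    {"command": "bbduk.sh", "extra": "", "ktrim": "r", "ref": ["adapters", "phix", "artifacts"]},
)
print(example)
```

Keeping the translation in one pure function makes the mapping easy to test without invoking any BBTools script.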

### QC
<!-- Make sure that you can tick the boxes below. -->

* [x] I confirm that:

For all wrappers added by this PR, 

* there is a test case which covers any introduced changes,
* `input:` and `output:` file paths in the resulting rule can be changed
arbitrarily,
* either the wrapper can only use a single core, or the example rule
contains a `threads: x` statement with `x` being a reasonable default,
* rule names in the test case are in
[snake_case](https://en.wikipedia.org/wiki/Snake_case) and somehow tell
what the rule is about or match the tool's purpose or name (e.g.,
`map_reads` for a step that maps reads),
* all `environment.yaml` specifications follow [the respective best
practices](https://stackoverflow.com/a/64594513/2352071),
* wherever possible, command line arguments are inferred and set
automatically (e.g. based on file extensions in `input:` or `output:`),
* all fields of the example rules in the `Snakefile`s and their entries
are explained via comments (`input:`/`output:`/`params:` etc.),
* `stderr` and/or `stdout` are logged correctly (`log:`), depending on
the wrapped tool,
* temporary files are either written to a unique hidden folder in the
working directory, or (better) stored where the Python function
`tempfile.gettempdir()` points to (see
[here](https://docs.python.org/3/library/tempfile.html#tempfile.gettempdir);
this also means that using any Python `tempfile` default behavior
works),
* the `meta.yaml` contains a link to the documentation of the respective
tool or command,
* `Snakefile`s pass the linting (`snakemake --lint`),
* `Snakefile`s are formatted with
[snakefmt](https://github.com/snakemake/snakefmt),
* Python wrapper scripts are formatted with
[black](https://black.readthedocs.io).
* Conda environments use a minimal number of channels, in the recommended
order. E.g. for bioconda, use conda-forge, bioconda, nodefaults
(conda-forge should have the highest priority, and the defaults channel is
usually not needed because most packages are in conda-forge nowadays).
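
As an illustration of the temporary-file item in the checklist above, the following is a minimal sketch of a wrapper-style helper that relies on Python's default `tempfile` behavior, so scratch data lands under `tempfile.gettempdir()` and is cleaned up automatically. The helper name `write_scratch_file` and the file name `intermediate.txt` are made up for the example.

```python
import tempfile
from pathlib import Path


def write_scratch_file(payload: str) -> Path:
    """Write payload to a scratch file and return the directory that held it."""
    # With no 'dir' argument, TemporaryDirectory creates its folder
    # inside tempfile.gettempdir().
    with tempfile.TemporaryDirectory() as tmpdir:
        scratch = Path(tmpdir) / "intermediate.txt"  # hypothetical file name
        scratch.write_text(payload)
        # ... a real wrapper would run the wrapped tool here ...
        return Path(tmpdir)


used_dir = write_scratch_file("ACGT\n")
print(used_dir.parent)  # the system temp directory
print(used_dir.exists())  # the scratch folder is removed on exit
```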

---------

Co-authored-by: Filipe G. Vieira <[email protected]>
Co-authored-by: David Laehnemann <[email protected]>
3 people authored Nov 14, 2023
1 parent 61f7ab3 commit 6eb3c22
Showing 28 changed files with 809 additions and 389 deletions.
18 changes: 0 additions & 18 deletions bio/bbtools/bbduk/meta.yaml

This file was deleted.

38 changes: 0 additions & 38 deletions bio/bbtools/bbduk/test/Snakefile

This file was deleted.

4 changes: 0 additions & 4 deletions bio/bbtools/bbduk/test/reads/pe/a.1.fastq

This file was deleted.

4 changes: 0 additions & 4 deletions bio/bbtools/bbduk/test/reads/pe/a.2.fastq

This file was deleted.

4 changes: 0 additions & 4 deletions bio/bbtools/bbduk/test/reads/se/a.fastq

This file was deleted.

50 changes: 0 additions & 50 deletions bio/bbtools/bbduk/wrapper.py

This file was deleted.

@@ -4,4 +4,5 @@ channels:
   - nodefaults
 dependencies:
   - bbmap =39.01
-  - snakemake-wrapper-utils =0.5.3
+  - snakemake-wrapper-utils =0.6.2
+  - pigz=2
7 changes: 0 additions & 7 deletions bio/bbtools/loglog/environment.yaml

This file was deleted.

12 changes: 0 additions & 12 deletions bio/bbtools/loglog/meta.yaml

This file was deleted.

26 changes: 0 additions & 26 deletions bio/bbtools/loglog/test/Snakefile

This file was deleted.

4 changes: 0 additions & 4 deletions bio/bbtools/loglog/test/reads/pe/a.1.fastq

This file was deleted.

4 changes: 0 additions & 4 deletions bio/bbtools/loglog/test/reads/pe/a.2.fastq

This file was deleted.

4 changes: 0 additions & 4 deletions bio/bbtools/loglog/test/reads/se/a.fastq

This file was deleted.

26 changes: 0 additions & 26 deletions bio/bbtools/loglog/wrapper.py

This file was deleted.

138 changes: 138 additions & 0 deletions bio/bbtools/meta.yaml
@@ -0,0 +1,138 @@
name: Generic
url: https://jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/
description: |
Generic wrapper for BBTools. One pattern to run them all.
authors:
- Silas Kieser
input:
- input: All keywords in input are passed to the bbtool as key=value pairs. A special case is 'input', which is translated to 'in'. If exactly two files are provided for 'input', the wrapper parses them as in1, in2.
- flag: The keyword 'flag' is ignored; it can be used to specify input files that are not processed by the bbtool being run.
output:
- out: If exactly two files are provided for 'out', the wrapper parses them as out1, out2.
- outm: Same as 'out'
- outu: Same as 'out'
- flag: The keyword 'flag' is not passed to the command.
params:
- command: Required parameter defining the command to be used e.g. 'bbmap.sh'
- extra: additional program arguments. All other parameters are passed to the bbtool as key=value pairs
notes: |
This wrapper can run any of the bbtools. The command is defined by the 'command' parameter, which is required.
**All keywords in input, output, params are passed as key value pairs to the command!**
Take care that they are valid for the bbtool.
As it is not possible to define 'in' as a keyword, the keyword 'input' is used instead. Allowed aliases are 'sample' and 'reads'.
### Paired input/output files
If exactly two files are provided for 'input', the wrapper parses them as in1, in2.
The same holds for the output keywords 'out', 'outm', 'outu'.
For all parameters, if more than two files are provided, the wrapper parses them as key=value1,value2,value3...
### Logging
The wrapper writes a detailed log of how it parses the parameters. If you want to use the log of a bbtool,
e.g. for parsing how many reads were processed, you have to specify a 'stdout' and a 'stderr' log file.
The wrapper will then write only to the stderr-log file.
### List of all the tools in the bbtools suite 39.01
In theory all of them are supported by this wrapper, but we didn't test them all.
Scripts with different input/output might not be supported by this wrapper.
If you find one that is not yet supported, please feel free to adjust this wrapper accordingly and include a test case.
- bbmap.sh
- removehuman.sh
- removehuman2.sh
- mapnt.sh
- mapPacBio.sh
- bbmapskimmer.sh
- bbsplit.sh
- bbwrap.sh
- pileup.sh
- summarizescafstats.sh
- filterbycoverage.sh
- mergeOTUs.sh
- bbest.sh
- postfilter.sh
- bbduk.sh
- bbduk2.sh
- seal.sh
- summarizeseal.sh
- loglog.sh
- kmercountexact.sh
- bbnorm.sh
- ecc.sh
- khist.sh
- bbcountunique.sh
- commonkmers.sh
- kmercoverage.sh
- callpeaks.sh
- tadpole.sh
- tadwrapper.sh
- kcompress.sh
- stats.sh
- statswrapper.sh
- countgc.sh
- fungalrelease.sh
- filterbytaxa.sh
- gi2taxid.sh
- gitable.sh
- sortbytaxa.sh
- splitbytaxa.sh
- taxonomy.sh
- taxtree.sh
- reducesilva.sh
- synthmda.sh
- crosscontaminate.sh
- decontaminate.sh
- crossblock.sh
- dedupe.sh
- dedupe2.sh
- dedupebymapping.sh
- clumpify.sh
- bbmerge.sh
- bbmerge-auto.sh
- bbmergegapped.sh
- randomreads.sh
- bbfakereads.sh
- gradesam.sh
- samtoroc.sh
- addadapters.sh
- grademerge.sh
- printtime.sh
- msa.sh
- cutprimers.sh
- idmatrix.sh
- matrixtocolumns.sh
- countbarcodes.sh
- filterbarcodes.sh
- mergebarcodes.sh
- removebadbarcodes.sh
- demuxbyname.sh
- filterbysequence.sh
- filterbyname.sh
- filtersubs.sh
- getreads.sh
- estherfilter.sh
- bbqc.sh
- rqcfilter.sh
- shred.sh
- fuse.sh
- shuffle.sh
- calcmem.sh
- textfile.sh
- countsharedlines.sh
- filterlines.sh
- a_sample_mt.sh
- bbmask.sh
- calctruequality.sh
- makechimeras.sh
- phylip2fasta.sh
- readlength.sh
- reformat.sh
- removesmartbell.sh
- rename.sh
- repair.sh
- bbsplitpairs.sh
- splitnextera.sh
- splitsam.sh
- testformat.sh
- translate6frames.sh
7 changes: 0 additions & 7 deletions bio/bbtools/tadpole/environment.yaml

This file was deleted.

17 changes: 0 additions & 17 deletions bio/bbtools/tadpole/meta.yaml

This file was deleted.
