Rework output directory #12

Open
jackscanlan opened this issue Jul 26, 2024 · 0 comments
Labels
rework Redoing or refining something

Comments

@jackscanlan
Collaborator

jackscanlan commented Jul 26, 2024

The ./output directory currently has the following structure after a run:

output
├── logs
│   ├── K739J
│   └── K77JP
├── modules
│   ├── assignment_plot
│   ├── dada_mergereads
│   ├── dada_priors
│   ├── denoise
│   ├── error_model
│   ├── filter_qualplots
│   ├── filter_seqtab
│   ├── joint_tax
│   ├── merge_tax
│   ├── parse_inputs
│   ├── phyloseq_filter
│   ├── phyloseq_merge
│   ├── phyloseq_unfiltered
│   ├── primer_trim
│   ├── read_filter
│   ├── read_tracking
│   ├── split_loci
│   ├── tax_blast
│   ├── tax_idtaxa
│   ├── tax_summary
│   └── tax_summary_merge
├── rds
├── results
│   ├── filtered
│   └── unfiltered
└── temp

logs, rds, results and temp are not currently used by the pipeline. modules stores the output channel files from every process, organised by module name. This was mainly useful during initial development, as a way to check the outputs of each process without digging into the work directory. For users, though, it is hard to find the relevant output files without knowing the structure of the pipeline well.

First thoughts:

  1. Remove logs, temp and rds
  2. Create a directory (maybe called rdata or rda) to store the .rda files produced when --rdata true is used, so users don't have to dig into work directories; maybe create this directory dynamically during the run, only if --rdata true is set
  3. Use results to save filtered and unfiltered output files like the original pipeline, but also have folders for the QC plots
  4. Have a pipeline_info folder, like nf-core pipelines do, that contains the trace, DAG, report and timeline files describing the pipeline execution
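
As a rough illustration of points 2 to 4, here is a minimal nextflow.config sketch. It is not the pipeline's actual configuration: params.outdir, the process name PHYLOSEQ_FILTER and the output file names are assumptions made up for the example, and only --rdata comes from the pipeline itself. The trace, report, timeline and dag scopes and the publishDir options used (path, mode, pattern, enabled) are standard Nextflow settings.

```groovy
// nextflow.config (sketch only; names marked "assumed" are not from the pipeline)

// Assumed parameter for the top-level output directory
params.outdir = './output'

// Point 4: execution reports collected in a pipeline_info folder
trace {
    enabled   = true
    overwrite = true   // avoids an error if the file exists from a previous run
    file      = "${params.outdir}/pipeline_info/execution_trace.txt"
}
report {
    enabled   = true
    overwrite = true
    file      = "${params.outdir}/pipeline_info/execution_report.html"
}
timeline {
    enabled   = true
    overwrite = true
    file      = "${params.outdir}/pipeline_info/execution_timeline.html"
}
dag {
    enabled   = true
    overwrite = true
    file      = "${params.outdir}/pipeline_info/pipeline_dag.html"
}

process {
    // Assumed process name; the real process names may differ
    withName: 'PHYLOSEQ_FILTER' {
        publishDir = [
            // Point 3: user-facing results
            // (a pattern could be added here to narrow what gets copied)
            [
                path: "${params.outdir}/results/filtered",
                mode: 'copy'
            ],
            // Point 2: .rda files are published only when --rdata true is set
            [
                path: "${params.outdir}/rdata",
                mode: 'copy',
                pattern: '*.rda',
                enabled: params.rdata
            ]
        ]
    }
}
```

Because publishDir only creates its target directory when there is something to publish, the rdata folder in this sketch would appear only on runs where --rdata true is set, which roughly matches the "create dynamically" idea in point 2.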
jackscanlan added the rework label Jul 26, 2024
jackscanlan self-assigned this Aug 1, 2024