classify_trycycler does not output barcode08_for_reconciliation/* #65

Open
fredjaya opened this issue Dec 18, 2024 · 0 comments
fredjaya commented Dec 18, 2024

The error

In /scratch/tj48/fj9712/ONT-bacpac-nf/work/88/c1dc55defedb0c4679fb8961ad6ae0

Caused by:
  Missing output file(s) `barcode08_for_reconciliation/*` expected by process `classify_trycycler (CLASSIFYING CONTIGS: barcode08)`

Description:

barcode08 has three clusters each with a single contig.

classify_trycycler_clusters.py correctly discards all these because a "good cluster" should have a contig from each assembler.

When every cluster is discarded, the cluster dir is moved to ${barcode}_discarded/ and the required ${barcode}_for_reconciliation/ folder is never created, because the elif branches do not handle this case.

Solution

Ideally replace classify_trycycler_clusters.py entirely. A suitable replacement will have the following features:

  • Does not move any files
  • Does not create any intermediate folders
  • Outputs a pass/fail assignment per cluster as a Nextflow/Groovy tuple, e.g.
[barcode01, cluster01, pass]
[barcode01, cluster02, fail]
[barcode02, cluster01, pass]
...
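
One way to meet these requirements is to have the replacement script report its assignments as text rather than touch the filesystem. A minimal sketch, assuming one `barcode,cluster,status` row per cluster; the function name `format_assignments`, the CSV layout, and the cluster names are illustrative and not part of the pipeline:

```python
import csv
import io


def format_assignments(barcode, statuses):
    """Return CSV text with one 'barcode,cluster,status' row per cluster.

    `statuses` maps a cluster name to "pass" or "fail". Nothing is moved
    or created on disk; the caller decides where the text goes.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for cluster in sorted(statuses):
        writer.writerow([barcode, cluster, statuses[cluster]])
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical cluster names mirroring the barcode08 case: every
    # cluster fails, yet a row is still written for each one.
    print(format_assignments("barcode08", {
        "cluster01": "fail",
        "cluster02": "fail",
        "cluster03": "fail",
    }), end="")
```

On the Nextflow side, rows like these could probably be turned into tuples with the `splitCsv` operator, but that wiring is not shown here.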

Some pseudocode for a single barcode:

  1. Input: path to directory containing trycycler clusters
  2. For each cluster, count the number of contigs per assembler. Maybe tabulate, e.g.
| cluster | assemblies |
| ------- | ---------- | 
| 01      | A, B       | 
| 02      | A          | 
| 03      | B          | 
| 04      | B, B       |
| 05      | A, A, B    | 
  3. Decide whether each cluster should be reconciled or not
| cluster | assemblies | decision  |
| ------- | ---------- | --------- |
| 01      | A, B       | reconcile |
| 02      | A          | discard   |
| 03      | B          | discard   |
| 04      | B, B       | discard   |
| 05      | A, A, B    | ?         |
  4. Output as Nextflow tuple
[barcode01, cluster01, reconcile]
[barcode01, cluster02, discard]
...

Or simply a channel containing only the clusters to reconcile:

[barcode01, cluster01]
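
The pseudocode above could be sketched in Python roughly as follows. This is not a drop-in replacement: it assumes the assembler label of each contig (A, B, ...) has already been extracted, and the names `classify_cluster`, `classify_barcode`, and `required` are hypothetical. The A, A, B case is left as "undecided" because the issue has not settled it.

```python
from collections import Counter


def classify_cluster(assemblers, required=("A", "B")):
    """Classify one cluster from the assembler label of each of its contigs.

    - "discard":   at least one required assembler has no contig
    - "reconcile": every required assembler has exactly one contig
    - "undecided": every assembler is present, but one has extra contigs
                   (the open "?" case in the table above)
    """
    counts = Counter(assemblers)  # missing labels count as 0
    if any(counts[name] == 0 for name in required):
        return "discard"
    if all(counts[name] == 1 for name in required):
        return "reconcile"
    return "undecided"


def classify_barcode(barcode, clusters, required=("A", "B")):
    """Return (barcode, cluster, decision) tuples; no files are moved."""
    return [
        (barcode, cluster, classify_cluster(labels, required))
        for cluster, labels in sorted(clusters.items())
    ]


if __name__ == "__main__":
    # The example table from the pseudocode, as a plain dict.
    table = {
        "cluster01": ["A", "B"],
        "cluster02": ["A"],
        "cluster03": ["B"],
        "cluster04": ["B", "B"],
        "cluster05": ["A", "A", "B"],
    }
    for row in classify_barcode("barcode01", table):
        print(row)
```

Because the function always returns a list, a barcode whose clusters are all discarded (as with barcode08) still yields output; the list of clusters to reconcile is simply empty.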
@fredjaya fredjaya added the bug Something isn't working label Dec 18, 2024
@fredjaya fredjaya self-assigned this Dec 18, 2024