Missing merged bam results from top level directory #10

Open
kevinmhadi opened this issue Feb 13, 2025 · 2 comments

@kevinmhadi

Describe the bug
Merged BAM results seem to be missing from the top-level alignment/ and/or parabricks/ output directories.

Environment (please complete the following information):

  • NYU
  • /gpfs/data/imielinskilab/projects/Clinical_NYU/Nextflow/2025-01-27___Batch003_Heme/
@shihabdider
Contributor

Not sure why this is happening.

My first hunch was that it might have something to do with the save_mapped flag (which indicates whether to save intermediate bams) but you have it set to true in the config. So that can't be it.

The merged BAMs and their BAIs should be published under alignment/ (in a mapped/<sample id>/ subdirectory, per the saveAs rule), as indicated by this code snippet:

withName: 'MERGE_BAM|INDEX_MERGE_BAM' {
    publishDir = [
        mode: params.publish_dir_mode,
        path: { "${params.outdir}/alignment/" },
        pattern: "*{bam,bai}",
        // Only save if (save_output_as_bam AND (no_markduplicates OR save_mapped ))
        saveAs: { (params.save_output_as_bam && (params.save_mapped || params.skip_tools && params.skip_tools.split(',').contains('markduplicates'))) ? "mapped/${meta.id}/${it}" : null }
    ]
}
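To make the publishing condition easier to check against a config, here is a minimal Python sketch of what the saveAs closure above appears to compute. The function name `publish_path` and its arguments are hypothetical stand-ins for the closure and the `params` values; the `pattern` filter is not modeled. Note that the result is `None` (nothing published) whenever `save_output_as_bam` is false, regardless of `save_mapped`.

```python
from typing import Optional


def publish_path(filename: str, sample_id: str, save_output_as_bam: bool,
                 save_mapped: bool, skip_tools: Optional[str] = None) -> Optional[str]:
    """Hypothetical model of the saveAs closure: a relative path, or None to skip."""
    # In Groovy, && binds tighter than ||, so the inner test reads as
    # save_mapped || (skip_tools && skip_tools contains 'markduplicates').
    markduplicates_skipped = bool(skip_tools) and "markduplicates" in skip_tools.split(",")
    if save_output_as_bam and (save_mapped or markduplicates_skipped):
        # The path is relative to "${params.outdir}/alignment/".
        return f"mapped/{sample_id}/{filename}"
    return None


# save_mapped alone is not enough when save_output_as_bam is false.
print(publish_path("S1.bam", "S1", save_output_as_bam=True, save_mapped=True))
print(publish_path("S1.bam", "S1", save_output_as_bam=False, save_mapped=True))
```

If this reading is right, two things are worth checking in the run config: whether `save_output_as_bam` is set, and whether the files landed in alignment/mapped/<sample id>/ rather than directly in alignment/.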

The only thing I can think of is that this process was never run, which means the merged bams were never produced.

However, if you found the merged bam in the work directory, can you please post the path of that directory?

@shihabdider
Contributor

This is the bit of code which is doing the merging:

bam_mapped = bam_mapped
    .map { meta, bam ->
        // Update meta.id to be meta.sample, ditching sample-lane that is not needed anymore
        // Update meta.data_type
        // Remove no longer necessary fields:
        //   read_group: Now in the BAM header
        //   num_lanes: only needed for mapping
        //   size: only needed for mapping
        // Ensure meta.size and meta.num_lanes are integers and handle null values
        int numLanes = (meta.num_lanes != null && meta.num_lanes > 1 ? meta.num_lanes : 1) as int
        int numSplits = (meta.size ?: 1) as int
        int numReads = numLanes * numSplits
        // Use groupKey to make sure that the correct group can advance as soon as it is complete
        // and not stall the workflow until all reads from all channels are mapped
        [ groupKey( meta - meta.subMap('num_lanes', 'read_group', 'size') + [ id:meta.sample ], numReads), bam ]
    }
    .groupTuple()

// bams are merged (when multiple lanes from the same sample) and indexed
BAM_MERGE_INDEX_SAMTOOLS(bam_mapped)

My suspicion is that the lanes aren't being interpreted correctly so the grouping (which is necessary to trigger merging) never happened.
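To illustrate how a wrong lane count could stall the grouping, here is a rough Python model of the snippet above. `expected_group_size` mirrors the `numLanes * numSplits` arithmetic; `group_when_complete` is a hypothetical stand-in for `groupKey` plus `groupTuple`, and it assumes a group is only released once it holds exactly the expected number of BAMs. How Nextflow treats groups that never fill up is an assumption of this sketch, not something confirmed in this thread.

```python
from collections import defaultdict


def expected_group_size(meta: dict) -> int:
    """Mirror of numReads: lanes (at least 1) times splits (at least 1)."""
    num_lanes = meta.get("num_lanes")
    lanes = int(num_lanes) if num_lanes is not None and num_lanes > 1 else 1
    splits = int(meta.get("size") or 1)  # like Groovy's `meta.size ?: 1`
    return lanes * splits


def group_when_complete(items):
    """Group (meta, bam) pairs by sample; release a group once it is full.

    Returns (emitted, pending), where pending holds groups still waiting.
    """
    pending = defaultdict(list)
    emitted = []
    for meta, bam in items:
        sample = meta["sample"]
        pending[sample].append(bam)
        if len(pending[sample]) == expected_group_size(meta):
            emitted.append((sample, pending.pop(sample)))
    return emitted, dict(pending)


# Hypothetical samples: S1 has both lanes, S2 claims two lanes but has one BAM.
items = [
    ({"sample": "S1", "num_lanes": 2, "size": 1}, "S1-L1.bam"),
    ({"sample": "S1", "num_lanes": 2, "size": 1}, "S1-L2.bam"),
    ({"sample": "S2", "num_lanes": 2, "size": None}, "S2-L1.bam"),
]
emitted, pending = group_when_complete(items)
print("emitted:", emitted)
print("pending:", pending)
```

In this model S2 never reaches the merge step because its expected size (2) exceeds the number of BAMs that arrive. Comparing `num_lanes` and `size` in the samplesheet metadata against the number of mapped BAMs per sample would be one way to test the suspicion.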
