Fix join barcode step for Opossum #1397
…mosomes larger than 512 Mbp.
Remember to squash merge! |
🔍Changelog Validation Results:
|
🔍Version Validation Results:
|
@@ -1,3 +1,7 @@
# 5.7.2
2024-10-21 (Date of Last Commit)
* Changed a flag in H5adUtils.wdl to se CSI instead of TBI indexing in tabix command to support chromosomes larger than 512 Mbp.
* Changed a flag in H5adUtils.wdl to se CSI instead of TBI indexing in tabix command to support chromosomes larger than 512 Mbp.
* Changed a flag in H5adUtils.wdl to set CSI instead of TBI indexing in tabix command to support chromosomes larger than 512 Mbp.
…warp into pd-2747-multiome-opossum
Thank you!
We need to grab the correct output
The h5ad files look correct; I have a ticket in the next sprint to fix the Verify task that compares them so it handles the non-determinism better.
changed JoinBarcodes task index file name
Co-authored-by: Elizabeth Kiernan <[email protected]>
Fix tabix indexing for large chromosomes
Issue
The previous tabix command failed because the TBI (tabix) index format cannot
represent coordinates beyond 512 Mbp (2^29 bp), which large chromosomes
exceed. The error message was:
"Region 536870658..536870963 cannot be stored in a tbi index. Try using a csi index"
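The 512 Mbp figure corresponds to 2^29 = 536870912 bp, and the region in the
error message straddles that value. A minimal sketch of the check follows. The
helper name tbi_can_index is hypothetical and not part of the pipeline, and
treating the boundary value itself as in range is an assumption about tabix's
off-by-one handling.

```shell
# Upper coordinate bound of a TBI index: 2^29 bp (512 Mbp).
TBI_MAX=$((1 << 29))

# tbi_can_index END: print "yes" if END is within the TBI bound, else "no".
# Hypothetical helper for illustration only.
tbi_can_index() {
  local end=$1
  if [ "$end" -le "$TBI_MAX" ]; then
    echo "yes"
  else
    echo "no"
  fi
}

tbi_can_index 536870658   # start of the failing region: prints "yes"
tbi_can_index 536870963   # end of the failing region: prints "no"
```

The start coordinate is still below the bound, but the end coordinate is past
it, which is why tabix rejects the record.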
Fix
Modified the tabix command to use CSI (Coordinate-Sorted Index) instead of TBI.
CSI can handle much larger chromosome sizes.
Changed from:
tabix -s 1 -b 2 -e 3 "${atac_fragment_base}.sorted.tsv.gz"
To:
tabix -s 1 -b 2 -e 3 -C "${atac_fragment_base}.sorted.tsv.gz"
The -C flag tells tabix to create a CSI index instead of the default TBI
index, so the index is written as a .csi file rather than a .tbi file. The
-s, -b, and -e flags still identify the sequence name, start, and end columns
of our file.
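Because the index extension changes with the format, anything that consumes
the index has to look for the new file name. The sketch below shows one way a
wrapper could pick the format from the longest chromosome length. This is an
illustration only: the commit itself simply passes -C unconditionally, the
helper name tabix_plan is hypothetical, and the file name and lengths are
made-up values. It prints the command and the expected index path rather than
running tabix.

```shell
# Upper coordinate bound of a TBI index: 2^29 bp (512 Mbp).
TBI_MAX=$((1 << 29))

# tabix_plan FILE MAX_LEN: print the tabix command (line 1) and the index
# path tabix would write (line 2). Hypothetical helper for illustration only.
tabix_plan() {
  local file=$1
  local max_len=$2
  if [ "$max_len" -gt "$TBI_MAX" ]; then
    # Longest chromosome exceeds the TBI bound: request a CSI index.
    echo "tabix -s 1 -b 2 -e 3 -C \"$file\""
    echo "$file.csi"
  else
    # TBI is sufficient: default index format.
    echo "tabix -s 1 -b 2 -e 3 \"$file\""
    echo "$file.tbi"
  fi
}

# Illustrative values: a fragment file and a 600 Mbp longest chromosome.
tabix_plan "fragments.sorted.tsv.gz" 600000000
```

Always using -C, as the commit does, avoids needing to know chromosome lengths
up front, at the cost of requiring downstream tools to accept .csi indexes.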
Impact
This change allows successful indexing of ATAC fragment files from genomes
with chromosomes longer than 512 Mbp, such as opossum. The pipeline no longer
fails at the indexing step for these genomes and is compatible with a wider
range of reference genomes.