PD-2435 Test bwa-mem2 step and run Intel distributed BWA-MEM2 #1147
Remember to squash merge! |
Thanks, @aawdeh! This is great!
pipelines/skylab/multiome/atac.wdl
Outdated
  String output_base_name
- String docker_image = "us.gcr.io/broad-gotc-prod/samtools-bwa-mem-2:1.0.0-2.2.1_x64-linux-1685469504"
+ String docker_image = "us.gcr.io/broad-gotc-prod/samtools-dist-bwa:aa-dist-bwa-"
Can we pin this to an official version?
I think it may be best to pin the official version with a number instead of the branch name -- just in case it changes. I am rebuilding the docker image with the tag 1.0.0. I'll change that in the wdl to reflect that.
pipelines/skylab/multiome/atac.wdl
Outdated
# move bam file to /cromwell_root
mv ~{bam_aligned_output_name} /cromwell_root
ls /cromwell_root
could probably remove this ls command
* update paths to input files * update jg sample map * Km buildindices docs (#1158) * add buildindices overview doc and diagram * Km rnawithumis and ss2 doc updates (#1157) * update rnawithumis overview * Update rna-with-umis.methods.md * Update rna-with-umis.methods.md * update multi-snSS2 readme * Update multi_snss2.methods.md * Update multi_snss2.methods.md * update multi-snSS2 docs * update SS2 overview doc * fix python script link * Lk pd2448 upstools (#1150) Added paired-tag wrapper and demultiplexing task * PD-2435 Test bwa-mem2 step and run Intel distributed BWA-MEM2 (#1147) * Lk pd2453 add bb tag (#1161) Added option to incorporate BB tag in BAM and use it in SnapATAC2 software. * km paired-tag docs (#1165) * update overview docs - update pipeline version numbers in Multiome and Optimus Overview docs * update multi-SS2 overview doc * Update smart-seq2.methods.md * update multi-SS2 methods doc * Update doc_style.md * add paired-tag overview doc * Update website/docs/Pipelines/PairedTag_Pipeline/README.md Co-authored-by: ekiernan <[email protected]> * Apply suggestions from LK doc review Co-authored-by: ekiernan <[email protected]> --------- Co-authored-by: ekiernan <[email protected]> * Np jprb pd 2353 multiple star rsolo align (#1164) * integrate multiple soloFeatures * updating counting_mode definition * logic * count exons is true * count exons is true * fix the logic * count exons false * count exons false * echos * echos * echos * rearrange logic * rearrange * testing * testing * testing * count exons true * switch counting mode order * try running in scrna * clean up * snrna countexons is true * snrna countexons is true * snrna countexons is false * snrna countexons is true * snrna countexons is false * cleaning up * changelogs * changelogs * change cpu_platform to Intel Cascade Lake for sci test input * change cpu_platform to Intel Cascade Lake for sci test input * Update 
pipelines/skylab/smartseq2_single_nucleus_multisample/MultiSampleSmartSeq2SingleNucleus.changelog.md Co-authored-by: ekiernan <[email protected]> * Update pipelines/skylab/multiome/Multiome.changelog.md Co-authored-by: ekiernan <[email protected]> * Update pipelines/skylab/optimus/Optimus.changelog.md Co-authored-by: ekiernan <[email protected]> * Update pipelines/skylab/paired_tag/PairedTag.changelog.md Co-authored-by: ekiernan <[email protected]> * Update pipelines/skylab/slideseq/SlideSeq.changelog.md Co-authored-by: ekiernan <[email protected]> --------- Co-authored-by: Juan Pablo Ramos Barroso <[email protected]> Co-authored-by: ekiernan <[email protected]> * Np update multiome sci test (#1167) * add summary task * change cpu_platform to Intel Cascade Lake for sci test input * change cpu_platform to Intel Cascade Lake for sci test input * change cpu_platform to Intel Cascade Lake for sci test input * Update VerifyTasks.wdl * Update VerifyTasks.wdl * PD-2422 BICAN_Optimus_2nymxis_Oct_2023 (#1152) * Np multimapper param starsolo (#1172) * add summary task * add multimapper option * update optimus plumbing for ease of testing * add echos * add to test * remove some echoes * make mouse snrna json go back to what is in dev * make mouse snrna json go back to what is in dev * add as outputs * typo * changelogs * changelogs * changelogs * update pipeline docs * optional output * optional output * optional output * optional output * docs * docs * docs * Update website/docs/Pipelines/Optimus_Pipeline/README.md Co-authored-by: Kaylee Mathews <[email protected]> * remove optional input to tests --------- Co-authored-by: kayleemathews <[email protected]> Co-authored-by: Kaylee Mathews <[email protected]> * added exit code to CompareTabix (#1174) Updated the CompareTabix task in the Verify tasks * Lk pd 2452 add read length check (#1171) * adding read2 length and barcode orientation check task * Lk pd 2464 batch methylome (#1181) Added scatter and preemptibles to snM3C * Np 
edit resources needed for bwa task / add logic to compareBams (#1183) * add summary task * get a bigger machine * more memory * trying out east1 * go back to central and decrease mem and threads * try different zones * make machine smaller * smaller cpu * smaller cpu * more mem * more mem * 2000 disk * more mem compare bams * more mem compare bams * more mem compare bams * no zones * more mem in comparebams * more mem in comparebams * 725000 * 825000 * 725 * max records 3000000 * max records 3000000 * add logic to fail fast if bams differ in size by 200 mb * Update VerifyTasks.wdl * PD-2483 (#1182) * rc-2483 * update changelog * Update README.md --------- Co-authored-by: kayleemathews <[email protected]> Co-authored-by: ekiernan <[email protected]> * PD-2476: Add task before fastqprocess to find number of splits (#1178) * removed space (#1184) Removed space from Verify Tabix task * Lk pd 2455 pairedtag parsebarcodes (#1186) Added a task to parse cell barcodes from sample barcodes * Np move snm3c from beta pipelines (#1185) * add summary task * move CondensedSnm3C.wdl out of beta * update batch numbers * add sorting to the compare compressed text files * changelogs and versions * revert batch change * batch change * batch change * Km snm3c overview doc (#1179) * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update website/docs/Pipelines/snM3C/README.md Co-authored-by: Nikelle Petrillo <[email protected]> * Update README.md * fix num_downstr_bases description * Update README.md --------- Co-authored-by: Nikelle Petrillo <[email protected]> * Np fix snm3c test (#1190) * add summary task * batch change in test wdl --------- Co-authored-by: Kaylee Mathews <[email protected]> Co-authored-by: ekiernan <[email protected]> Co-authored-by: aawdeh <[email protected]> Co-authored-by: Nikelle Petrillo <[email protected]> Co-authored-by: Juan Pablo Ramos Barroso <[email protected]> Co-authored-by: kayleemathews <[email protected]> 
Co-authored-by: Robert Sidney Cox III <[email protected]>
This refers to ticket: https://broadworkbench.atlassian.net/browse/PD-2435
This PR tests running Intel's distributed bwa-mem2 code in Terra. The code can be found here: https://github.com/IntelLabs/Open-Omics-Acceleration-Framework
Updated docker of this repo can be found here:
us.gcr.io/broad-gotc-prod/samtools-dist-bwa:1.0.0
List of updates for PR:
* num_output_files as an input parameter in the ATAC wdl. Set this parameter to 4.
* MergedBAM and moved BWAPairedEndAlignment outside of the for loop.
* BWAPairedEndAlignment task changes include:
  (a) An array of fastq R1 and R3 files is now taken as input.
  (b) Updated docker for distributed bwa-mem2 code is now used.
  (c) Intel Ice Lake 512GB machine used.
  (d) Configuration file is set -- this includes input parameters such as R1, R3, Ref, etc.
  (e) Added additional output parameter to include the logs of the multiple bwa-mem2 runs.
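For readers who want a picture of how items (a) through (e) could fit together, here is a minimal WDL sketch. It is not the task from atac.wdl. The task name, the input names other than output_base_name and num_output_files, the config file name and its key names, the log file pattern, and the cpu and disk values are all assumptions made for illustration. The actual distributed bwa-mem2 invocation is left out because its command line is not shown in this PR.

```wdl
version 1.0

# Hypothetical sketch only; names and values marked "assumed" do not come from the PR.
task BWAPairedEndAlignmentSketch {
  input {
    Array[File] read1_fastq      # (a) array of R1 fastq files (input name assumed)
    Array[File] read3_fastq      # (a) array of R3 fastq files (input name assumed)
    File reference               # reference input (name and type assumed)
    String output_base_name
    Int num_output_files = 4     # the PR sets this parameter to 4 in the ATAC wdl

    # (b) docker pinned to a numbered tag rather than a branch name
    String docker_image = "us.gcr.io/broad-gotc-prod/samtools-dist-bwa:1.0.0"

    # (c) Intel Ice Lake machine with 512 GB of memory
    String cpu_platform = "Intel Ice Lake"
    Int mem_gb = 512
    Int cpu = 64                 # assumed value
    Int disk_gb = 2000           # assumed value
  }

  command <<<
    set -euo pipefail

    # (d) write a configuration file listing the inputs.
    # The file name and key names below are assumed, not the real format.
    echo "R1=~{sep=' ' read1_fastq}" > bwa.config
    echo "R3=~{sep=' ' read3_fastq}" >> bwa.config
    echo "REF=~{reference}" >> bwa.config
    echo "OUTPUT_PREFIX=~{output_base_name}" >> bwa.config
    echo "NUM_OUTPUT_FILES=~{num_output_files}" >> bwa.config

    # The distributed bwa-mem2 run would go here, reading bwa.config.
  >>>

  runtime {
    docker: docker_image
    cpuPlatform: cpu_platform
    cpu: cpu
    memory: "~{mem_gb} GiB"
    disks: "local-disk ~{disk_gb} HDD"
  }

  output {
    File config_file = "bwa.config"
    # (e) logs from the multiple bwa-mem2 runs (file pattern assumed)
    Array[File] bwa_logs = glob("*.log")
  }
}
```

Exposing the docker image and CPU platform as inputs with defaults keeps the pinned values visible in one place while still allowing a test run to override them.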
Checklist
If you can answer "yes" to the following items, please add a checkmark next to the appropriate checklist item(s) and notify our WARP documentation team by tagging either @ekiernan or @kayleemathews in a comment on this PR.