Memory issues with WFA2 and BiWFA while processing millions of paired alignments #97

GSbioinfo · 2024-07-13T22:19:54Z

Hi WFA2 and BiWFA team,

Thank you for making these fast and efficient libraries available for everyone.
I am trying to use the library in my project. It works for small datasets of 2-3 million reads, but with larger datasets of more than 5 million reads it throws a segmentation fault. The system has 128 GB of RAM. After testing different 'attributes.memory_mode' settings, I found that BiWFA runs out of memory after processing a certain number of alignments and then throws a segmentation fault.

The project I am working on involves pairwise comparison of n million DNA queries (Illumina reads) against m different reference sequences (amplicons). For every pair I call the following function:

std::string nw_function(std::string refseq, std::string query){
    char *pattern;
    char *text;
    pattern = &refseq[0];
    text = &query[0];
    // Configure alignment attributes
    wavefront_aligner_attr_t attributes = wavefront_aligner_attr_default;
    attributes.distance_metric = gap_affine;
    attributes.alignment_form.span = alignment_end2end; // global (end-to-end) alignment
    attributes.affine_penalties.match = 0;
    attributes.affine_penalties.mismatch = 4;
    attributes.affine_penalties.gap_opening = 20;
    attributes.affine_penalties.gap_extension = 2;
    attributes.memory_mode = wavefront_memory_ultralow;
    // Initialize Wavefront Aligner (a new aligner is created on every call)
    wavefront_aligner_t* const wf_aligner = wavefront_aligner_new(&attributes);
    // Align; each sequence is passed with its own length
    wavefront_bialign(wf_aligner,pattern,refseq.length(),text,query.length());
    // get_cigar_string is a helper from my own project
    std::string mycig = get_cigar_string(wf_aligner->cigar,true);
    // Free
    wavefront_aligner_delete(wf_aligner);
    return mycig;
}

I first tried your WFA library and ran into similar issues at a much lower read count. For this reason I moved to BiWFA, which significantly increased the number of reads I could process, but not enough to solve the problem. I was hoping you could help identify a solution. Could you give me some idea of which parameters I would need to modify so that BiWFA does not run out of memory?
I greatly appreciate your help.
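
One thing that may be worth trying, independent of the memory_mode setting: the function above allocates and frees an aligner for every single pair, and takes both strings by value, so each call also copies them. Below is a minimal sketch of the alternative, where one aligner is created up front, reused for every pair, and freed exactly once. Whether the per-call allocation is actually what exhausts memory here is an assumption, not a confirmed diagnosis. To keep the sketch compilable on its own, `fake_aligner_t`, `fake_aligner_new`, `fake_aligner_delete`, `align_pair` and `align_all` are hypothetical stand-ins and not part of WFA2; in real code they would be replaced by `wavefront_aligner_t`, `wavefront_aligner_new`, `wavefront_aligner_delete` and the alignment call from the function above.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for the library's aligner type and its
// create/destroy functions, so this sketch compiles without WFA2.
struct fake_aligner_t { int alignments_done = 0; };
static int live_aligners = 0;   // aligners currently allocated
static int total_created = 0;   // aligners ever allocated

fake_aligner_t* fake_aligner_new() {
    ++live_aligners;
    ++total_created;
    return new fake_aligner_t{};
}

void fake_aligner_delete(fake_aligner_t* aligner) {
    --live_aligners;
    delete aligner;
}

// Custom deleter: the aligner is released exactly once, also on
// early return or when an exception unwinds the stack.
struct AlignerDeleter {
    void operator()(fake_aligner_t* aligner) const { fake_aligner_delete(aligner); }
};
using AlignerPtr = std::unique_ptr<fake_aligner_t, AlignerDeleter>;

// Placeholder for the real alignment call. Sequences are taken by const
// reference, so nothing is copied per call. The returned string is a dummy
// ("<ref length>x<query length>"), not a CIGAR.
std::string align_pair(fake_aligner_t& aligner,
                       const std::string& refseq,
                       const std::string& query) {
    ++aligner.alignments_done;
    return std::to_string(refseq.size()) + "x" + std::to_string(query.size());
}

// Aligns every query against every reference with ONE aligner.
std::vector<std::string> align_all(const std::vector<std::string>& refs,
                                   const std::vector<std::string>& queries) {
    AlignerPtr aligner(fake_aligner_new());  // created once, outside the loops
    std::vector<std::string> results;
    results.reserve(refs.size() * queries.size());
    for (const std::string& query : queries) {
        for (const std::string& ref : refs) {
            results.push_back(align_pair(*aligner, ref, query));
        }
    }
    return results;  // aligner is freed here by AlignerDeleter
}
```

With multiple threads, the same idea would presumably become one aligner per thread rather than one shared aligner, since a single aligner holds per-alignment state. For millions of reads it may also help to write each result out as it is produced instead of collecting all of them in a vector as this sketch does.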
