Memory issues with WFA2 and BiWFA while processing millions of paired alignments #97

Open
GSbioinfo opened this issue Jul 13, 2024 · 0 comments

Comments

@GSbioinfo

Hi WFA2 and BiWFA team,

Thank you for making these fast and efficient libraries available for everyone.
I am trying to use the library in my project. It works for small datasets of 2-3 million reads, but with larger datasets of more than 5 million reads it throws a segmentation fault. The system has 128 GB of RAM. After testing different 'attributes.memory_mode' settings, I found that BiWFA runs out of memory after a certain number of runs and then throws a segmentation fault.

The project I am working on involves pairwise comparison of n million DNA queries (Illumina reads) against m different reference sequences (amplicons). I am calling the following function:

std::string nw_function(std::string refseq, std::string query){
    char *pattern;
    char *text;
    pattern = &refseq[0];
    text = &query[0];
    // Configure alignment attributes
    wavefront_aligner_attr_t attributes = wavefront_aligner_attr_default;
    attributes.distance_metric = gap_affine;
    attributes.alignment_form.span = alignment_end2end;
    attributes.affine_penalties.match = 0;
    attributes.affine_penalties.mismatch = 4;
    attributes.affine_penalties.gap_opening = 20;
    attributes.affine_penalties.gap_extension = 2;
    attributes.memory_mode = wavefront_memory_ultralow;
    // Initialize Wavefront Aligner (a new aligner is created on every call)
    wavefront_aligner_t* const wf_aligner = wavefront_aligner_new(&attributes);
    // Align
    wavefront_bialign(wf_aligner,pattern,refseq.length(),text,refseq.length());
    std::string mycig = get_cigar_string(wf_aligner->cigar,true);
    // Free
    wavefront_aligner_delete(wf_aligner);
    return mycig;
}
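
One detail in the call above may be worth checking: `refseq.length()` is passed for both length arguments. Assuming the fifth argument is the text length (as it is in `wavefront_align`), a query shorter than the reference would be read past the end of its buffer. Below is a minimal sketch of keeping each pointer paired with the length of the same string. `SeqView` and `make_view` are hypothetical helpers for illustration, not part of WFA2-lib.

```cpp
#include <cassert>
#include <climits>
#include <string>

// Hypothetical helper (not part of WFA2-lib): keeps a sequence pointer
// together with the length of the very same string.
struct SeqView {
    const char* data;
    int length;
};

// Build a view from a std::string. The string must outlive the view.
inline SeqView make_view(const std::string& s) {
    // Guard the narrowing conversion to the int length used by C-style APIs.
    assert(s.size() <= static_cast<std::size_t>(INT_MAX));
    return SeqView{s.data(), static_cast<int>(s.size())};
}
```

With views built this way, the aligner call would take `pattern.data, pattern.length, text.data, text.length`, so the two lengths can no longer be mixed up.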

I tried the WFA library first and hit the same issue at a much lower read count. For this reason I moved to BiWFA, which significantly increased the number of reads I can process, but not enough to solve the problem. I was hoping you could help identify a solution. Could you give me some idea of which parameters I would need to change so that BiWFA does not run out of memory?
I greatly appreciate your help.
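
For reference, one variation I could try is creating the aligner once and reusing it for every (reference, query) pair, instead of creating and deleting it inside each call. Whether per-call construction contributes to the memory growth is not established here. The sketch below only illustrates the ownership pattern: `StubAligner`, `stub_aligner_new`, `stub_aligner_delete` and `stub_align` are hypothetical stand-ins so that it compiles without WFA2-lib, and a real version would hold a `wavefront_aligner_t*` released by `wavefront_aligner_delete`.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for an aligner handle (not the WFA2-lib type).
struct StubAligner {
    int alignments_done = 0;
};

// Counters that record how many handles were created and destroyed.
inline int& created_count() { static int n = 0; return n; }
inline int& deleted_count() { static int n = 0; return n; }

inline StubAligner* stub_aligner_new() {
    ++created_count();
    return new StubAligner();
}

inline void stub_aligner_delete(StubAligner* a) {
    ++deleted_count();
    delete a;
}

// Deleter so std::unique_ptr releases the handle exactly once.
struct StubDeleter {
    void operator()(StubAligner* a) const { stub_aligner_delete(a); }
};
using AlignerPtr = std::unique_ptr<StubAligner, StubDeleter>;

// Placeholder "alignment": records the call and returns the two lengths.
inline std::string stub_align(StubAligner& a, const std::string& ref,
                              const std::string& query) {
    ++a.alignments_done;
    return std::to_string(ref.size()) + "x" + std::to_string(query.size());
}

// Align every query against every reference using ONE handle for the
// whole loop; the handle is freed when `aligner` goes out of scope.
inline std::vector<std::string> align_all(const std::vector<std::string>& refs,
                                          const std::vector<std::string>& queries) {
    AlignerPtr aligner(stub_aligner_new());
    std::vector<std::string> out;
    out.reserve(refs.size() * queries.size());
    for (const auto& q : queries) {
        for (const auto& r : refs) {
            out.push_back(stub_align(*aligner, r, q));
        }
    }
    return out;
}
```

With millions of reads, the results would likely be written out as they are produced instead of being collected in one vector as this small sketch does.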
