Exploring removing paths from the pandora index and on-demand loading of PRGs #309

Open
leoisl opened this issue Nov 17, 2022 · 0 comments
Collaborator

leoisl commented Nov 17, 2022

Removing paths from pandora indexes does not reduce RAM usage by much:

  • Commit: 6569d90
  • RAM usage reduction: from 15.7GB to 13.4GB (15% reduction);
  • Results are not identical either. They are hard to compare, but a quick comparison shows this:
    • DB size: 188k PRGs
    • Nb samples: 114
    • Nb of PRGs in final pandora_multisample.matrix BEFORE optimisation: 13716
    • Nb of PRGs in final pandora_multisample.matrix AFTER optimisation: 13756
    • Nb of common lines in pandora_multisample.matrix (i.e. where both versions found same presence/absence pattern): 9356
    • Nb of diff lines in pandora_multisample.matrix: 4400
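
For anyone wanting to repeat this kind of check, here is a minimal sketch of a line-level comparison between two matrix files. It is not necessarily how the numbers above were produced. It assumes the first line of each file is a header, that each remaining line is one PRG followed by its presence/absence values, and that lines are unique within a file. The function names are made up for illustration.

```python
from pathlib import Path


def compare_matrix_lines(before_lines, after_lines):
    """Return (common, only_before, only_after) counts of data lines.

    The first line of each input is treated as a header and skipped
    (an assumption about the matrix layout).
    """
    before = set(before_lines[1:])
    after = set(after_lines[1:])
    common = before & after
    return len(common), len(before - common), len(after - common)


def compare_matrix_files(before_path, after_path):
    """Read two matrix files and compare them line by line."""
    before_lines = Path(before_path).read_text().splitlines()
    after_lines = Path(after_path).read_text().splitlines()
    return compare_matrix_lines(before_lines, after_lines)
```

A line counts as common only if the PRG name and every per-sample value match exactly, so a PRG whose pattern changed for a single sample shows up once in each of the two "only" counts.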

I don't think it is worth trying to remove paths from pandora indexes right now. Note that only about 13k genes were mapped to, out of 188k. On-demand PRG loading is where we will really save RAM. Maybe later we can remove paths from the index.
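
To illustrate the on-demand idea in general terms, here is a rough sketch of a store that loads a PRG only the first time it is requested and keeps it cached afterwards. This is not pandora code: `LazyPRGStore` and the `loader` callable are hypothetical names, and how a single PRG would actually be read from disk is left to the caller.

```python
class LazyPRGStore:
    """Load PRGs on first access instead of loading the whole DB up front.

    `loader` is a hypothetical callable taking a PRG id and returning the
    loaded PRG object; it stands in for whatever reads one PRG from disk.
    """

    def __init__(self, loader):
        self._loader = loader
        self._cache = {}

    def get(self, prg_id):
        # Only PRGs that are actually requested ever get loaded.
        if prg_id not in self._cache:
            self._cache[prg_id] = self._loader(prg_id)
        return self._cache[prg_id]

    def __len__(self):
        # Number of PRGs currently held in memory.
        return len(self._cache)
```

With a store like this, memory would scale with the number of PRGs that get hits (about 13k here) rather than with the full 188k in the DB, assuming each PRG can be loaded independently.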
