This issue describes a new feature in pandora: handling huge indexes with only a few GB of RAM. The concrete example we have is an index with 186k PRGs, mostly linear. The main use case has almost 1M PRGs. For this "small" example with 186k PRGs, running `pandora compare` with reads from 114 samples results in only 13.7k genes (7.3%) actually being found and appearing in the final multisample matrix/VCF. pandora takes 15.6 GB of RAM to run `compare` in this case, but could possibly do it with a fraction of that RAM if it loaded the index only for the relevant 13.7k genes instead of all 186k. RAM usage will be much higher for 1M PRGs, and we want to keep this runnable on common user desktops, i.e. at most 13 or 14 GB of usage. For this use case we have a fixed vcf-ref for each PRG, so we could also run `pandora compare` (or `map` in this case) per sample and merge the results later. This feature is particularly important when running `pandora compare`/`map` on a single sample, as even fewer genes will be loaded.
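To make the idea of loading the index only for the relevant genes concrete, here is a minimal Python sketch. It is not pandora's index format or code: the on-disk layout, the offset table, and the names `write_index` and `load_subset` are all hypothetical, and it assumes the set of relevant genes is already known from some cheaper earlier step.

```python
import io
import json


def write_index(prgs, stream):
    """Write one record per PRG and return {name: (byte_offset, byte_length)}.

    Hypothetical layout: records are stored back to back, and a small
    offset table is the only part that must be held in memory.
    """
    offsets = {}
    for name, record in prgs.items():
        payload = json.dumps(record).encode("utf-8")
        offsets[name] = (stream.tell(), len(payload))
        stream.write(payload)
    return offsets


def load_subset(stream, offsets, wanted):
    """Read only the records for the PRGs named in `wanted`."""
    loaded = {}
    for name in wanted:
        start, length = offsets[name]
        stream.seek(start)
        loaded[name] = json.loads(stream.read(length).decode("utf-8"))
    return loaded


# Toy index with 1000 PRGs; the payload stands in for per-PRG index data.
all_prgs = {f"gene{i}": {"minimizers": [i, i + 1, i + 2]} for i in range(1000)}
index_file = io.BytesIO()  # stands in for a file on disk
offsets = write_index(all_prgs, index_file)

# Assumed to come from an earlier, cheaper step that decides which genes matter.
found = ["gene3", "gene42", "gene977"]
subset = load_subset(index_file, offsets, found)
print(len(subset), "of", len(all_prgs), "PRGs loaded")
```

The point of the sketch is that memory then scales with the number of genes actually loaded plus the offset table, not with the total number of PRGs in the index. How pandora would determine the relevant genes without the full index in memory is left open here.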