Hello, thanks for this great tool.
Just a question:
I wonder how to select the appropriate reference for a set of (diverse) genomes.
When I run referenceseeker in this case, it gives a different reference for each genome.
Hi @MostafaYA,
thanks for this excellent question! This is indeed an interesting use case, and we have already started working on a solution. However, this will still take a while. Maybe we can provide something by the end of this year.
@oschwengers any update on that work? I'm wondering what the best approach would be here. Two passes: a first one that finds all candidates for all samples, and a second one that computes the distance from every sample to each of these candidates and picks the candidate with the lowest average distance?
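To make the second pass concrete, here is a minimal sketch of the selection step only. It assumes the distances from the first pass are already collected in a plain dictionary; the function name `pick_shared_reference`, the data layout, and all values are made up for illustration and are not part of referenceseeker.

```python
from statistics import mean


def pick_shared_reference(distances):
    """Return the candidate with the lowest mean distance to all queries.

    distances: {candidate: {query: distance}} (hypothetical layout).
    """
    # Collect every query that appears for at least one candidate.
    queries = set().union(*(d.keys() for d in distances.values()))
    # Keep only candidates that have a distance to every query,
    # so that the averages are comparable.
    complete = {c: d for c, d in distances.items() if set(d) == queries}
    if not complete:
        raise ValueError("no candidate has a distance to every query")
    return min(complete, key=lambda c: mean(complete[c].values()))


# Made-up distances, for illustration only.
distances = {
    "ref_A": {"q1": 0.010, "q2": 0.030, "q3": 0.020},
    "ref_B": {"q1": 0.005, "q2": 0.060, "q3": 0.040},
    "ref_C": {"q1": 0.020, "q2": 0.020, "q3": 0.025},
}
best = pick_shared_reference(distances)
print(best)
```

With these numbers `ref_B` is the closest match for `q1` alone, but `ref_A` has the lowest average over all three samples.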
Thanks @pvanheus for bringing this up again. Actually, this just slipped down my priority list. But if there is still a need for and interest in that, I would try to work on this as a side-side project. Unfortunately, I cannot make any reliable commitments to this right now.
Regarding the workflow: right, as you mentioned. First we have to calculate approximate genome distances (for instance with Mash) as a rough estimate to select reference candidates. Then we have to compute ANI between all query genomes and reference candidates, and then rank and select these references. The main task we tried to work on is how to best rank the reference genomes, since the ANI can of course differ a lot between a reference and the given query genomes. How to handle harsh outliers, for example? As a simple approach, we played around with the classic arithmetic/geometric/harmonic means...
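As a small illustration of why the choice of mean matters for outliers, here is a sketch using only Python's standard `statistics` module. The helper `rank_references`, the data layout, and the ANI values are all made up; this is not how referenceseeker ranks anything.

```python
from statistics import fmean, geometric_mean, harmonic_mean


def rank_references(ani, score=fmean):
    """Sort references from best to worst by an aggregate ANI score.

    ani: {reference: [ANI to each query genome, as a fraction in (0, 1]]}
    score: any function that maps a list of ANI values to one number.
    """
    return sorted(ani, key=lambda ref: score(ani[ref]), reverse=True)


# Made-up ANI values, for illustration only.
ani = {
    "ref_A": [0.99, 0.99, 0.80],  # very close to two queries, one harsh outlier
    "ref_B": [0.92, 0.92, 0.92],  # moderately close to every query
}

for name, score in [
    ("arithmetic", fmean),
    ("geometric", geometric_mean),
    ("harmonic", harmonic_mean),
    ("worst case", min),
]:
    print(name, rank_references(ani, score))
```

In this toy case the arithmetic and geometric means still prefer `ref_A`, while the harmonic mean and the worst-case score prefer `ref_B`, because they penalise the single distant query more strongly.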