Chunk `TAX_IDTAXA` input to speed up taxonomic assignment #16
Labels: rework (Redoing or refining something)
A big bottleneck at the moment with larger datasets is `TAX_IDTAXA`, which can only run on a single core. The time it requires scales with the combination of the number of input sequences and the size of the reference database.

One way around this could be to split the input sequence table into chunks of fixed size (perhaps 100 sequences?), run each chunk through `TAX_IDTAXA`, and combine the results at the end. This would require new `TAX_CHUNK` (?) and `TAX_COMBINE` (?) modules to bookend the existing module.
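
To make the proposal concrete, here is a rough Python sketch of the chunk-and-combine logic. It is only an illustration, not how the pipeline modules would necessarily be written: `chunk_records`, `combine_results`, and `assign_in_chunks` are hypothetical names, the `assign` callable is a stand-in for whatever `TAX_IDTAXA` does to one chunk, and sequences are assumed to be simple `(id, sequence)` pairs.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Tuple

Record = Tuple[str, str]      # (sequence ID, sequence); assumed input shape
Assignment = Tuple[str, str]  # (sequence ID, taxonomy string); assumed output shape


def chunk_records(records: Iterable[Record], chunk_size: int = 100) -> Iterator[List[Record]]:
    """Yield consecutive lists of at most `chunk_size` records (the TAX_CHUNK idea)."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def combine_results(chunk_results: Iterable[List[Assignment]]) -> List[Assignment]:
    """Concatenate per-chunk assignments in chunk order (the TAX_COMBINE idea)."""
    combined: List[Assignment] = []
    for result in chunk_results:
        combined.extend(result)
    return combined


def assign_in_chunks(
    records: Iterable[Record],
    assign: Callable[[List[Record]], List[Assignment]],
    chunk_size: int = 100,
) -> List[Assignment]:
    """Run a placeholder assignment step on each chunk, then merge the outputs.

    Chunks are processed one after another here; in the pipeline each chunk
    would be an independent task that could run in parallel.
    """
    return combine_results(assign(chunk) for chunk in chunk_records(records, chunk_size))
```

With 250 input sequences and a chunk size of 100, this produces three chunks (100, 100, and 50 sequences). Because each sequence is classified independently of the others in its chunk, the combined output should match a single unchunked run, though that is worth verifying against the real module.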