Scientific notation in abundance file results in rounding errors #39
This is clearly a bug. I'll look into it.
Thanks André. Much appreciated.
The latest release should fix this issue.
@donovan-h-parks: could you check whether this fixes your outputs?
Hi @muellan, I'm still encountering this issue in v2.4.2, e.g.:
I run MetaCache as follows:
MetaCache version details:
Hi. We've run into a small issue that we are hoping can be fixed in the next release. The abundance profile produced with the -abundances flag reports pair counts in scientific notation when numbers get large, e.g.:

This can result in small errors due to rounding. For example, in this case there are really 1050675 Bacterial read pairs, but the count gets rounded up to 1050680. While having 5 extra read pairs is minor in terms of the resulting abundance estimates, it makes it challenging to track the fate of all reads. In our code, we have a check that the number of input reads is equal to the number of reads in the MetaCache abundance profile (including unclassified). This is just a unit test to ensure our parsing is correct and that no reads are lost during any manipulation of data but, more generally, not being able to account for all reads is a bit scary.
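To make the arithmetic concrete, here is a minimal Python sketch of how a count of 1050675 can turn into 1050680. It assumes the count is written with general ("%g"-style) formatting at 6 significant digits, which is a guess at the mechanism and not something confirmed from the MetaCache source. The helper `reads_accounted_for` is hypothetical and only stands in for the kind of read-conservation check described above.

```python
# Illustration only; this is not MetaCache code.
# Assumption: the count is printed with "%g"-style formatting,
# which keeps 6 significant digits by default.

true_pairs = 1050675

# Format the count the way a 6-significant-digit general format would.
printed = format(float(true_pairs), "g")

# Parse the printed value back, as a downstream script would.
parsed = int(float(printed))


def reads_accounted_for(expected_total, profile_counts):
    """Hypothetical check: every input read appears in the profile."""
    return sum(profile_counts) == expected_total


print(printed)                                    # 1.05068e+06
print(parsed - true_pairs)                        # 5
print(reads_accounted_for(true_pairs, [parsed]))  # False
```

Writing the count as an integer (for example `str(true_pairs)`) round-trips exactly, so the same check passes.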
I imagine the intent is for this profile to produce integers, so I am hoping this can be fixed in the next release. Thanks.