Certain job metrics are not being stored correctly in the database. This makes it more difficult to investigate system performance questions like "how many people would be affected if we put a limit on number of SNPs submitted".
The data could eventually be recovered from job logs, but that's less convenient. It's not an urgent fix, but definitely a "gotcha".
Tracking in case this surprises anyone else!
Description / root cause
A counter value like "genotypes" is calculated by multiplying two large ints, like "genotypes * samples". The true product is bigger than the maximum value a signed 32-bit Java int can hold (2147483647).
Java's int arithmetic silently wraps on overflow, so the result comes out as a much smaller (or even negative) number.
The correct values are shown in the UI / job logs (which are stored separately as a pre-constructed text string), but the wrong value is stored in the DB table.
This affects both the initial calculation and the incCounters method (which accepts an int).
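A minimal sketch of the failure mode and the standard fix (this is illustrative, not the actual application code; the variable names are assumptions based on the example below):

```java
public class OverflowDemo {
    public static void main(String[] args) {
        int snps = 2_500_000;
        int samples = 15_000;

        // int * int is evaluated in 32-bit arithmetic; the true product
        // (3.75e10) exceeds Integer.MAX_VALUE (2147483647) and wraps.
        int wrapped = snps * samples;

        // Casting one operand to long forces 64-bit arithmetic, which
        // easily holds the true product.
        long correct = (long) snps * samples;

        System.out.println(wrapped); // prints -1154705664 (wrapped garbage)
        System.out.println(correct); // prints 37500000000
    }
}
```

Note that the cast has to happen on an operand, not the result: `(long) (snps * samples)` still overflows first and merely widens the garbage.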
Example
A recent job submitted 2.5M SNPs with 15k samples (3.75e10 genotypes). The Java max value for an int is ~2.1B, so the stored value wraps to ~3e6. The correct # of SNPs and samples are shown in the job report (where they are represented separately), but those numbers do not match the value stored in the database (where they are multiplied together).
In practice, this is usually not obvious until one needs to query for big jobs. A subtler symptom: in TIS, ~10% of "genotypes" counters are negative.
select count(*) from counters where name = 'genotypes' and value < 0;
Note: the MySQL table definition already supports larger numbers (counters.value is a bigint column), so no schema change is needed. The issue appears to be in the Java code.
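A possible fix sketch: widen the counter path to long end-to-end so values survive until they reach the bigint column. Only the incCounters name comes from this issue; the surrounding class and method signatures here are assumptions for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical counter holder; the real class likely differs.
public class Counters {
    private final Map<String, Long> values = new HashMap<>();

    // Accept long instead of int so products like SNPs * samples
    // are not truncated at this method boundary.
    public void incCounters(String name, long delta) {
        values.merge(name, delta, Long::sum);
    }

    public long get(String name) {
        return values.getOrDefault(name, 0L);
    }
}
```

Callers would also need the cast at the multiplication site, e.g. `counters.incCounters("genotypes", (long) snps * samples);` — widening only the method parameter doesn't help if the product has already wrapped in int arithmetic.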