As the comment at https://github.com/mlcommons/logging/blob/master/mlperf_logging/rcp_checker/rcp_checker.py#L246 says, the loop does not correctly implement the "lower convex envelope" specified in the original RCP specification. There is an off-by-one error: if a point X needs to be pruned because it lies above the interpolation of X-1 and X+1, then the point to its left (X-1) needs to be retested against the interpolation of X-2 and X+1. The increment at line 256 should be in an else clause.
This bug leads to bad RCPs not getting pruned, which in turn leads to submissions being either unfairly rejected or unfairly "scaled" when their global batch size is near the bad RCP point.
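To illustrate the retest, here is a minimal sketch of the pruning loop. It is not the code in `rcp_checker.py`: the function name `prune_to_lower_convex_envelope`, the `(batch_size, epochs)` tuple layout, and linear interpolation in batch size are all assumptions made for the example. After a point is pruned, the index steps back so that the left neighbour is tested again against its new right neighbour.

```python
def prune_to_lower_convex_envelope(points):
    """Hypothetical helper: keep only points on the lower convex envelope.

    `points` is a list of (batch_size, epochs) tuples sorted by batch_size.
    The two end points are never pruned.
    """
    pts = list(points)
    i = 1
    while i < len(pts) - 1:
        (x0, y0), (x1, y1), (x2, y2) = pts[i - 1], pts[i], pts[i + 1]
        # Linear interpolation between the two neighbours, evaluated at x1.
        interpolated = y0 + (y2 - y0) * (x1 - x0) / (x2 - x0)
        if y1 > interpolated:
            # Point X lies above the line from X-1 to X+1, so prune it.
            del pts[i]
            # Step back so X-1 is retested against X-2 and X+1.
            # Index 0 is an end point, so never step below 1.
            i = max(i - 1, 1)
        else:
            # Only advance when nothing was pruned.
            i += 1
    return pts


rcps = [(1, 0.0), (2, 2.0), (3, 5.0), (4, 3.0)]
print(prune_to_lower_convex_envelope(rcps))
```

In this made-up data set, the point at batch size 2 passes its first test (against 1 and 3) and only fails once the point at batch size 3 has been pruned and it is retested against 1 and 4. A loop that always advances after a prune would leave it in place.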