Correctly implement lower convex envelope in RCP pruning logic #368

Open
nv-rborkar opened this issue May 10, 2024 · 0 comments

Comments

@nv-rborkar
Contributor

As the comment at https://github.com/mlcommons/logging/blob/master/mlperf_logging/rcp_checker/rcp_checker.py#L246 says, the loop does not correctly implement the "lower convex envelope" specified in the original RCP specification. There is an off-by-one error: if a point X must be pruned because it lies above the interpolation of X-1 and X+1, then the point to its left (X-1) must be retested against the interpolation of X-2 and X+1. The increment at line 256 should therefore be in an else clause.
This bug leaves bad RCPs unpruned, which causes submissions with a global batch size near the bad RCP point to be either unfairly rejected or unfairly "scaled".
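To illustrate the retest described above, here is a minimal sketch. It is not the code in `rcp_checker.py`: the function names, the `(batch_size, epochs)` tuple layout, and the sample numbers are all hypothetical, and it assumes distinct batch sizes and plain linear interpolation. `prune_always_advance` shows one way an unconditional increment can miss a point, while `prune_with_retest` only advances in the else branch.

```python
def interpolate(left, right, x):
    """Linearly interpolate the y value at x between two (x, y) points."""
    (x0, y0), (x1, y1) = left, right
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def prune_always_advance(points):
    """Hypothetical single pass that advances even after pruning a point."""
    pts = sorted(points)
    i = 1
    while i < len(pts) - 1:
        x, y = pts[i]
        if y > interpolate(pts[i - 1], pts[i + 1], x):
            del pts[i]
        i += 1  # unconditional: the left neighbour is never retested
    return pts


def prune_with_retest(points):
    """Hypothetical fix: after pruning X, step back and retest X-1."""
    pts = sorted(points)
    i = 1
    while i < len(pts) - 1:
        x, y = pts[i]
        if y > interpolate(pts[i - 1], pts[i + 1], x):
            del pts[i]
            # X-1 now has X+1 as its right neighbour, so it is retested
            # against X-2 and X+1. The first point is always kept.
            i = max(i - 1, 1)
        else:
            i += 1  # advance only when the current point survives
    return pts


# Made-up (global batch size, epochs to converge) pairs.
rcps = [(1000, 10), (2000, 20), (3000, 40), (4000, 30)]
print(prune_always_advance(rcps))  # [(1000, 10), (2000, 20), (4000, 30)]
print(prune_with_retest(rcps))     # [(1000, 10), (4000, 30)]
```

In this made-up data, pruning (3000, 40) leaves (2000, 20) above the line from (1000, 10) to (4000, 30), which passes through roughly 16.7 at batch size 2000. Only the variant that steps back detects this and removes the point.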
