Rescale optimizer termination criteria #1073

Open
wants to merge 23 commits into
base: master

Member

@f0uriest f0uriest commented Jun 25, 2024

Most of our optimizers have 3 main exit criteria:
$$||\Delta f || / ||f|| < ftol$$
$$||\Delta x || / ||x|| < xtol$$
$$||\nabla f ||_\infty < gtol$$
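
As a rough illustration of the three criteria above, here is a minimal sketch in plain NumPy. It is not the DESC implementation: the function name `termination_status`, the order of the checks, and the treatment of `f` as a scalar cost are assumptions made for the example.

```python
import numpy as np


def termination_status(f_old, f_new, x_old, x_new, grad, ftol, xtol, gtol):
    """Return which criterion fired ("gtol", "ftol" or "xtol"), or None.

    Hypothetical helper: f is assumed to be a scalar cost, so ||f|| is |f|.
    """
    x_old, x_new, grad = (np.asarray(a, dtype=float) for a in (x_old, x_new, grad))
    # Infinity norm of the gradient below gtol.
    if np.linalg.norm(grad, ord=np.inf) < gtol:
        return "gtol"
    # Relative checks are written as ||delta|| < tol * ||value|| so that a
    # zero cost or a zero state vector never causes a division by zero.
    if abs(f_new - f_old) < ftol * abs(f_new):
        return "ftol"
    if np.linalg.norm(x_new - x_old) < xtol * np.linalg.norm(x_new):
        return "xtol"
    return None
```

Only the `ftol` check is a ratio of two quantities with the same units in every component; the other two mix components of `x` with different magnitudes in a single norm.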

The normalizations in the objective functions generally make the first criterion (ftol) more or less independent of scaling and units. However, xtol and gtol still have units of x and 1/x respectively, so rescaling the variables can change when the optimizer stops. This is especially noticeable when doing coil optimization or optimizing a current profile: the current is ~1e6 times larger than the other degrees of freedom, so the optimizer can exit early once the current reaches a good value even though the other variables are still far from optimal.

This PR adds an option scaled_termination (defaults to True) to all of the desc optimizers. When enabled, the norms used for xtol and gtol are measured in the scaled norm defined by x_scale (which defaults to an adaptive scaling based on the Jacobian or Hessian). This should improve the stopping behavior when optimizing parameters with widely different magnitudes.
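
To make the effect concrete, here is a small NumPy sketch comparing the unscaled and scaled norms. It assumes the usual trust-region convention (the one SciPy's `least_squares` documents for `x_scale`), where the scaled variables are `x / x_scale` and the gradient with respect to them is `grad * x_scale`. The helper name `step_and_grad_norms`, the fixed `x_scale` vector, and the numbers are made up for illustration; whether DESC computes the norms exactly this way is an assumption.

```python
import numpy as np


def step_and_grad_norms(x, dx, grad, x_scale=None):
    """Relative step size and gradient infinity norm, optionally in scaled variables."""
    x, dx, grad = (np.asarray(a, dtype=float) for a in (x, dx, grad))
    if x_scale is None:
        x_scale = np.ones_like(x)  # no scaling: plain norms
    rel_step = np.linalg.norm(dx / x_scale) / np.linalg.norm(x / x_scale)
    grad_norm = np.linalg.norm(grad * x_scale, ord=np.inf)
    return rel_step, grad_norm


# One variable of order 1e6 (e.g. a current) next to two O(1) variables.
x = np.array([1.0e6, 1.0, 1.0])
dx = np.array([0.0, 0.1, 0.1])  # current has settled, the others still move by 10%
grad = np.array([1.0e-9, 1.0e-3, 1.0e-3])
x_scale = np.array([1.0e6, 1.0, 1.0])  # assumed fixed scale, not the adaptive one

unscaled_step, unscaled_grad = step_and_grad_norms(x, dx, grad)
scaled_step, scaled_grad = step_and_grad_norms(x, dx, grad, x_scale)
```

With xtol = 1e-6, the unscaled relative step (about 1.4e-7) would already trigger an exit, while the scaled one (about 0.08) correctly reports that the small variables are still changing.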

Resolves #797

@f0uriest f0uriest marked this pull request as draft June 25, 2024 23:50
Contributor

github-actions bot commented Jun 26, 2024

|             benchmark_name             |         dt(%)          |         dt(s)          |        t_new(s)        |        t_old(s)        |
| -------------------------------------- | ---------------------- | ---------------------- | ---------------------- | ---------------------- |
| test_build_transform_fft_midres        |     +6.93 +/- 6.89     | +4.12e-02 +/- 4.09e-02 |  6.35e-01 +/- 3.0e-02  |  5.94e-01 +/- 2.7e-02  |
| test_build_transform_fft_highres       |     +3.24 +/- 2.74     | +3.08e-02 +/- 2.60e-02 |  9.81e-01 +/- 2.1e-02  |  9.50e-01 +/- 1.6e-02  |
| test_equilibrium_init_lowres           |     -6.09 +/- 7.64     | -2.45e-01 +/- 3.07e-01 |  3.78e+00 +/- 8.0e-02  |  4.02e+00 +/- 3.0e-01  |
| test_objective_compile_atf             |     -0.69 +/- 3.21     | -5.36e-02 +/- 2.51e-01 |  7.77e+00 +/- 2.1e-01  |  7.82e+00 +/- 1.3e-01  |
| test_objective_compute_atf             |     -2.18 +/- 2.04     | -2.30e-04 +/- 2.15e-04 |  1.03e-02 +/- 1.4e-04  |  1.06e-02 +/- 1.6e-04  |
| test_objective_jac_atf                 |     -3.26 +/- 2.92     | -6.30e-02 +/- 5.66e-02 |  1.87e+00 +/- 3.8e-02  |  1.94e+00 +/- 4.2e-02  |
| test_perturb_1                         |     +0.81 +/- 3.03     | +1.13e-01 +/- 4.19e-01 |  1.39e+01 +/- 3.1e-01  |  1.38e+01 +/- 2.8e-01  |
| test_proximal_jac_atf                  |     +0.33 +/- 1.03     | +2.63e-02 +/- 8.36e-02 |  8.12e+00 +/- 7.8e-02  |  8.09e+00 +/- 3.1e-02  |
| test_proximal_freeb_compute            |     -1.46 +/- 0.92     | -2.89e-03 +/- 1.82e-03 |  1.95e-01 +/- 9.0e-04  |  1.98e-01 +/- 1.6e-03  |
| test_build_transform_fft_lowres        |     +1.34 +/- 4.19     | +7.01e-03 +/- 2.19e-02 |  5.30e-01 +/- 2.0e-02  |  5.23e-01 +/- 9.4e-03  |
| test_equilibrium_init_medres           |     +0.17 +/- 1.73     | +7.07e-03 +/- 7.19e-02 |  4.16e+00 +/- 3.3e-02  |  4.15e+00 +/- 6.4e-02  |
| test_equilibrium_init_highres          |     -1.58 +/- 2.10     | -8.69e-02 +/- 1.16e-01 |  5.43e+00 +/- 4.3e-02  |  5.52e+00 +/- 1.1e-01  |
| test_objective_compile_dshape_current  |     +0.15 +/- 1.81     | +5.86e-03 +/- 7.00e-02 |  3.88e+00 +/- 4.8e-02  |  3.87e+00 +/- 5.1e-02  |
| test_objective_compute_dshape_current  |     -1.18 +/- 1.87     | -4.34e-05 +/- 6.85e-05 |  3.62e-03 +/- 4.2e-05  |  3.67e-03 +/- 5.4e-05  |
| test_objective_jac_dshape_current      |     -0.95 +/- 8.46     | -3.89e-04 +/- 3.45e-03 |  4.04e-02 +/- 2.5e-03  |  4.08e-02 +/- 2.4e-03  |
| test_perturb_2                         |     +0.39 +/- 1.53     | +7.39e-02 +/- 2.88e-01 |  1.89e+01 +/- 2.0e-01  |  1.88e+01 +/- 2.1e-01  |
| test_proximal_freeb_jac                |     +0.34 +/- 1.23     | +2.55e-02 +/- 9.22e-02 |  7.51e+00 +/- 7.6e-02  |  7.48e+00 +/- 5.2e-02  |
| test_solve_fixed_iter                  |     +1.68 +/- 62.58    | +8.31e-02 +/- 3.10e+00 |  5.03e+00 +/- 2.3e+00  |  4.95e+00 +/- 2.1e+00  |

@YigitElma
Collaborator

A suggestion for the intended functionality: can we just take the ratio step_ratio = step/x and check whether any element of step_ratio is smaller than some new tolerance rtol?
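
For reference, the element-wise check suggested here could look roughly like the sketch below. The names `step_ratio_converged` and `require_all` are hypothetical; `rtol` is the new tolerance proposed in the comment. The comparison is written as `|step_i| <= rtol * |x_i|` rather than as an explicit division so that variables equal to zero do not produce infinities, which is a choice made for this example.

```python
import numpy as np


def step_ratio_converged(x, step, rtol, require_all=True):
    """Element-wise relative step check, |step_i| <= rtol * |x_i|.

    require_all=True stops only when every variable has stalled;
    require_all=False reproduces the literal "any element" wording.
    """
    x = np.asarray(x, dtype=float)
    step = np.asarray(step, dtype=float)
    small = np.abs(step) <= rtol * np.abs(x)
    return bool(small.all() if require_all else small.any())


x = np.array([1.0e6, 1.0, 1.0])
step = np.array([0.0, 0.1, 0.1])  # only the large variable has stalled

stalled_any = step_ratio_converged(x, step, rtol=1e-6, require_all=False)
stalled_all = step_ratio_converged(x, step, rtol=1e-6, require_all=True)
```

In this example the "any" variant reports convergence as soon as the large variable stops moving, while the "all" variant keeps iterating until the small variables have stalled too.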

@f0uriest f0uriest marked this pull request as ready for review November 13, 2024 06:34

codecov bot commented Nov 13, 2024

Codecov Report

Attention: Patch coverage is 97.72727% with 1 line in your changes missing coverage. Please review.

Project coverage is 95.57%. Comparing base (2741269) to head (24da16f).

| Files with missing lines | Patch % | Lines |
| ------------------------ | ------- | ----- |
| desc/optimize/aug_lagrangian_ls.py | 93.75% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1073      +/-   ##
==========================================
- Coverage   95.57%   95.57%   -0.01%     
==========================================
  Files          96       96              
  Lines       24405    24409       +4     
==========================================
+ Hits        23325    23328       +3     
- Misses       1080     1081       +1     
| Files with missing lines | Coverage Δ |
| ------------------------ | ---------- |
| desc/optimize/aug_lagrangian.py | 96.95% <100.00%> (+0.88%) ⬆️ |
| desc/optimize/fmin_scalar.py | 98.08% <100.00%> (+0.01%) ⬆️ |
| desc/optimize/least_squares.py | 99.33% <100.00%> (+<0.01%) ⬆️ |
| desc/optimize/aug_lagrangian_ls.py | 95.69% <93.75%> (+0.02%) ⬆️ |

... and 2 files with indirect coverage changes


Review thread on tests/conftest.py (outdated, resolved)
Review thread on tests/test_examples.py (resolved)
@YigitElma YigitElma self-requested a review November 20, 2024 07:18
Collaborator

@YigitElma YigitElma left a comment

We can also resolve #1395 here

YigitElma previously approved these changes Nov 21, 2024
Collaborator

@dpanici dpanici left a comment

Do we need to re-run any notebooks, or are they all unchanged?

@dpanici
Collaborator

dpanici commented Nov 21, 2024

The advanced optimization notebook now gives somewhat different results (probably because the eq subproblem terminates differently on this branch than on master, at least for the proximal case): the auglag result is slightly worse and the QS results change slightly too. I think it would be good to re-run all the notebooks (or at least some of them, or open an issue or separate branches that run and update them) so that what we have up on the docs is what a user would actually see when running the notebook.

Development

Successfully merging this pull request may close these issues.

normalize stopping criteria
3 participants