Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

961 pairplot #1084

Merged
merged 41 commits into from
Apr 29, 2024
Merged

961 pairplot #1084

merged 41 commits into from
Apr 29, 2024

Conversation

Matthijspals
Copy link
Contributor

@Matthijspals Matthijspals commented Mar 21, 2024

What does this implement/fix? Explain your changes

Refactoring of the pairplot and marginal_plot functions, to make the plotting more transparent and easier to understand.

Here is a summary of all changes:

Passing keywords:

  • Figure kwargs are now passed in separate upper_kwargs,lower_kwargs, diag_kwargs and fig_kwargs dictionaries (similar to seaborn), making it easier to organise keywords
  • upper_kwargs,lower_kwargs, and diag_kwargs can include mpl_kwargs, which is directly passed to the matplotlib plotting function used, making it straightforward to change e.g. the cmap used.
  • backwards compatibility is ensured. If to-be-deprecated keywords are used, the previous version of plotting functions are called and a deprecation warning is thrown.

Tutorial:

  • A new tutorial is added that demonstrates some of the plotting options.

Restructuring of functions:

  • get_offdiag_funcs is declared outside of pairplot, bringing it inline with get_diac_funcs, and making pairplot easier to parse
  • all plotting functions, e.g., plt_hist_1d, plt_kde_2d, have their own function, making the code easier to parse, and making it easier to add new functions
  • arrange_plots is replaced by arrange_grid, which uses the newly defined plotting functions and avoids mixing plt and ax
  • styling the figure is done in a separate new function format_subplot
  • histograms can make use of the Freedman-Diaconis heuristic for selecting binsizes
  • diag can be None to only plot 2D marginals

Does this close any currently open issues?

Adresses #961

Any relevant code examples, logs, error output, etc?

See here for more examples

image

Any other comments?

  • I did not change any of the defaults, as I felt it is quite subjective what is desired, we could discuss whether defaults should be changed or not.

  • Once this is roughly approved conditional_pairplot and conditional_marginal_plot should be updated to be inline with the new functions.

  • Some tutorial plots should use updated syntax for plotting to avoid throwing deprecation warnings.

Checklist

Put an x in the boxes that apply. You can also fill these out after creating
the PR. If you're unsure about any of them, don't hesitate to ask. We're here to
help! This is simply a reminder of what we are going to look for before merging
your code.

  • I have read and understood the contribution
    guidelines
  • I agree with re-licensing my contribution from AGPLv3 to Apache-2.0.
  • I have commented my code, particularly in hard-to-understand areas
  • I have added tests that prove my fix is effective or that my feature works
  • I have reported how long the new tests run and potentially marked them
    with pytest.mark.slow.
  • New and existing unit tests pass locally with my changes
  • I performed linting and formatting as described in the contribution
    guidelines
  • I rebased on main (or there are no conflicts with main)

@Matthijspals Matthijspals linked an issue Mar 21, 2024 that may be closed by this pull request
Copy link
Contributor

@janfb janfb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high-level review: This is so great!!! Thanks a ton for improving the plotting 🎉

Before a detailed review,

  • can you please add type annotations to the arguments of the new functions,
  • and run pre-commit hooks. it seems the formatting is inconsistent. make sure you are using pre-commit --version 3.5.0. then run pre-commit run --all-files

thanks 🙏

@Matthijspals
Copy link
Contributor Author

Matthijspals commented Mar 22, 2024

high-level review: This is so great!!! Thanks a ton for improving the plotting 🎉

Before a detailed review,

  • can you please add type annotations to the arguments of the new functions,
  • and run pre-commit hooks. it seems the formatting is inconsistent. make sure you are using pre-commit --version 3.5.0. then run pre-commit run --all-files

thanks 🙏

Checks have now passed.

I will still update the plot tests (which run but throw many deprecation warnings), add typing for the non main functions, and update the conditional plots next week

@Matthijspals
Copy link
Contributor Author

@famura Thanks a lot for your commits! Regarding the "Warning-free testing of pairplot" I think we should update the tests, instead of (or in addition to?) adding ignore::DeprecationWarning statements?

I don't know what you mean by that since I understood you that we will nevertheless encounter these warnings until we delete the old plotting function and with that the conversion and warning. I think it is very safe to ignore our own warnings for specified tests.

Sorry for being unclear! What I meant is, we should a add a new test for the updated pairplot function (which shouldn't throw deprecation warnings if used correctly)

@Matthijspals
Copy link
Contributor Author

The new functions (and old functions used by them) now all have typing.

TODOs are:

  • update conditional plotting functions
  • update tests
  • update tutorials (some will now throw deprecation warnings)

@famura
Copy link
Contributor

famura commented Mar 28, 2024

Sorry for being unclear! What I meant is, we should a add a new test for the updated pairplot function (which shouldn't throw deprecation warnings if used correctly)

No worries, now I got you and agree. In a world with enough time, should do that. pytest also allows for marks that ignore things on a per-function-level. It didn't work instantly for me, so I went for this solution. Feel free to change it. However, I think that our time is better spent on other things :)

Copy link
Contributor

@janfb janfb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great effort!
I added a couple of comments. Will have a look at the tutorial later.

Copy link

codecov bot commented Apr 4, 2024

Codecov Report

Attention: Patch coverage is 33.94919% with 286 lines in your changes are missing coverage. Please review.

Project coverage is 74.21%. Comparing base (4c2f71e) to head (df58e86).
Report is 63 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1084      +/-   ##
==========================================
- Coverage   76.37%   74.21%   -2.17%     
==========================================
  Files          84       90       +6     
  Lines        6507     6927     +420     
==========================================
+ Hits         4970     5141     +171     
- Misses       1537     1786     +249     
Flag Coverage Δ
unittests 74.21% <33.94%> (-2.17%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
sbi/analysis/plot.py 45.50% <33.94%> (-25.68%) ⬇️

... and 43 files with indirect coverage changes

Copy link
Contributor

@janfb janfb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay in reviewing!

It all looks good now, except for the two comments about the unused limit kwargs - I think it's better to remove them.

I also had a look at the tutotial and I think it's great!! 👏
Just one comment: it is using the example_posterior, so the data is in parameter space. However, you are using x_1 (...) as labels later. I suggest to change this to theta_1 (...) and to replace x_o with theta_o.

Thanks again for this revolutionary plotting update! 🙏

@janfb
Copy link
Contributor

janfb commented Apr 23, 2024

Ping @Matthijspals :)
I think this is almost done. It just needs the two changes about the kwargs and the renaming of x to theta in the tutorial.

@Matthijspals
Copy link
Contributor Author

Matthijspals commented Apr 23, 2024

Ping @Matthijspals :)
I think this is almost done. It just needs the two changes about the kwargs and the renaming of x to theta in the tutorial.

Hi Jan, sorry for the delay!

Do you mean to change:

def plt_name(
    ax: Axes,
    samples: np.ndarray,
    limits: torch.Tensor,
    kwargs: Dict,
) -> None:

into:

def plt_name(
    ax: Axes,
    samples: np.ndarray,
    diag_kwargs: Dict,
    **kwargs: Optional[Any]
) -> None:

or to make to limits part of the kwargs Dict?

@janfb
Copy link
Contributor

janfb commented Apr 24, 2024

No worries!

  1. If the arguments are not used at all, which is the case for the limit args for those two functions I believe, then we just drop them. Otherwise it might be confusing, e.g., a user passes them to set limits but nothing changes.

  2. If it is useful to pass them on to scatter, then we should add a **kwargs and pass that on. But as I understand it, option 1 makes more sense.

@Matthijspals
Copy link
Contributor Author

Matthijspals commented Apr 25, 2024

No worries!

  1. If the arguments are not used at all, which is the case for the limit args for those two functions I believe, then we just drop them. Otherwise it might be confusing, e.g., a user passes them to set limits but nothing changes.
  2. If it is useful to pass them on to scatter, then we should add a **kwargs and pass that on. But as I understand it, option 1 makes more sense.

Just dropping the limits argument won't work as the _arange_grid functions calls all plotting functions (e.g. plt_scatter_1d) as diag_f(ax, sample, limits, diag_kwargs) - we would have to get rid of them for every function. In that case it is easiest to just add the limits to diag_kwargs, shall I do that?

@janfb
Copy link
Contributor

janfb commented Apr 25, 2024

Oh, I see. I finally understand why they were there in the first place :)
I am sorry for the back and forth then.

Then I suggest that we just keep them and add an inline comment or info in the docstring why they are there.
Another question I should have asked earlier: why are the limits not used in scatter or to plot, e.g., setting plt.xlim(limits...)?

@Matthijspals
Copy link
Contributor Author

Oh, I see. I finally understand why they were there in the first place :) I am sorry for the back and forth then.

Then I suggest that we just keep them and add an inline comment or info in the docstring why they are there.

Will do!

Another question I should have asked earlier: why are the limits not used in scatter or to plot, e.g., setting plt.xlim(limits...)?

The x and y limits are currently set for every plot in the _format_subplot function. Some of the plotting functions additionally need the limits, e.g., to set the extent of imshow.

Copy link
Contributor

@janfb janfb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!
Thanks a lot @Matthijspals 👏

@janfb janfb merged commit 1fadce3 into main Apr 29, 2024
4 of 5 checks passed
@janfb janfb deleted the 961-pairplot branch April 29, 2024 11:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pairplot: documentation, more options and tutorial
3 participants