Implementing tension statistics #333

Open · wants to merge 69 commits into base: master

Conversation

@DilyOng (Collaborator) commented Aug 24, 2023

Description

This is a work-in-progress pull request aiming to address #325, and a learning exercise in how to do a pull request.

Checklist:

  • I have performed a self-review of my own code
  • My code is PEP8 compliant (flake8 anesthetic tests)
  • My code contains compliant docstrings (pydocstyle --convention=numpy anesthetic)
  • New and existing unit tests pass locally with my changes (python -m pytest)
  • I have added tests that prove my fix is effective or that my feature works
  • I have appropriately incremented the semantic version number in both README.rst and anesthetic/_version.py

@codecov bot commented Aug 24, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (ccb2e76) to head (e42b048).

Additional details and impacted files
@@            Coverage Diff            @@
##            master      #333   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           36        37    +1     
  Lines         3076      3104   +28     
=========================================
+ Hits          3076      3104   +28     


@williamjameshandley (Collaborator):

Hi @DilyOng, many thanks for taking charge of incorporating this. Let's get it plumbed into anesthetic first, and then get feedback from others on whether anything is missing.

At the moment, this code is specialised to a specific naming scheme (which is what the union and intersection functions are doing), and to a wider grid.

I think we should re-organise this so that in the first instance it is more similar to @AdamOrmondroyd's suspiciousness package, but retaining the class/caching structure of tension_calculator.

Tasks:

  • Create a class TensionCalculator in a new file anesthetic/tension.py (note CamelCase rather than under_score naming)
  • This class should __init__ with A, B and AB, which are assumed to be NestedSamples, and cache self.A = A.stats(nsamples) (and the same for B and AB), which computes the nested sampling statistics (see also the docs); a sketch of such a class is given after this list
  • This should then implement logR, logS, d, D_KL and p
  • You should then create a tests/test_tension.py in the same style as the other test files, which uses anesthetic.examples.perfect_ns.correlated_gaussian functions to create mock A, B and AB, alongside equations 14 to 25 from 1902.04029 to test that the tension statistics code gets the correct answer.
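
For concreteness, here is a minimal sketch of what such a class could look like, assuming the 'logZ', 'D_KL' and 'd_G' columns produced by NestedSamples.stats(nsamples); the method bodies follow the relations discussed further down this thread and in arXiv:1902.04029, and are illustrative rather than the final implementation:

```python
from scipy.stats import chi2


class TensionCalculator:
    """Sketch only: cache nested sampling statistics and derive tension quantities."""

    def __init__(self, A, B, AB, nsamples=None):
        # A, B and AB are assumed to be NestedSamples; stats() returns samples
        # of 'logZ', 'D_KL' and 'd_G' (see NestedSamples.stats in the docs)
        self.A = A.stats(nsamples)
        self.B = B.stats(nsamples)
        self.AB = AB.stats(nsamples)

    def logR(self):
        # log Bayes (evidence) ratio
        return self.AB['logZ'] - self.A['logZ'] - self.B['logZ']

    def D_KL(self):
        # difference of KL divergences (the information ratio discussed below)
        return self.A['D_KL'] + self.B['D_KL'] - self.AB['D_KL']

    def logS(self):
        # log suspiciousness
        return self.logR() - self.D_KL()

    def d(self):
        # Bayesian model dimensionality of the tension
        return self.A['d_G'] + self.B['d_G'] - self.AB['d_G']

    def p(self):
        # tension probability: chi^2 survival function with d degrees of
        # freedom, evaluated at d - 2*logS (as in arXiv:1902.04029)
        return chi2.sf(self.d() - 2 * self.logS(), self.d())
```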

@williamjameshandley (Collaborator) commented Aug 24, 2023

I think after that it would also be good to implement a function in addition to (or possibly in place of!) the class for producing a Samples object containing columns of logR, D_KL, d, S, p, as an analogue to the output of NestedSamples.stats.
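
A rough sketch of the intended usage (the module path anesthetic.tension, the function name stats, the chain paths and the exact column names are all assumptions at this stage, mirroring the column list above):

```python
from anesthetic import read_chains
from anesthetic.tension import stats  # hypothetical module/function name

# hypothetical chain files for runs on A, B and the joint dataset AB
A = read_chains('chains/runA')
B = read_chains('chains/runB')
AB = read_chains('chains/runAB')

samples = stats(A, B, AB, nsamples=1000)
samples[['logR', 'D_KL', 'd', 'S', 'p']].mean()  # point estimates of each statistic
```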

Review thread on tension_calculator.py (outdated, resolved)
@AdamOrmondroyd (Collaborator):

Please remember to remove (git rm) the .DS_Store files you've added (they are to do with macOS file management, so not relevant to the repo)

lukashergt and others added 12 commits September 29, 2023 17:20
…ation and testing it with correlated Gaussian likelihoods. Found a problem with the function anesthetic.examples.perfect_ns.correlated_gaussian: the generated Gaussian likelihood in the parameters is not normalised and the evidence is not unity. Need to take into account the LogLmax.
…ed_gaussian. Within the correlated_gaussian function, changed the logLike function. Changed the function's description to match the fact that the evidence is not unity.
…nction tension_stats() for calculating tension statistics. Rewrote test_tension_stats.py in tests to match the format of the other test files. It tests mock datasets with a Gaussian likelihood. Both compatible and incompatible datasets have passed the test.
…dd a file for datasets pairwise_comparison, but not yet completed
@williamjameshandley (Collaborator):

It would be good to get this finalised and merged now that #348 is complete -- any thoughts, @DilyOng?

DilyOng and others added 3 commits March 4, 2024 18:42
…the theoretical logR, logS and logI values sit within 3 std of the numerical solution's distribution from anesthetic, instead of testing between minimum and maximum values of the distribution.
Resolved review threads (outdated): anesthetic/tension.py (×3), anesthetic/tension_pvalue.py
@williamjameshandley (Collaborator):

OK @DilyOng they seem to be running for me -- try correcting the version number in the README and init.py back to 2.10.0 to see if it's working now.

@DilyOng (Collaborator, Author) commented Feb 4, 2025

> OK @DilyOng they seem to be running for me -- try correcting the version number in the README and init.py back to 2.10.0 to see if it's working now.

@williamjameshandley It's all done - All checks have passed!

@williamjameshandley (Collaborator) left a review comment:

OK, I think this is ready for merging -- any further suggestions can be presented as additional pull requests.

Many thanks @DilyOng. Please press 'squash and merge'.

@williamjameshandley (Collaborator):

(Actually I think @AdamOrmondroyd also has to confirm the changes before merging)

@lukashergt (Collaborator) left a review comment:

Hi @williamjameshandley and @DilyOng,

I have already been testing this branch out a bit. In that context I realised that the current state is not quite flexible enough. Currently, we consider two datasets A and B separately, and the joint dataset AB. However, there are places where we will want to look at more than two datasets at the same time, e.g. A, B, and C. So we might want a more flexible interface. Thoughts?

I already have some local modifications that could address this. Would it be ok for me to hijack this PR with these changes?

@lukashergt (Collaborator):

> OK, I think this is ready for merging -- any further suggestions can be presented as additional pull requests.

Our comments crossed. The suggestions in my previous comment would be a major change (in semantic versioning lingo) to the user interface of these tension functions...

- ``I``: information ratio

.. math::
I = exp(D_{KL}^{A} + D_{KL}^{B} - D_{KL}^{AB})
Collaborator review comment:

This is not quite what I meant. I was suggesting the following re-definition of equation (9) in the Quantifying tensions paper (note the lack of exp):

I = D_A + D_B - D_AB

such that equation (10) becomes:

logS = logR - I
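
Rendered in the markdown math notation used just below (no new content, simply the two relations above in LaTeX):

$$I = {\mathcal{D}_\mathrm{KL}}^{A} + {\mathcal{D}_\mathrm{KL}}^{B} - {\mathcal{D}_\mathrm{KL}}^{AB}, \qquad \log S = \log R - I$$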

Collaborator review comment:

@lukashergt if we're doing arbitrary numbers of datasets, then we'll need to tweak these equations too, something like

$$I = \sum_i {\mathcal{D}_\mathrm{KL}}_i - {\mathcal{D}_\mathrm{KL}}_{H_0}$$

?

Collaborator review comment:

Sidenote: oooh, neat, I didn't know that Markdown can handle math input by now :)

That said, for docstrings I would go for maximal readability even without rendering, so I'd say a simple math example is enough. Leave the rest to papers, or if really necessary, write a dedicated documentation page...?

@DilyOng (Collaborator, Author):

@lukashergt Thank you very much for clarifying. I have changed the relevant lines.

samples['logR'] = statsAB['logZ'] - statsA['logZ'] - statsB['logZ']
samples.set_label('logR', r'$\ln\mathcal{R}$')

samples['I'] = np.exp(statsA['D_KL'] + statsB['D_KL'] - statsAB['D_KL'])
Collaborator review comment:

Accordingly, this should be without exp, too.
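
Concretely, the corrected lines would presumably look something like this (a sketch of the suggestion only; the label string is illustrative):

```python
# information ratio kept as a log-quantity, no exponential
samples['I'] = statsA['D_KL'] + statsB['D_KL'] - statsAB['D_KL']
samples.set_label('I', r'$\mathcal{I}$')
```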

@DilyOng (Collaborator, Author):

@lukashergt Updated!


assert s.logS.mean() == approx(s.logR.mean() - s.logI.mean(),
assert s.logS.mean() == approx(s.logR.mean() - np.log(s.I).mean(),
Collaborator review comment:

And accordingly this should not have the np.log.

@DilyOng (Collaborator, Author):

@lukashergt Updated.

import numpy as np


def stats(A, B, AB, nsamples=None, beta=None): # noqa: D301
Collaborator review comment:

@lukashergt @DilyOng for multiple datasets, I think unpacking will neatly handle arbitrary numbers of datasets, something like

def stats(h0, *h1, nsamples=None, beta=None):
    """h0 = null hypothesis = AB, h1 = alternative hypotheses = A, B, etc."""
    h0stats = h0.stats(nsamples=nsamples, beta=beta)
    h1stats = [h.stats(nsamples=nsamples, beta=beta) for h in h1]
    ...
    samples['logR'] = h0stats['logZ'] - sum(s['logZ'] for s in h1stats)
    ...

which can be called tension.stats(abcde, a, b, c, d, e)
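
Filling in the ellipses along those lines (again just a sketch: the Samples construction and the 'D_KL' and 'd_G' column names are assumptions based on the output of NestedSamples.stats):

```python
from anesthetic import Samples


def stats(h0, *h1, nsamples=None, beta=None):
    """h0 = joint (null-hypothesis) run, h1 = the separate runs (A, B, ...)."""
    h0stats = h0.stats(nsamples=nsamples, beta=beta)
    h1stats = [h.stats(nsamples=nsamples, beta=beta) for h in h1]
    samples = Samples(index=h0stats.index)
    # log evidence ratio
    samples['logR'] = h0stats['logZ'] - sum(s['logZ'] for s in h1stats)
    # information ratio (in its log form, as discussed above)
    samples['I'] = sum(s['D_KL'] for s in h1stats) - h0stats['D_KL']
    # log suspiciousness
    samples['logS'] = samples['logR'] - samples['I']
    # Bayesian model dimensionality of the tension
    samples['d_G'] = sum(s['d_G'] for s in h1stats) - h0stats['d_G']
    return samples
```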

Collaborator review comment:

Yes, that's about what I have. I call them joint and separate, which I find a bit more descriptive...

Parameters
----------
A : :class:`anesthetic.samples.Samples`
    :class:`anesthetic.samples.NestedSamples` object from a sampling run
Collaborator review comment:

This is contradictory: does A have to be Samples or NestedSamples?

…he previous log I, the logarithm is incorporated in I. Updated all relevant lines in anesthetic/tension.py and tests/test_tension.py.
Labels: enhancement (New feature or request), good first issue (Good for newcomers)
4 participants