Implementing tension statistics #333

Open
wants to merge 69 commits into
base: master
Changes from 1 commit
c0946bd
Added Will's example Class for tension calculator
DilyOng Aug 24, 2023
70c0aa4
Changed Version from 2.3.0 to 2.4.0
DilyOng Aug 24, 2023
ee1db5b
Added Class tension
DilyOng Aug 31, 2023
ffd5fae
Merge branch 'master' into tension
lukashergt Sep 30, 2023
a3d06b5
version bump to 2.5.0
lukashergt Sep 30, 2023
4a5d4b5
Created a function called tension_stats for tension statistics calcul…
DilyOng Oct 13, 2023
38d31c2
Added logLmax to the function anesthetic.examples.perfect_ns.correlat…
DilyOng Oct 13, 2023
932a201
Clean up old tension statistics files. Put the correlated guassian li…
DilyOng Oct 13, 2023
4046bc3
Merge branch 'tension' of github.com:handley-lab/anesthetic into tension
DilyOng Oct 13, 2023
de17f04
Updated logLmax
DilyOng Oct 13, 2023
6ab8e4f
remove DS_Store
AdamOrmondroyd Oct 14, 2023
3698669
Merge branch 'master' into tension
AdamOrmondroyd Oct 14, 2023
aa0c177
Added a file tension.py to anesthetic/anesthetic and it contains a fu…
DilyOng Oct 26, 2023
2f472b5
Updated the tests/test_tension_stats.py file for flake8 compliance. A…
DilyOng Oct 26, 2023
5209ad2
Updated tests/test_tension_stats.py
DilyOng Dec 5, 2023
6fb5fe0
Updated anesthetic/tests/test_tension_stats.py. Now it tests whether …
DilyOng Mar 4, 2024
151a91d
Merge branch 'master' into tension
williamjameshandley Mar 5, 2024
b0994b3
bump version to 2.9.0
williamjameshandley Mar 5, 2024
aec6a25
Deleted duplicate files.
DilyOng Mar 5, 2024
46ec095
Merge branch 'tension' of github.com:handley-lab/anesthetic into tension
DilyOng Mar 5, 2024
be72001
Merge branch 'master' into tension
williamjameshandley Mar 18, 2024
03b9b1c
Reorganised tests
williamjameshandley Mar 18, 2024
9dd5ca1
numpy linalg
williamjameshandley Mar 18, 2024
e82f125
For anesthetic.tension.tension_stats(), I 1) added docstrings, 2) add…
DilyOng Mar 20, 2024
979f621
Added docstring in public module.
DilyOng Mar 21, 2024
84c8e6e
Fixed pydocstyle issue.
DilyOng Mar 21, 2024
9909190
Fixed pydocstyle issue.
DilyOng Mar 21, 2024
6723f4c
In test_tension.py, 1) added a test for latex labels, and 2) changed …
DilyOng Apr 4, 2024
9739361
Changed V
DilyOng Apr 4, 2024
25ab2c1
typographical changes
williamjameshandley Apr 5, 2024
1691a02
both tests correct
williamjameshandley Apr 5, 2024
110029a
Correction to docstring kl
williamjameshandley Apr 5, 2024
89fab36
Further docstring updates
williamjameshandley Apr 5, 2024
8f2e60a
Corrected typo
williamjameshandley Apr 8, 2024
290f257
Merge branch 'master' into tension
lukashergt Apr 8, 2024
815a7e5
fix link in docstring
lukashergt Apr 8, 2024
ab2d14e
add anesthetic.read.csv module to documentation
lukashergt Apr 8, 2024
c11c5f6
add anesthetic.tension module to documentation
lukashergt Apr 8, 2024
d1f1c2a
fix links and some formatting issues in `tension.py` docstrings
lukashergt Apr 8, 2024
fbd733e
Merge branch 'master' into tension
williamjameshandley Apr 9, 2024
d4659bd
Merge branch 'master' into tension
lukashergt Apr 9, 2024
877d79d
Merge branch 'master' into tension
williamjameshandley Apr 9, 2024
e568bfd
Merge branch 'master' into tension
lukashergt Apr 9, 2024
4947c69
Updated the docstrings in anesthetic/tension.py.
DilyOng Apr 14, 2024
7a10785
Updated docstrings
williamjameshandley Sep 19, 2024
970f431
Merge branch 'master' into tension
williamjameshandley Sep 19, 2024
69c932a
Updated docstring to avoid Samples
williamjameshandley Sep 19, 2024
7e90ac5
replaced \\m
williamjameshandley Sep 19, 2024
75d3067
Further string corrections
williamjameshandley Sep 19, 2024
740eacf
Further debugging docstrings
williamjameshandley Sep 19, 2024
d236f34
Logarithmic -> Logarithm
williamjameshandley Sep 19, 2024
79838dd
Merge branch 'master' into tension
lukashergt Sep 27, 2024
73d6180
correct spelling of compatible
lukashergt Sep 27, 2024
5675e25
surround `@` by spaces
lukashergt Sep 27, 2024
a7b7fb5
remove occurences of missing leading 0, e.g. `.01`
lukashergt Sep 27, 2024
e3bf650
use `ln` rather than `log` for the latex labels
lukashergt Sep 27, 2024
9dae22f
streamline and speed up tension tests a bit
lukashergt Sep 27, 2024
8c70f2f
make tension docstring more readable
lukashergt Sep 27, 2024
88b7ec6
optionally allow for passing a pre-computed stats instance to tension…
lukashergt Sep 27, 2024
51c0975
simplify tension tests and add test for direct input of nested sampli…
lukashergt Sep 27, 2024
0649d53
Update README.rst
lukashergt Sep 27, 2024
6b07788
Update _version.py
lukashergt Sep 27, 2024
ce82e13
Merge branch 'master' into tension
lukashergt Sep 27, 2024
d31b6e0
Changed log I to I for suspiciousness. Corresponding files are change…
DilyOng Feb 3, 2025
d2db456
Restart Github pipeline
DilyOng Feb 3, 2025
7de2543
Merge branch 'master' into tension
williamjameshandley Feb 4, 2025
7717b04
Changed version number from 2.11.0 back to 2.10.0 in README.rst and a…
DilyOng Feb 4, 2025
71325d7
Redefined I = D_A + D_B - D_AB, and logS = logR - I, such that I is t…
DilyOng Feb 5, 2025
e42b048
Removed unused python package numpy in anesthetic/tension.py.
DilyOng Feb 5, 2025
11 changes: 6 additions & 5 deletions anesthetic/tension.py
@@ -1,6 +1,7 @@
"""Tension statistics between two datasets."""
from anesthetic.samples import Samples
from scipy.stats import chi2
+ import numpy as np


def stats(A, B, AB, nsamples=None, beta=None): # noqa: D301
Collaborator

@lukashergt @DilyOng for multiple datasets, I think unpacking will neatly handle arbitrary numbers of datasets, something like

def stats(h0, *h1, nsamples=None, beta=None):
    """h0 = null hypothesis = AB, h1stats = alternative hypothesis = A, B etc"""
    ...
    samples['logR'] = h0stats['logZ'] - sum(_h1stats['logZ'] for _h1stats in h1stats)
    ...

which can be called as tension.stats(abcde, a, b, c, d, e)
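
A minimal, self-contained sketch of the unpacking idea above, assuming each stats object can be treated as a mapping with a `logZ` entry. The name `log_r` and the plain dicts are hypothetical stand-ins for illustration, not anesthetic's API:

```python
def log_r(joint, *separate):
    """Return logR = logZ_joint - sum_i logZ_i for any number of datasets.

    `joint` and each item of `separate` are hypothetical dict-like stats
    objects with a 'logZ' key (stand-ins for the real stats frames).
    """
    if len(separate) < 2:
        raise ValueError("need at least two separate datasets")
    return joint["logZ"] - sum(s["logZ"] for s in separate)


# Made-up log-evidences, purely for illustration.
ab = {"logZ": -12.0}
a = {"logZ": -5.0}
b = {"logZ": -6.5}
print(log_r(ab, a, b))
```

With exactly two separate datasets this reduces to the existing `logZ_AB - logZ_A - logZ_B` expression.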

Collaborator

Yes, that's about what I have. I call them joint and separate, which I find a bit more descriptive...

@@ -13,10 +14,10 @@ def stats(A, B, AB, nsamples=None, beta=None): # noqa: D301
.. math::
\log R = \log Z_{AB} - \log Z_{A} - \log Z_{B}

- - ``logI``: information ratio
+ - ``I``: information ratio

.. math::
- \log I = D_{KL}^{A} + D_{KL}^{B} - D_{KL}^{AB}
+ I = exp(D_{KL}^{A} + D_{KL}^{B} - D_{KL}^{AB})
Collaborator

This is not quite what I meant. I was suggesting the following re-definition of equation (9) in the Quantifying tensions paper (note the lack of exp):

I = D_A + D_B - D_AB

such that equation (10) becomes:

logS = logR - I
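
For reference, a short derivation of why this redefinition is consistent with the `logL_P`-based `logS` in the diff. It uses the identity $\mathcal{D}_\mathrm{KL} = \langle\ln\mathcal{L}\rangle_\mathcal{P} - \ln\mathcal{Z}$ and assumes that `logL_P` denotes the posterior-averaged log-likelihood:

$$
\begin{aligned}
\log S &= \langle\ln\mathcal{L}\rangle_{AB} - \langle\ln\mathcal{L}\rangle_{A} - \langle\ln\mathcal{L}\rangle_{B} \\
&= (\log Z_{AB} + D_{AB}) - (\log Z_{A} + D_{A}) - (\log Z_{B} + D_{B}) \\
&= \log R - (D_{A} + D_{B} - D_{AB}) \\
&= \log R - I
\end{aligned}
$$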

Collaborator

@lukashergt if we're doing arbitrary numbers of datasets, then we'll need to tweak these equations too, something like

$$I = \sum_i {\mathcal{D}_\mathrm{KL}}_i - {\mathcal{D}_\mathrm{KL}}_{H_0}$$

?
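
A rough sketch of how that generalised sum might look in code, assuming the Kullback-Leibler divergences are already available as plain floats. The function names and the numbers are hypothetical and only meant to show the sign convention (no exponential):

```python
def information_ratio(d_kl_joint, *d_kl_separate):
    """I = sum_i D_KL_i - D_KL_joint, without any exponential."""
    return sum(d_kl_separate) - d_kl_joint


def log_suspiciousness(log_r, info):
    """logS = logR - I."""
    return log_r - info


# Invented divergences for two and for three separate datasets.
two = information_ratio(4.0, 3.0, 2.5)
three = information_ratio(6.0, 3.0, 2.5, 1.0)
print(two, three, log_suspiciousness(-0.5, two))
```

For two datasets this is the same as `D_A + D_B - D_AB`, so the two-dataset call signature would remain a special case.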

Collaborator

Sidenote: oooh, neat, I didn't know that Markdown can by now handle math input :)

That said, for docstrings I would go for maximal readability even without rendering, so I'd say a simple math example is enough. Leave the rest to papers, or if really necessary, write a dedicated documentation page...?

Collaborator Author

@lukashergt Thank you very much for clarifying. I have changed the relevant lines.


- ``logS``: suspiciousness

@@ -65,7 +66,7 @@ def stats(A, B, AB, nsamples=None, beta=None): # noqa: D301
-------
samples : :class:`anesthetic.samples.Samples`
DataFrame containing the following tension statistics in columns:
- ['logR', 'logI', 'logS', 'd_G', 'p']
+ ['logR', 'I', 'logS', 'd_G', 'p']
"""
columns = ['logZ', 'D_KL', 'logL_P', 'd_G']
if set(columns).issubset(A.drop_labels().columns):
@@ -89,8 +90,8 @@ def stats(A, B, AB, nsamples=None, beta=None): # noqa: D301
samples['logR'] = statsAB['logZ'] - statsA['logZ'] - statsB['logZ']
samples.set_label('logR', r'$\ln\mathcal{R}$')

- samples['logI'] = statsA['D_KL'] + statsB['D_KL'] - statsAB['D_KL']
- samples.set_label('logI', r'$\ln\mathcal{I}$')
+ samples['I'] = np.exp(statsA['D_KL'] + statsB['D_KL'] - statsAB['D_KL'])
Collaborator

Accordingly, this should be without exp, too.

Collaborator Author

@lukashergt Updated!

+ samples.set_label('I', r'$\mathcal{I}$')

samples['logS'] = statsAB['logL_P'] - statsA['logL_P'] - statsB['logL_P']
samples.set_label('logS', r'$\ln\mathcal{S}$')
16 changes: 8 additions & 8 deletions tests/test_tension.py
@@ -46,14 +46,14 @@ def test_tension_stats_compatible_gaussian():
logS_exact = d / 2 - dmu_cov_dmu_AB / 2
assert s.logS.mean() == approx(logS_exact, abs=3*s.logS.std())

- logI_exact = logV - d / 2 - slogdet(2*np.pi*(covA+covB))[1] / 2
- assert s.logI.mean() == approx(logI_exact, abs=3*s.logI.std())
+ I_exact = np.exp(logV - d / 2 - slogdet(2*np.pi*(covA+covB))[1] / 2)
+ assert s.I.mean() == approx(I_exact, abs=3*s.I.std())

- assert s.logS.mean() == approx(s.logR.mean() - s.logI.mean(),
+ assert s.logS.mean() == approx(s.logR.mean() - np.log(s.I).mean(),
Collaborator

And accordingly this should not have the np.log.

Collaborator Author

@lukashergt Updated.
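
A small stand-alone illustration of the check without `np.log`, assuming synthetic Gaussian draws in place of the nested-sampling realisations. All numbers and variable names below are invented for the sketch:

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

n = 1000
log_r = [random.gauss(-0.5, 0.1) for _ in range(n)]
info = [random.gauss(1.5, 0.1) for _ in range(n)]  # I = D_A + D_B - D_AB
log_s = [r - i for r, i in zip(log_r, info)]       # logS = logR - I

# Same shape as the test: compare means within three standard deviations.
tolerance = 3 * statistics.stdev(log_s)
difference = statistics.mean(log_s) - (
    statistics.mean(log_r) - statistics.mean(info)
)
assert abs(difference) <= tolerance
```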

abs=3*s.logS.std())

assert s.get_labels().tolist() == ([r'$\ln\mathcal{R}$',
- r'$\ln\mathcal{I}$',
+ r'$\mathcal{I}$',
r'$\ln\mathcal{S}$',
r'$d_\mathrm{G}$',
r'$p$'])
@@ -106,14 +106,14 @@ def test_tension_stats_incompatible_gaussian():
logS_exact = d / 2 - dmu_cov_dmu_AB / 2
assert s.logS.mean() == approx(logS_exact, abs=3*s.logS.std())

- logI_exact = logV - d / 2 - slogdet(2*np.pi*(covA+covB))[1] / 2
- assert s.logI.mean() == approx(logI_exact, abs=3*s.logI.std())
+ I_exact = np.exp(logV - d / 2 - slogdet(2*np.pi*(covA+covB))[1] / 2)
+ assert s.I.mean() == approx(I_exact, abs=3*s.I.std())

- assert s.logS.mean() == approx(s.logR.mean() - s.logI.mean(),
+ assert s.logS.mean() == approx(s.logR.mean() - np.log(s.I).mean(),
abs=3*s.logS.std())

assert s.get_labels().tolist() == ([r'$\ln\mathcal{R}$',
- r'$\ln\mathcal{I}$',
+ r'$\mathcal{I}$',
r'$\ln\mathcal{S}$',
r'$d_\mathrm{G}$',
r'$p$'])