Add k-band and iid energy score estimator #25

simon-hirsch · 2024-05-26T06:41:22Z

If you have a lot of samples/scenarios/ensembles but little time (or little computational power):

Equations 15 and 16 in Berk & Ziel, 2019

sallen12 · 2024-05-29T10:18:04Z

The iid energy score estimator has been used in a few studies and would be a nice, straightforward addition to the package. I would avoid including the k-band estimator for now: there are some errors in Equation 16 of Berk & Ziel, 2019, and this approximation isn't used elsewhere. The fraction in Equation 15 should also be 2/M rather than M/2 (see e.g. Moller et al., 2013), which is important to remember when we implement it.

simon-hirsch · 2024-05-29T11:09:14Z

All right @sallen12.
- We've used the $k$-band estimator, e.g. here and here.
- The fraction for the $k$-band should read 1 / (M * K). There is even a typo in our paper, where a "-1" sneaked in.

I agree, though, that it's not the most used method to approximate the ES.
I'll add the IID estimator only then.

sallen12 · 2024-05-29T11:22:46Z

Thanks for the link to the paper. It's probably just a notational misunderstanding then, but in your Eq 26, if K < M, then for any m > K, the lower bound on the second summation, k = m, will be higher than the upper bound, K. In this case, is it assumed that the summation is zero?

simon-hirsch · 2024-05-29T12:01:05Z

Hi, what we did is essentially take the differences $\hat{y}^{[m]} - \hat{y}^{[m+k]}$ for $k = 1, \dots, K$, where $\hat{y}^{[m]}$ is the $m$-th ensemble member. I.e. for $K=1$ you just have $$\frac{1}{M} \sum_{m=1}^{M} \left| \hat{y}^{[m]} - \hat{y}^{[m+1]} \right|,$$ for $K=2$ you get $$\frac{1}{2M} \left( \sum_{m=1}^{M} \left| \hat{y}^{[m]} - \hat{y}^{[m+1]} \right| + \sum_{m=1}^{M} \left| \hat{y}^{[m]} - \hat{y}^{[m+2]} \right| \right),$$ and so on. Does that make sense to you?
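
For illustration, here is a minimal NumPy sketch of the term described above. It is not the package's implementation, and the names `kband_spread` and `energy_score_kband` are hypothetical. Two assumptions go beyond what is written in this thread: the index $m+k$ wraps around cyclically once it exceeds $M$, and the spread term enters the energy score with the usual factor of 1/2 from the definition $\mathrm{ES} = E\|X - y\| - \tfrac{1}{2} E\|X - X'\|$.

```python
import numpy as np


def kband_spread(ens, K):
    """Average distance between members m and m+k for k = 1..K.

    ens has shape (M, D): M ensemble members of dimension D.
    Assumption: the index m+k wraps around cyclically (modulo M).
    """
    ens = np.asarray(ens, dtype=float)
    if ens.ndim == 1:
        ens = ens[:, None]  # treat a 1-D input as M univariate members
    M = ens.shape[0]
    if not 1 <= K < M:
        raise ValueError("K must satisfy 1 <= K < M")
    total = 0.0
    for k in range(1, K + 1):
        # Row m of `shifted` holds member (m + k) mod M.
        shifted = np.roll(ens, -k, axis=0)
        total += np.linalg.norm(ens - shifted, axis=1).sum()
    # Normalisation 1 / (M * K), as stated earlier in the thread.
    return total / (M * K)


def energy_score_kband(obs, ens, K):
    """Hypothetical k-band energy score: accuracy term minus half the spread."""
    ens = np.asarray(ens, dtype=float)
    if ens.ndim == 1:
        ens = ens[:, None]
    obs = np.atleast_1d(np.asarray(obs, dtype=float))
    accuracy = np.linalg.norm(ens - obs, axis=1).mean()
    return accuracy - 0.5 * kband_spread(ens, K)
```

Under the cyclic-indexing assumption, choosing $K = M - 1$ visits every ordered pair of distinct members exactly once, so the spread term then coincides with the full pairwise average over $m \neq m'$.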

sallen12 · 2024-05-29T12:58:49Z

Thanks for clarifying. So, just to check I've understood, the second summation in Eq 26 starts from $k = 1$ rather than $k=m$? In this case everything makes sense to me, and I agree it could be a useful estimator to include in the package 👍

As an aside, for the implementation, it would be useful to randomise the ensemble members before applying this formula (also for the iid formula). Otherwise, if there is some sort of ordering among the ensemble members, e.g. because of how the ensembles are sampled from the predictive distribution, then only taking differences between nearby members (as is the case when $K=1$) will generally underestimate the underlying expectation.
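
The ordering effect can be checked with a small experiment. The sketch below is only an illustration under stated assumptions: a univariate standard normal ensemble that happens to be sorted, the $K=1$ band with cyclic indexing, and a hypothetical helper `adjacent_spread`. It compares the spread term before and after a single seeded shuffle.

```python
import numpy as np


def adjacent_spread(ens):
    """K = 1 band: mean |x_m - x_{m+1}|, with cyclic indexing (an assumption)."""
    return np.abs(ens - np.roll(ens, -1)).mean()


rng = np.random.default_rng(42)  # seeded so the shuffle is reproducible

# An ensemble that is ordered, e.g. because of how it was sampled.
ens_sorted = np.sort(rng.normal(size=1000))

# Shuffle the members once before applying the estimator.
ens_shuffled = rng.permutation(ens_sorted)

spread_sorted = adjacent_spread(ens_sorted)
spread_shuffled = adjacent_spread(ens_shuffled)
print(f"sorted: {spread_sorted:.4f}, shuffled: {spread_shuffled:.4f}")
```

For the sorted ensemble the adjacent differences only add up to roughly twice the sample range divided by $M$, which is far below the expected distance between two independent draws, whereas the shuffled ensemble gives a value close to that expectation.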

simon-hirsch · 2024-05-29T13:14:31Z

Yeah, or it ends at $M+K$ if you want to start at $k=m$. Starting at 1 is less confusing though.

Essentially shuffle the ensembles once? Not sure how this will work with the gufuncs, but generally agree that it makes sense. Will check and also put a seed in the estimator 👍

sallen12 · 2024-05-29T13:22:36Z

Great, all clear now, thanks a lot!

simon-hirsch mentioned this issue May 31, 2024

WIP: Energy score estimators #29

Open
