Huber's weight parameter for Rodionov function #5

Open
jatalah opened this issue Oct 5, 2022 · 6 comments · May be fixed by #24

Comments

@jatalah

jatalah commented Oct 5, 2022

Hi @alexhroom,

Could you implement Huber's weight parameter in the Rodionov function as in Rodionov (2006)?
That version handles outliers using Huber's weight function (Huber, 1964).
The implementation is described in https://www.beringclimate.noaa.gov/regimes/help3.html.

Many thanks,
Huber, P. J. (1964), Robust estimation of a location parameter, Annals of Mathematical Statistics, 35, 73-101
Rodionov, S. N. (2006). The problem of red noise in climate regime shift detection. Geophysical Research Letters, 31, L12707.
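
For readers following the thread, here is a minimal Python sketch of the standard Huber weight and a weighted mean built on it. It is not taken from rshift or from the Excel macro: the names `huber_weight` and `huber_weighted_mean`, the fixed `scale` argument, and the two-pass re-weighting are assumptions made for illustration, and how the weights plug into the regime shift test itself is not shown.

```python
def huber_weight(value, center, scale, h=1.0):
    """Standard Huber weight: 1 within h scale units of center, h/|z| beyond."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    z = abs(value - center) / scale  # normalized distance from the center
    return 1.0 if z <= h else h / z


def huber_weighted_mean(values, scale, h=1.0, iterations=2):
    """Re-weighted mean; the iteration count is an assumption, not a documented setting."""
    center = sum(values) / len(values)  # start from the plain arithmetic mean
    for _ in range(iterations):
        weights = [huber_weight(v, center, scale, h) for v in values]
        center = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return center


if __name__ == "__main__":
    data = [1.0, 1.0, 1.0, 1.0, 10.0]  # one obvious outlier
    print(sum(data) / len(data))                   # plain mean
    print(huber_weighted_mean(data, 1.0, h=1.0))   # outlier down-weighted
    print(huber_weighted_mean(data, 1.0, h=1e9))   # very large h: all weights are 1
```

With a very large `h`, every weight equals 1 and the result reduces to the arithmetic mean, so in this sketch a large Huber parameter switches the outlier handling off rather than on.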

@tjgrabs

tjgrabs commented Nov 7, 2022

First, I'd like to say thanks a LOT for bringing Rodionov's method to R, @alexhroom! Without your work it would still be on the brink of oblivion, since the Excel macro was unmaintained. I also agree with @jatalah that it would be great if Huber's weight function could be added. Without it, Rodionov's method is very sensitive to outliers, which are common in real-world data.

Many thanks

@alexhroom
Owner

@jatalah @tjgrabs sorry for the lack of progress on this so far. Most explanations of how Huber's weight is used are less than clear, and the work is being held up while I look for a better, more mathematical explanation in the literature (e.g. Rodionov's description says that outliers are handled using Huber's weight, but doesn't explain where this fits into his original 2004 algorithm). Furthermore, as I do not have Microsoft Office, my access to his Excel macro is limited. If anyone can provide a paper describing the algorithm with Huber's weight added, it would be a much faster task; otherwise it might take a while, as I need to find the time to better digest the use of the parameter.

@tjgrabs

tjgrabs commented Nov 11, 2022

Dear @alexhroom,
I just discovered that Sergej has a website with a more detailed description of the different algorithms: https://sites.google.com/view/regime-shift-test/home. The site also links to different implementations, including the rshift package and a new, updated Excel version. I attached code extracted from the Excel version, which you might also use to see how the Huber weighting was implemented.

ExcalMacroModule_Code_BaseShiftInMean.txt

Apart from this, I tested rshift on a data set and noticed two things:

1. The results differ from Rodionov's Excel implementation, even when I use a large Huber parameter (which I think should result in ignoring outliers).
2. Two breakpoints appeared in adjacent years, which is not behavior I would expect. The results from the Excel plug-in show only a single breakpoint, in 1973, and none in 2000 (see attached image).

double_breakpoint

@alexhroom
Owner

alexhroom commented Nov 11, 2022

@tjgrabs thank you for this link. The difference in results is a known discrepancy: some minutiae in the original Rodionov paper leave room for interpretation, which can cause different implementations to have slightly different sensitivities. It should only affect the sensitivity of the algorithm, though, and shouldn't be hypothesis-affecting for research. I'm rewriting my implementation to be faster and more modifiable, which may fix this. Can you please share the data set you tested on?

EDIT: This issue should be fixed with rshift v2.20, which uses a Rust implementation that as far as I can tell gives results equivalent to Rodionov's Excel version.

alexhroom linked a pull request on May 16, 2024 that will close this issue

@adamkemberling

adamkemberling commented Dec 4, 2024

There is a GitHub repo for the Stirnimann 2019 paper that has code for the STARS regime shift test, weighted averages using Huber's weight parameter, and the various prewhitening routines. The last commit was 5 years ago, but the foundational pieces are there.

@alexhroom
Owner

@adamkemberling I'd been working on the Huber's weight support in issue #24. The main snag I ran into was being unable to find good test data to check whether I'd done it correctly, and then I didn't have any time to work on it for a while. I don't have access to Microsoft Excel, so I can't get results from the Excel macro for comparison!
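
One possible stopgap while no Excel reference output is available is a synthetic series with a known shift and a known outlier. The Python sketch below is only a sanity-check fixture, not a substitute for comparing against the Excel macro; the function name `synthetic_shift_series` and all of its default parameters are hypothetical choices made for illustration.

```python
import random


def synthetic_shift_series(n=60, shift_at=30, shift_size=2.0,
                           noise_sd=0.5, outlier_at=10, outlier_size=6.0,
                           seed=0):
    """Gaussian noise with one step change in the mean and one injected outlier."""
    rng = random.Random(seed)  # seeded so the fixture is reproducible
    series = []
    for i in range(n):
        level = shift_size if i >= shift_at else 0.0  # mean steps up at shift_at
        series.append(level + rng.gauss(0.0, noise_sd))
    series[outlier_at] += outlier_size  # single outlier inside the first regime
    return series


if __name__ == "__main__":
    data = synthetic_shift_series()
    print(len(data), data[10])
```

A regime shift implementation run on such a series would be expected to report a shift near `shift_at`, with the Huber weighting keeping the single outlier at `outlier_at` from being reported as a regime of its own.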
