unify bootstrap procedures #7

Open
urbach opened this issue Feb 2, 2015 · 8 comments

Comments

@urbach
Member

urbach commented Feb 2, 2015

I think we should unify all bootstrap methods. Currently it's a bit of a mess, and it's never clear what the bootstrap data is called. Maybe we should have $t and $t0 everywhere, like in the boot package?

@kostrzewa
Member

You certainly won't hear a complaint from me regarding this idea! In the same spirit I guess one would need to bring in a little bit more nesting, going from cf$cf0, cf$fps and cf$cf.tsboot, cf$fps.tsboot to something like cf$mean$t, cf$fps$t (and the corresponding t0)

Another thing that might be useful (although it must be optional because it uses tons of memory) would be to have access to the actual bootstrap samples of the correlation function rather than just the bootstrap of the mean (hence my rename above).

@kostrzewa
Member

I think we can close this; it has been (mostly) done in urbach/hadron.

@martin-ueding
Contributor

Another issue that I have noticed is the bootstrap parameters in functions. A lot of functions take boot.R, boot.l and seed with default values. If one forgets to set these manually and there are no bootstrap samples associated with the quantity at hand, the function will generate the samples using the defaults. This makes it hard to notice that the parameters were never set.

Perhaps it would be better to use a global option for this instead? Alternatively one could remove the default values everywhere, but that would force users to specify them even when they have already bootstrapped the values.
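A minimal sketch of the global-option idea, using base R's `options()` and `getOption()`. The option names `myanalysis.boot.R` and `myanalysis.boot.l` and the helper `get_boot_params` are hypothetical and only serve as illustration; they are not taken from hadron.

```r
# Hypothetical helper: reads the bootstrap parameters from global options
# and refuses to continue if they were never set.
get_boot_params <- function() {
  boot.R <- getOption("myanalysis.boot.R")  # NULL if the option is unset
  boot.l <- getOption("myanalysis.boot.l")
  if (is.null(boot.R) || is.null(boot.l)) {
    stop("set options(myanalysis.boot.R = ..., myanalysis.boot.l = ...) first")
  }
  list(boot.R = boot.R, boot.l = boot.l)
}

# The user sets the parameters once per analysis script.
options(myanalysis.boot.R = 400, myanalysis.boot.l = 2)
params <- get_boot_params()
```

With this approach there is no silent fallback: an unset option stops the analysis instead of quietly using a built-in default.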

@martin-ueding
Contributor

Though in R there is no problem with missing parameters until one actually uses them. So I think it would be doable to remove the default values and then if a code path with bootstrap is chosen, R would complain about the missing values.
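The behaviour described here can be seen in a small sketch. The function `analyse` and its `use.bootstrap` switch are hypothetical and stand in for any function whose bootstrap arguments have no defaults.

```r
# Hypothetical function: boot.R and boot.l have no default values.
analyse <- function(x, boot.R, boot.l, use.bootstrap = FALSE) {
  if (!use.bootstrap) {
    # boot.R and boot.l are never evaluated on this path,
    # so R does not complain that they are missing.
    return(mean(x))
  }
  # Evaluating a missing argument here raises an error that names it.
  list(mean = mean(x), boot.R = boot.R, boot.l = boot.l)
}

x <- c(1, 2, 3, 4)
plain <- analyse(x)  # works without any bootstrap parameters

# The bootstrap path fails because boot.R was not supplied.
res <- tryCatch(analyse(x, use.bootstrap = TRUE),
                error = function(e) conditionMessage(e))
```

Here `res` holds the error message, which mentions the missing argument `boot.R`.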

@kostrzewa
Member

kostrzewa commented Jan 29, 2018

I also think that in many cases the defaults hurt more than they help, as the code tries to be needlessly helpful. As for "already bootstrapped data", all the objects can tell you if they are bootstrapped or not. As a result, the logic chain should be:

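# If the data is not yet bootstrapped, the caller must supply the parameters;
# otherwise take them from the bootstrapped object itself.
if( !any(names(data) == "bootstrap.samples") ){
  if( missing(boot.R) || missing(boot.l) ){
    stop("boot.R, boot.l must be specified")
  }
} else {
  boot.R <- data$boot.R
  boot.l <- data$boot.l
}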

The generalisation to multiple data sets which are individually bootstrapped is relatively straightforward: compare boot.R and boot.l between all data sets. Unfortunately, the way it is set up now, this does not prevent the situation where the samples are actually different even though the parameters are the same.
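The parameter comparison across data sets could look roughly like this. The helper `check_boot_params` is hypothetical, and the sketch assumes each data set is a list carrying `boot.R` and `boot.l` fields as in the snippet above. As noted, it only compares parameters and cannot detect differing samples.

```r
# Hypothetical helper: stops unless all data sets share boot.R and boot.l.
check_boot_params <- function(...) {
  datasets <- list(...)
  R <- vapply(datasets, function(d) d$boot.R, numeric(1))
  l <- vapply(datasets, function(d) d$boot.l, numeric(1))
  if (length(unique(R)) != 1 || length(unique(l)) != 1) {
    stop("data sets were bootstrapped with different boot.R or boot.l")
  }
  invisible(TRUE)
}

a <- list(boot.R = 400, boot.l = 2)
b <- list(boot.R = 400, boot.l = 2)
c <- list(boot.R = 200, boot.l = 2)

ok <- check_boot_params(a, b)  # same parameters, passes
mismatch <- tryCatch(check_boot_params(a, c),
                     error = function(e) "mismatch")
```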

For this one would need to be able to relate all bootstrap samples back to the raw data that they were sampled from. If the raw data has unique indexing for each measurement, this unique indexing will carry through to the bootstrap samples, thus making it impossible that incompatible bootstrap samples are ever combined.
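One way to picture this is to resample the measurement IDs instead of the values and store the resulting ID matrix with the samples. The sketch below is an assumption about how that could be done, not existing code: `bootstrap_ids` and `compatible` are hypothetical names, and it uses plain resampling with replacement, ignoring the block length boot.l for brevity.

```r
# Hypothetical sketch: draw bootstrap samples of measurement IDs.
# Returns a matrix with one row per bootstrap sample.
bootstrap_ids <- function(ids, boot.R, seed) {
  set.seed(seed)
  t(replicate(boot.R, sample(ids, length(ids), replace = TRUE)))
}

# Two sets of samples may be combined only if they drew the same IDs.
compatible <- function(a, b) identical(a, b)

ids <- c(100, 104, 108, 112)  # e.g. configuration numbers
s1 <- bootstrap_ids(ids, boot.R = 5, seed = 1234)
s2 <- bootstrap_ids(ids, boot.R = 5, seed = 1234)
s3 <- bootstrap_ids(ids, boot.R = 5, seed = 4321)

same <- compatible(s1, s2)   # same IDs and seed, so identical draws
other <- compatible(s1, s3)  # different seed, almost certainly different draws
```

Comparing the stored ID matrices catches the case where boot.R and boot.l agree but the underlying draws do not.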

Unfortunately, all raw data is presently not endowed with unique IDs... I do believe that this should be done rather soon. It will break many existing analysis codes and data file formats which don't specify a measurement index (configuration number), but the gains in safety warrant breaking backwards-compatibility...

@urbach
Member Author

urbach commented Jan 30, 2018 via email

@martin-ueding
Contributor

Actually I think that there should only be bootstrap.cf having bootstrap parameters and that is it. All functions that work on correlation functions (effective mass, fit, gevp) just take the correlation function and work with it. If it has bootstrap samples, they are used; if not, not.

@kostrzewa
Member

> Actually I think that there should only be bootstrap.cf having bootstrap parameters and that is it. All functions that work on correlation functions (effective mass, fit, gevp) just take the correlation function and work with it. If it has bootstrap samples, they are used; if not, not.

I agree, it's certainly a cleaner solution. However, there is no "if not, not" since most functionality will not work without bootstrap samples.
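A rough sketch of that design: only the bootstrap step takes parameters, and downstream functions refuse to run on data that has not been bootstrapped. The functions `bootstrap_cf_sketch` and `effmass_sketch` and the field names are hypothetical stand-ins, not hadron's actual functions, and the resampling ignores boot.l for brevity.

```r
# Hypothetical stand-in for the single function that takes bootstrap parameters.
bootstrap_cf_sketch <- function(cf, boot.R, seed) {
  set.seed(seed)
  n <- nrow(cf$data)
  # One row per bootstrap sample: column means of the resampled measurements.
  cf$bootstrap.samples <- t(replicate(
    boot.R,
    colMeans(cf$data[sample.int(n, n, replace = TRUE), , drop = FALSE])
  ))
  cf$boot.R <- boot.R
  cf
}

# Hypothetical downstream function: takes no bootstrap parameters at all.
effmass_sketch <- function(cf) {
  if (is.null(cf$bootstrap.samples)) stop("cf must be bootstrapped first")
  m <- cf$bootstrap.samples
  # Log ratio of neighbouring time slices for every bootstrap sample.
  log(m[, -ncol(m), drop = FALSE] / m[, -1, drop = FALSE])
}

# Toy data: 10 measurements of exp(-0.5 * t) on 4 time slices.
cf <- list(data = matrix(exp(-0.5 * rep(0:3, each = 10)), nrow = 10))

failed <- tryCatch(effmass_sketch(cf), error = function(e) "not bootstrapped")
em <- effmass_sketch(bootstrap_cf_sketch(cf, boot.R = 5, seed = 1234))
```

In this toy case every entry of `em` is 0.5 up to rounding, since the data decays exactly as exp(-0.5 * t).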


3 participants