---
title: "freelunch"
output: github_document
editor_options:
  chunk_output_type: console
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE,
                      collapse = TRUE,
                      comment = "#>")
```
## An R-package for estimating simulation models
This package provides a common interface to a bunch of "reference-table" estimation methods for simulation models.
Reference-table methods (i.e. run the model a bunch of times at random inputs, then work out which input parameters best match the real data) are very inefficient, but they have three big advantages:
1. They quickly produce *confidence intervals* for estimated parameters
2. They are **easily testable** by cross-validation
3. They can all be ranked using the same random samples of input-outputs from the model
## Installation
Install from github with the `devtools` package
```{r, eval=FALSE}
##Might need to uncomment this line on windows/mac
##if you don't build packages from sources:
##Sys.setenv("R_REMOTES_NO_ERRORS_FROM_WARNINGS" = "true")
devtools::install_github("carrknight/freelunch")
```
This package has a lot of dependencies, so the installation will take a while. Sorry!
## Very quick guide
### Run a model at random
Imagine a simple two-parameter simulation model (`paramone`, `paramtwo`) with two summary statistics (`ssone` and `sstwo`). Let's run it 5,000 times with random inputs:
```{r}
set.seed(0)
library(tidyverse)
paramone<-rnorm(n=5000)
paramtwo<-runif(n=5000,min=2,max=5)
ssone<-2*paramone + rnorm(n=5000)
sstwo<- paramone/paramtwo + rnorm(n=5000)
training_data <-
  data.frame(
    paramone,
    paramtwo,
    ssone,
    sstwo
  )
glimpse(training_data)
```
Let's also assume that we have some real data telling us that `ssone` is 2 and `sstwo` is 0.25.
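For bookkeeping it can help to keep those observed values in a named object (purely illustrative; as noted below, `target_runs` accepts either a vector or a data frame, and the call later in this guide simply passes `c(2, 0.25)` directly):
```{r, eval=FALSE}
# The observed ("real") summary statistics we want the model to reproduce.
# This object is just for illustration; the fit below passes the same two
# numbers directly as target_runs = c(2, 0.25).
real_observation <- c(ssone = 2, sstwo = 0.25)
```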
### Fitting a simulated model
The library `freelunch` comes with a bunch of regression/abc facades (calling other packages made by better people).
In this package the methods that estimate parameters all start with `fit_*`, and they all share the same interface requiring four arguments:
```{r, eval=FALSE}
fit_rejection_abc(training_runs = ...,
                  target_runs = ...,
                  parameter_colnames = ...,
                  summary_statistics_colnames = ...
                  )
fit_loclinear_abc(...)
fit_semiauto_abc(...)
fit_neural_network_abc(...)
fit_linear_regression(...)
fit_gam(...)
fit_quantile_random_forest(...)
fit_random_forest(...)
```
The four arguments are just the `training_runs` (a data frame of simulation runs), the observed real data `target_runs` (a vector or a data frame), the column names that refer to the parameters (`parameter_colnames`) and the column names that refer to the summary statistics (`summary_statistics_colnames`).
So let's just try to fit a classic rejection ABC (using the `abc` package); we quickly get the estimated parameters as well as bounds around them:
```{r}
library(freelunch)
estimation <-
  fit_rejection_abc(training_runs = training_data,
                    target_runs = c(2, 0.25),
                    parameter_colnames = c("paramone", "paramtwo"),
                    summary_statistics_colnames = c("ssone", "sstwo")
                    )
tidy_up_estimation(estimation) %>% knitr::kable()
```
Looking at these results, you must resist the urge to look only at the point estimates.
If you do, which is unfortunately common, you would say: `paramone` is 0.74 and `paramtwo` is 3.43.
Once you look at the confidence intervals you get a better picture: `paramone` can be recovered somewhat (it lies somewhere between -0.2 and 1.7), but the bounds for `paramtwo` span basically anything from 2 to 5, which is suspiciously similar to the original sampling bounds.
The key here is to **never trust estimates without testing**, particularly point estimates! Fortunately the package provides cross-validation helpers for testing.
## Should we trust our fit? Test!
The testing methods in this package all start with `cross_validate_*` and have the same interface:
```{r, eval=FALSE}
cross_validate_rejection_abc(total_data = ...,
                             parameter_colnames = ...,
                             summary_statistics_colnames = ...
                             )
cross_validate_loclinear_abc(...)
cross_validate_semiauto_abc(...)
cross_validate_neural_network_abc(...)
cross_validate_linear_regression(...)
cross_validate_gam(...)
cross_validate_quantile_random_forest(...)
cross_validate_random_forest(...)
```
What happens here is that we split the runs into groups (5 by default) and, in turn, pretend not to know the input parameters of one group (treating it as "real" data) to see whether the estimation method recovers them.
The arguments are `total_data`, which is just the training data, plus the same column names as in the `fit_*` calls.
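To make the splitting idea concrete, this is roughly what happens under the hood (a conceptual sketch only, not the package's actual implementation; the `cross_validate_*` functions handle it for you):
```{r, eval=FALSE}
# Conceptual sketch of 5-fold splitting: assign each run to one of 5 groups,
# then treat each group in turn as "real" data whose parameters are unknown.
n_groups <- 5
group <- sample(rep(1:n_groups, length.out = nrow(training_data)))
held_out <- training_data[group == 1, ]  # pretend we don't know these parameters
train    <- training_data[group != 1, ]  # fit using the remaining runs
```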
```{r, message=FALSE}
abc.cv <-
  cross_validate_rejection_abc(total_data = training_data,
                               parameter_colnames = c("paramone", "paramtwo"),
                               summary_statistics_colnames = c("ssone", "sstwo"),
                               # default parameters:
                               ngroup = 5, cv_seed = 0
                               )
```
If we look at performance<sup id="a1">[1](#f1)</sup> (which goes from 0, meaning our point predictions are awful, to 1, meaning they are perfect) we see that we do okay with `paramone` but the `paramtwo` predictions are simply useless:
```{r}
abc.cv$performance
```
However, coverage (the out-of-sample probability that the real parameter falls in the 95% CI) is quite accurate (the target here is 0.95):
```{r}
abc.cv$contained
```
Summaries, however, only get you so far; sometimes it's better to look at the results visually.
## Plots
Two plot commands can help us understand the quality of our estimations and cross-validation. First, we can look at the confidence intervals (black error bars): how wide they are and how often the real parameter (red dots) is contained within them.
Ideally we want very narrow confidence intervals that contain the red dots about 95% of the time.
```{r}
abc.cv %>% plot_confidence_intervals(limit_number_of_runs_to_plot_to = 120)
```
This plot shows that `paramone` tends to have quite large confidence intervals, but at least they are usually correct; for `paramtwo` the ABC is very wisely deciding NOT to make a real prediction.
We can also look narrowly at the point predictions and see how well they stack up (in the best-case scenario all the dots lie on the 45-degree line):
```{r}
abc.cv %>% plot_point_prediction_quality()
```
And if the second plot doesn't convince you that `paramtwo` is **NOT IDENTIFIED** then nothing else will.
<b id="f1">1</b> Performance is just 1 - (root mean square error of the estimates)/(root mean square error of simply predicting the average) [↩](#a1)
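In code, that score could be computed along these lines (an illustrative sketch, not necessarily the package's exact computation):
```{r, eval=FALSE}
# Sketch of the performance score: 1 - RMSE(predictions) / RMSE(always
# predicting the average of the true values).
rmse <- function(truth, prediction) sqrt(mean((truth - prediction)^2))
performance_score <- function(truth, prediction) {
  1 - rmse(truth, prediction) / rmse(truth, mean(truth))
}
```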