---
title: "More on shrinkage plots"
jupyter: julia-1.8
---
- I have stated that the likelihood criterion used to fit linear mixed-effects models can be considered as balancing fidelity to the data (i.e. how well the model fits the observed data) against model complexity.
- This is similar to some of the criteria used in Machine Learning (ML), except that the criterion for LMMs has a rigorous mathematical basis.
- In a shrinkage plot we compare the values of the random-effects coefficients from the fitted model to those from a model in which there is no penalty for model complexity.
- If there is strong subject-to-subject variation then the penalized values of the random effects will be similar to the unpenalized ones.
- If a random-effects term is not contributing much (i.e. it is "inert") then its random effects will be *shrunk* considerably towards zero in some directions; a simple one-dimensional version of this shrinkage is written out below.
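For intuition, consider the simplest case of a one-way random-intercept model with the variance parameters treated as known (a standard result, not derived in the original notes). The conditional mean of the random effect for group $i$ is its unpenalized estimate, the deviation of the group mean from the overall mean, multiplied by a factor that shrinks towards zero as the between-group variance $\sigma_b^2$ becomes small relative to the residual variance $\sigma^2$:

$$
\tilde{b}_i = \frac{n_i\,\sigma_b^2}{n_i\,\sigma_b^2 + \sigma^2}\,\left(\bar{y}_i - \hat{\mu}\right),
$$

where $n_i$ is the number of observations in group $i$ and $\bar{y}_i$ is the mean response for that group.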
```{julia}
#| code-fold: true
using CairoMakie
using DataFrames
using LinearAlgebra
using MixedModels
using MixedModelsMakie
using Random
using ProgressMeter
ProgressMeter.ijulia_behavior(:clear);
```
Load the kb07 data set (don't tell Reinhold that I used these data).
```{julia}
kb07 = MixedModels.dataset(:kb07)
```
```{julia}
contrasts = Dict(
  :subj => Grouping(),
  :item => Grouping(),
  :spkr => HelmertCoding(),
  :prec => HelmertCoding(),
  :load => HelmertCoding(),
)
m1 = let
  form = @formula(
    rt_trunc ~
      1 +
      spkr * prec * load +
      (1 + spkr + prec + load | subj) +
      (1 + spkr + prec + load | item)
  )
  fit(MixedModel, form, kb07; contrasts)
end
```
```{julia}
VarCorr(m1)
```
The `issingular` check confirms that at least one of the estimated covariance matrices is singular (rank-deficient).
```{julia}
issingular(m1)
```
```{julia}
print(m1)
```
## Expressing the covariance of random effects
Earlier today we mentioned that the parameters being optimized are the elements of a "matrix square root" of the covariance matrix of the random effects (relative to the residual variance).
There is one such lower-triangular matrix for each grouping factor.
```{julia}
l1 = first(m1.λ) # Cholesky factor of relative covariance for subj
```
Notice the zero on the diagonal. A triangular matrix with a zero on the diagonal is singular.
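Because the determinant of a triangular matrix is the product of its diagonal elements, this zero makes the determinant (numerically) zero as well; a quick confirmation, not in the original script:
```{julia}
det(l1)  # determinant of the lower-triangular factor for subj
```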
```{julia}
l2 = last(m1.λ) # this one is not singular
```
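Both factors are assembled from the parameter vector that the optimizer actually manipulates, which can be inspected directly (a quick look, not part of the original script):
```{julia}
m1.θ  # optimized parameters: the free elements of these relative covariance factors
```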
To regenerate the covariance matrix we need to know that the covariance is not the *square* of `l1`; it is `l1 * l1'` (so that the result is symmetric), scaled by the estimated residual variance, σ̂².
```{julia}
Σ₁ = varest(m1) .* (l1 * l1')
```
```{julia}
diag(Σ₁) # compare to the variance column in the VarCorr output
```
```{julia}
sqrt.(diag(Σ₁))
```
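The correlations shown in the `VarCorr` output can be recovered in the same way, by scaling the covariance matrix with the inverse of its standard deviations (a small check, not in the original):
```{julia}
d = Diagonal(1 ./ sqrt.(diag(Σ₁)))  # inverse standard deviations
d * Σ₁ * d                          # correlation matrix for the subj random effects
```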
## Shrinkage plots
```{julia}
#| label: fig-m1shrinkage
#| fig-cap: Shrinkage plot of model m1
#| code-fold: true
shrinkageplot(m1)
```
The upper left panel shows the perfect negative correlation between those two components of the random effects, which reflects the singularity of the estimated covariance matrix.
The corresponding plot for the `item` random effects:
```{julia}
shrinkageplot(m1, :item)
```
The fixed-effects model matrix consists of the intercept column and ±1 Helmert codes; its crossproduct, computed next, summarizes how balanced the design is across the experimental conditions.
```{julia}
X1 = Int.(m1.X')  # transposed fixed-effects model matrix, converted to integers
```
```{julia}
X1 * X1'  # crossproduct of the columns of the fixed-effects model matrix
```
## How to interpret a shrinkage plot
- Extreme shrinkage (shrunk to a line or to a point) is easy to interpret: the term is not providing benefit and can be removed.
- When the range of the blue dots (shrunk values) is comparable to that of the red dots (unshrunk values), the term after shrinkage is about as strong as it is without shrinkage.
- By itself this doesn't mean the term is important. You also need a feeling for the absolute magnitude of the random effects, in addition to their relative magnitude (a rough numerical check follows this list).
- Small magnitude together with small relative magnitude indicates that the term can be dropped.
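One rough way to get that feeling is to express each random-effect standard deviation relative to the residual standard deviation (a sketch, not part of the original analysis):
```{julia}
#| code-fold: true
# ratio of each random-effect standard deviation to the residual standard
# deviation; components with very small ratios are candidates for removal
let σ = m1.σ
  map(nt -> map(s -> s / σ, nt), m1.σs)
end
```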
## Conclusions from these plots
- Only the intercept for `subj` appears to be contributing explanatory power.
- For `item` both the intercept and the effect of `prec` appear to be contributing.
```{julia}
m2 = let
  form = @formula(
    rt_trunc ~
      1 + prec * spkr * load + (1 | subj) + (1 + prec | item)
  )
  fit(MixedModel, form, kb07; contrasts)
end
```
```{julia}
VarCorr(m2)
```
```{julia}
#| fig-cap: Shrinkage plot of model m2
#| label: fig-m2shrinkage
#| code-fold: true
shrinkageplot(m2)
```
```{julia}
#| lst-label: m3def
m3 = let
  form = @formula(
    rt_trunc ~
      1 + prec + spkr + load + (1 | subj) + (1 + prec | item)
  )
  fit(MixedModel, form, kb07; contrasts)
end
```
```{julia}
VarCorr(m3)
```
To assess the precision of the parameter estimates in the reduced model, draw a parametric bootstrap sample, seeding the random-number generator for reproducibility.
```{julia}
rng = Random.seed!(1234321);  # fixed seed for a reproducible bootstrap
```
```{julia}
m3btstrp = parametricbootstrap(rng, 2000, m3);
```
```{julia}
DataFrame(shortestcovint(m3btstrp))
```
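The coverage level of these intervals can be changed; this sketch assumes the optional second (level) argument of `shortestcovint`, which defaults to 0.95:
```{julia}
#| eval: false
DataFrame(shortestcovint(m3btstrp, 0.8))  # 80% shortest coverage intervals
```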
```{julia}
#| label: fig-ridgeplot
#| fig-cap: 'Ridge plot of the fixed-effects coefficients from the bootstrap sample'
ridgeplot(m3btstrp)
```
```{julia}
#| label: fig-ridgeplotnoint
#| fig-cap: 'Ridge plot of the fixed-effects coefficients from the bootstrap sample (without the intercept)'
#| eval: false
ridgeplot(m3btstrp; show_intercept=false)
```
Model `m4` specifies the same fixed and random effects as `m3`, with the two random-effects terms written in the opposite order.
```{julia}
m4 = let
  form = @formula(
    rt_trunc ~
      1 + prec + spkr + load + (1 + prec | item) + (1 | subj)
  )
  fit(MixedModel, form, kb07; contrasts)
end
```
```{julia}
m4bstrp = parametricbootstrap(rng, 2000, m4);
```
```{julia}
ridgeplot(m4bstrp; show_intercept=false)
```
```{julia}
DataFrame(shortestcovint(m4bstrp))
```
```{julia}
VarCorr(m4)
```
```{julia}
#| code-fold: true
let mods = [m1, m2, m4]
  DataFrame(;
    geomdof=(sum ∘ leverage).(mods),  # sum of the leverage values: "geometric" degrees of freedom
    npar=dof.(mods),                  # number of parameters in each model
    deviance=deviance.(mods),
    AIC=aic.(mods),
    BIC=bic.(mods),
    AICc=aicc.(mods),
  )
end
```
```{julia}
#| label: fig-scatterm4
#| fig-cap: Residuals versus fitted values for model m4
scatter(fitted(m4), residuals(m4))
```