-
Hi @eort, this is very likely tied to issues with initialization of the t parameter. If not, could you also show the traces so we have a better picture of what is going on?
-
Can you also try the same model but with more constrained priors? E.g., for "a", cap it at 3 (5 is quite high), and for v, is there a reason you constrain it to be positive and up to 10 with a uniform prior? You could use a normal with mean 0 and sigma 2 or so. An issue with uniform priors over a large, unlikely parameter space is that you can get lots of samples that are not accepted; or in some cases the sampler may accept a very large value for one parameter and a correspondingly extreme value of another to compensate, which can lead to more divergences. (Ultimately this should still work, just with fewer effective samples - and there may be something else going on, so Alex should still chime in - but I think it would be instructive to know what happens if you change these priors.)
Finally, t is probably OK, although depending on your task / data it might actually be reasonable to allow it to go higher (e.g., up to 2); for most simple cases t is indeed much less than 1.
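For example (a sketch only - the bounds are illustrative, and the prior dictionaries follow the same include format used in the model quoted below):

include = [
    {
        # drift rate: centered on 0 with moderate spread instead of Uniform(0, 10)
        "name": "v",
        "prior": {"name": "Normal", "mu": 0.0, "sigma": 2.0},
    },
    {
        # boundary separation: capped at 3 instead of 5
        "name": "a",
        "prior": {"name": "Uniform", "lower": 0.2, "upper": 3.0},
    },
    {
        # non-decision time: up to 2 s only if long non-decision times are plausible for the task
        "name": "t",
        "prior": {"name": "Uniform", "lower": 0.01, "upper": 2.0, "initval": 0.01},
    },
]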
…On Tue, Oct 31, 2023 at 3:30 AM eort ***@***.***> wrote:
Hi @AlexanderFengler <https://github.com/AlexanderFengler>,
Thanks for your input! It does indeed seem to improve things slightly. The
log output looks like this:
2023-10-31 01:06:40,162 NUTS: [t, a, v]
2023-10-31 03:50:25,187 Sampling 5 chains for 3_000 tune and 2_000 draw iterations (15_000 + 10_000 draws total) took 9824 seconds.
2023-10-31 04:01:40,377 The rhat statistic is larger than 1.01 for some parameters. This indicates problems during sampling. See https://arxiv.org/abs/1903.08008 for details
2023-10-31 04:01:41,805 The effective sample size per chain is smaller than 100 for some parameters. A higher number is needed for reliable rhat and ess computation. See https://arxiv.org/abs/1903.08008 for details
2023-10-31 04:01:41,805 There were 8220 divergences after tuning. Increase `target_accept` or reparameterize.
with the model:
model = hssm.HSSM(data=data, z=0.5, model=model_type, loglik=loglik,
                  loglik_kind=cfg['loglik_kind'], hierarchical=True,
                  include=[
                      {
                          "name": "v",
                          "prior": {"name": "Uniform", "lower": 0.0, "upper": 10.0}
                      },
                      {
                          "name": "a",
                          "prior": {"name": "Uniform", "lower": 0.2, "upper": 5.0}
                      },
                      {
                          "name": "t",
                          "prior": {"name": "Uniform", "lower": 0.01, "upper": 1.0, "initval": 0.01}
                      }
                  ])
So, still a lot of divergences, although far fewer than in the case without initval=0.01.
The traces (a, t, v) look like so:
a-indiv_traces.pdf
<https://github.com/lnccbrown/HSSM/files/13213310/a-indiv_traces.pdf>
t-indiv_traces.pdf
<https://github.com/lnccbrown/HSSM/files/13213312/t-indiv_traces.pdf>
v-indiv_traces.pdf
<https://github.com/lnccbrown/HSSM/files/13213313/v-indiv_traces.pdf>
Is this maybe only an issue of #samples?
-
Those traces don't look too bad to me. My guess is that the divergences might relate to the hierarchical model - I'm not sure what priors are used for the variance of the group distribution, but that can sometimes cause a lot of problems if it is too large.
You are right that there is not much documentation yet on choosing priors; this is something we will try to include with some of the newer releases (the team has a bunch of updates coming!).
Do you get divergences if you just run a single subject?
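A single-subject check could look roughly like the sketch below. The participant_id column name, the data and include variables, and the pass-through of target_accept to PyMC are assumptions about the setup, not verified details:

import hssm

# keep a single participant; this assumes the grouping column is named
# "participant_id" - adjust to whatever your data uses
first = data["participant_id"].unique()[0]
single = data[data["participant_id"] == first]

model_single = hssm.HSSM(
    data=single,
    model="ddm",
    hierarchical=False,  # no group-level structure for one subject
    include=include,     # e.g. the constrained priors sketched earlier
)

# a higher target_accept often reduces divergences; this assumes sampler
# keyword arguments are forwarded to PyMC's NUTS sampler
idata_single = model_single.sample(draws=2000, tune=2000, chains=4, target_accept=0.95)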
…On Tue, Oct 31, 2023 at 11:28 AM eort ***@***.***> wrote:
Okay, using the adjusted priors improved the situation somewhat. The
diagnostics almost suggest convergence; however, there are still mostly
divergences in the traces (9568 out of 10000 draws after 15000 tuning steps
across 5 chains):
a-indiv_traces.pdf
<https://github.com/lnccbrown/HSSM/files/13218146/a-indiv_traces.pdf>
t-indiv_traces.pdf
<https://github.com/lnccbrown/HSSM/files/13218147/t-indiv_traces.pdf>
v-indiv_traces.pdf
<https://github.com/lnccbrown/HSSM/files/13218148/v-indiv_traces.pdf>
Is that something that is in the "normal" range for badly fitting models, due to my
data / not enough sampling / etc.? Or is there another, more profound issue here?
Apart from that, the priors indeed matter a lot. Perhaps I missed it, but
is there a place in the docs or elsewhere that gives advice on how to
choose priors?
-
Dear all,
I have some trouble getting reasonable fits of my models. Essentially, the chains get stuck at the starting point. At first, I thought it might be related to the specific regressions that I run, but the same happened with the standard DDM.
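A standard hierarchical DDM fit in HSSM is roughly of the following form (a sketch only; the column expectations and the sampler settings are illustrative assumptions, not the exact call used here):

import hssm

# minimal standard DDM; assumes the data frame has HSSM's default "rt" and
# "response" columns (plus "participant_id" when hierarchical=True)
model = hssm.HSSM(data=data, model="ddm", hierarchical=True)
idata = model.sample(draws=2000, tune=3000, chains=5)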
Note, I also tried approx_differentiable as loglik_kind. The log output of that model included the following lines:
Not sure whether this is normal, but I also noticed that the divergences only concerned draws taken after the initial tuning phase (0 divergences after 15000 samples, 10000 after 25000).
Do you have any idea what could be the problem here? If helpful, I can also share the data. Oh, and in case it matters, my hssm installation is based on this branch (see #265):
git+https://github.com/lnccbrown/HSSM.git@280-pin-numpy-version