[Bug]: Different behaviour when running one_simulation(), compared to gempyor-simulate #373

saraloo · 2024-10-29T21:30:21Z

Label

gempyor

Priority Label

low priority

Describe the bug/issue

I have a config with a long chunk of outcomes. Not sure what is going on under the hood here but I get an error when running the config from a python notebook using one_simulation() but do not get any error when running gempyor-simulate from CLI.

To Reproduce

Config is https://github.com/HopkinsIDD/RSV_USA/blob/main/config_rsvnet_2024_vaccination_emcee_A.yml

When running the following command in python

gempyor_simulator = gempyor.GempyorInference(
    config_filepath="config_rsvnet_2024_vaccination_emcee_A.yml",
    run_id="test_run_id",
    prefix="test_prefix/",
    stoch_traj_flag=False,
    autowrite_seir=True,
)
gempyor_simulator.one_simulation(sim_id2write=0)

I get the following error

ValueError: ERROR with outcome incidH_age65to100: the specified source incidH_age65to100_v2 is not a dictionnary (for seir outcome) nor an existing pre-identified outcomes.

But instead, running the following works with no errors

gempyor-simulate -c config_rsvnet_2024_vaccination_emcee_A.yml -n 1

Environment, if relevant

No response

The text was updated successfully, but these errors were encountered:

jcblemai · 2024-12-02T18:45:45Z

The link to the config does not work

saraloo · 2024-12-02T19:30:22Z

updated, sorry

jcblemai · 2024-12-02T19:41:27Z

That's interesting because the path of outcomes is roughly the same in the two cases:

parameters = read_parameters_from_config(modinf)
[NPI, loaded values, not relevant here]

and finally

        outcomes_df, hpar = compute_all_multioutcomes(
            modinf=modinf,
            sim_id2write=sim_id2write,
            parameters=parameters,
            loaded_values=loaded_values,
            npi=npi_outcomes,
        )

Though in the one_simulation case, the dictionary that is called parameters is not ordered in the same order as the config:

>>>  print(gempyor_simulator.outcomes_parameters.keys())
dict_keys(['incidH_age65to100', 'incidH_age50to64', 'incidH_age18to49', 'incidH_age5to17', 'incidH_age1to4', 'incidH_age0to11m', 'incidH_age65to100_v2', 'incidH_age50to64_v2', 'incidH_age18to49_v2', 'incidH_age5to17_v2', 'incidH_age1to4_v2', 'incidH_age0to11m_v2', 'incidH_0prior_age0to2m', 'incidH_1prior_age0to2m', 'incidH_2prior_age0to2m', 'incidH_0prior_age3to5m', 'incidH_1prior_age3to5m', 'incidH_2prior_age3to5m', 'incidH_0prior_age6to11m', 'incidH_1prior_age6to11m', 'incidH_2prior_age6to11m', 'incidH_0prior_age1to2', 'incidH_1prior_age1to2', 'incidH_2prior_age1to2', 'incidH_0prior_age3to4', 'incidH_1prior_age3to4', 'incidH_2prior_age3to4', 'incidH_0prior_age5to17', 'incidH_1prior_age5to17', 'incidH_2prior_age5to17', 'incidH_0prior_age18to49', 'incidH_1prior_age18to49', 'incidH_2prior_age18to49', 'incidH_0prior_age50to64', 'incidH_1prior_age50to64', 'incidH_2prior_age50to64', 'incidH_0prior_age65to100_none', 'incidH_0prior_age65to100_vaccsenior', 'incidH_0prior_age65to100_waned', 'incidH_1prior_age65to100_none', 'incidH_1prior_age65to100_vaccsenior', 'incidH_1prior_age65to100_waned', 'incidH_2prior_age65to100_none', 'incidH_2prior_age65to100_vaccsenior', 'incidH_2prior_age65to100_waned', 'incidH_age0to11m_v1', 'incidH_age1to4_v1', 'incidH_age5to17_v1', 'incidH_age18to49_v1', 'incidH_age50to64_v1', 'incidH_age65to100_v1', 'incidH'])

which cause this error. I have a zoom and will dig after what is causing this.

saraloo · 2024-12-02T21:50:41Z

Oh, weird. Thanks for looking in to it. I had made sure to write it in order top to bottom of the config chunk. I'll try and dig into this too now that it's clearer where the error is.

jcblemai · 2024-12-03T06:06:06Z

No no this is clearly a gempyor issue. what happens is

So both endpoints run the same, the only difference is the context in which it is run. Going through one_simulation, the parameter reading:

outcomes_parameters = outcomes.read_parameters_from_config(modinf)

is done two times instead of one:

once to generate the static arguments for EMCEE
a second time to build the model for classical inference

(there should just be one call but that's another issue).

Adding a print to these parameter keys (it's a dictionary) shows us that the first call returns the outcomes in the right order while the second has them shuffled.

I strongly believe this is caused by

def read_parameters_from_config(modinf: model_info.ModelInfo):
        |...|
        outcomes_config = modinf.outcomes_config["outcomes"]
        |...|
                if outcomes_config[new_comp]["delay"].exists():
                    |...|
                else:
                    logging.critical(f"No delay for outcome {new_comp}, using a 0 delay")
                    outcomes_config[new_comp]["delay"] = {"value": 0}
                    |...|

This line outcomes_config[new_comp]["delay"] = {"value": 0} modifies the underlying confuse object (everything here is passed by reference)... I think it causes a change of order and places the modified outcomes at the beginning of the object instead of modifying it in place. Very annoying. Indeed the outcomes affected are ones without delay so the delay is put to 0:

>>> gempyor_simulator.one_simulation(sim_id2write=0)
CRITICAL:root:No delay for outcome incidH_age0to11m_v2, using a 0 delay
CRITICAL:root:No delay for outcome incidH_age1to4_v2, using a 0 delay
CRITICAL:root:No delay for outcome incidH_age5to17_v2, using a 0 delay
CRITICAL:root:No delay for outcome incidH_age18to49_v2, using a 0 delay
CRITICAL:root:No delay for outcome incidH_age50to64_v2, using a 0 delay
CRITICAL:root:No delay for outcome incidH_age65to100_v2, using a 0 delay
CRITICAL:root:No delay for outcome incidH_age0to11m, using a 0 delay
CRITICAL:root:No delay for outcome incidH_age1to4, using a 0 delay
CRITICAL:root:No delay for outcome incidH_age5to17, using a 0 delay
CRITICAL:root:No delay for outcome incidH_age18to49, using a 0 delay
CRITICAL:root:No delay for outcome incidH_age50to64, using a 0 delay
CRITICAL:root:No delay for outcome incidH_age65to100, using a 0 delay

This is a big bug -- I guess we just encountered it now because we never ran two times these commands (within emcee, or within classical inference, the result is cached). So to fix this, you could use emcee_batch as it is and add a delay: 0 to these outcomes, or I've made a quick fix in the branch emcee_batch_fix, checked that it would work in your case. I'll let @pearsonca @TimothyWillard decide what to do about it. It's not a great fix, see #408.

thanks for including very easy commands to reproduce.

Move the call to `get_static_arguments` under the 'emcee' inference branch. This avoids calling `outcomes.read_parameters_from_config` twice. That function has a side effect of editing the outcome modifiers config. See #373 (comment) for details.

saraloo · 2024-12-05T18:46:15Z

Thanks joseph - this is a super clear explanation and really helpful to understand the code. The quick fix worked fine for me.

saraloo added gempyor Concerns the Python core. low priority Low priority. labels Oct 29, 2024

jcblemai added a commit that referenced this issue Dec 3, 2024

fix #373 dirtily

b450f2d

jcblemai mentioned this issue Dec 3, 2024

Emcee batch fix #408

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug]: Different behaviour when running one_simulation(), compared to gempyor-simulate #373

[Bug]: Different behaviour when running one_simulation(), compared to gempyor-simulate #373

saraloo commented Oct 29, 2024 •

edited

Loading

jcblemai commented Dec 2, 2024

saraloo commented Dec 2, 2024

jcblemai commented Dec 2, 2024

saraloo commented Dec 2, 2024

jcblemai commented Dec 3, 2024

saraloo commented Dec 5, 2024

[Bug]: Different behaviour when running one_simulation(), compared to gempyor-simulate #373

[Bug]: Different behaviour when running one_simulation(), compared to gempyor-simulate #373

Comments

saraloo commented Oct 29, 2024 • edited Loading

Label

Priority Label

Describe the bug/issue

To Reproduce

Environment, if relevant

jcblemai commented Dec 2, 2024

saraloo commented Dec 2, 2024

jcblemai commented Dec 2, 2024

saraloo commented Dec 2, 2024

jcblemai commented Dec 3, 2024

saraloo commented Dec 5, 2024

saraloo commented Oct 29, 2024 •

edited

Loading