[Feature request]: Method For Directly Sampling From The Posterior Predictive From Output #416

Open
TimothyWillard opened this issue Dec 9, 2024 · 0 comments
Labels
enhancement Request for improvement or addition of new feature(s). gempyor Concerns the Python core. medium priority Medium priority. post-processing Concern the post-processing.

Comments

@TimothyWillard
Contributor

Label

enhancement, gempyor, post-processing

Priority Label

medium priority

Is your feature request related to a problem? Please describe.

This issue was originally reported by @MacdonaldJoshuaCaleb in GH-413.

For more complicated or atypical post-processing, it is helpful to sample directly from the posterior predictive distribution. This makes it possible to write post-processing that takes the fitted parameters as input and produces a distribution of some derived quantity.
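To make the request concrete, here is a minimal, standard-library-only sketch of the idea: pick parameter vectors from a set of posterior draws, run a model on each, and collect a derived quantity. The `simulate` model, the made-up draws, and the names `posterior_predictive` and `derived` are all hypothetical and are not part of gempyor. Because the toy model is deterministic, the result is the distribution of the model output rather than of noisy observations.

```python
import random
import statistics


def simulate(params, n_days=30):
    """Hypothetical deterministic model: geometric growth from an initial count."""
    growth_rate, initial = params
    return [initial * (1.0 + growth_rate) ** t for t in range(n_days)]


def posterior_predictive(draws, derived, n_samples, seed=0):
    """Sample parameter vectors from `draws`, simulate, and apply `derived`."""
    rng = random.Random(seed)
    picked = [rng.choice(draws) for _ in range(n_samples)]
    return [derived(simulate(p)) for p in picked]


# Made-up posterior draws of (growth_rate, initial), for illustration only.
rng = random.Random(1)
draws = [(rng.gauss(0.05, 0.01), rng.gauss(100.0, 5.0)) for _ in range(500)]

# Derived quantity: the peak value of each simulated trajectory.
peak = posterior_predictive(draws, derived=max, n_samples=200)
summary = statistics.quantiles(peak, n=20)  # 19 cut points: 5%, 10%, ..., 95%
```

In a real workflow, `simulate` would be replaced by a call into the fitted model and `draws` by samples taken from the saved chains.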

Is your feature request related to a new application, scenario round, pathogen? Please describe.

No response

Describe the solution you'd like

Here's the start of such a function, which was also developed for the flu scenarios. To really generalize it, we would want to index by the parameter labels rather than just by their ordering. The chains can be read from an h5 file like so (arviz is another Python package that can read h5 files):

import emcee

sampler = emcee.backends.HDFBackend(filename, read_only=True)
chains = sampler.get_chain()  # shape: (n_steps, n_walkers, n_params)
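Once the chain is loaded, one common way to obtain posterior draws is to discard an initial burn-in and pick random (step, walker) pairs. The sketch below assumes the `(n_steps, n_walkers, n_params)` layout and uses nested lists so it runs without any dependencies; an array returned by `get_chain()` can be indexed the same way. The helper name `draw_from_chain` and the `burn_in` value are hypothetical.

```python
import random


def draw_from_chain(chains, n_samples, burn_in=0, seed=None):
    """Draw whole parameter vectors from chains[step][walker] after burn-in."""
    rng = random.Random(seed)
    kept = chains[burn_in:]
    if len(kept) == 0:
        raise ValueError("burn_in discards every step of the chain")
    draws = []
    for _ in range(n_samples):
        step = rng.randrange(len(kept))
        walker = rng.randrange(len(kept[step]))
        # Copy the full vector so the parameters stay jointly consistent.
        draws.append(list(kept[step][walker]))
    return draws


# Toy chain: 4 steps, 3 walkers, 2 parameters; the values encode the step.
toy = [[[float(s), float(s) + 0.5] for _ in range(3)] for s in range(4)]
draws = draw_from_chain(toy, n_samples=5, burn_in=2, seed=0)
```

Each returned draw is a complete parameter vector from a single walker at a single step, so correlations between parameters in the posterior are preserved.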

These chains can then be fed to gempyor to simulate the model given a config:

import numpy as np


def shuffle_params(chains, idx_array, intersect, keep_list, Num_samples=None, Num_seasons=None, Num_params=None):
    # Defaults used for the flu scenarios.
    if Num_samples is None:
        Num_samples = 100
    if Num_seasons is None:
        Num_seasons = 3  # currently unused
    if Num_params is None:
        Num_params = 9

    # chains has shape (n_steps, n_walkers, n_chain_params); use the final step.
    samples = chains[-1, :, :]
    shuffled_samples = np.zeros([Num_samples, Num_params])
    shuffled_chains = np.zeros([chains.shape[0], Num_samples, Num_params])
    r0_seasons = []
    indices = []
    for k in range(Num_samples):
        # For each parameter, independently pick a season and a walker.
        r_season_idx = np.random.randint(0, len(intersect), Num_params)
        r_chain_idx = np.random.randint(0, chains.shape[1], Num_params)
        r0_seasons.append(keep_list[r_season_idx[0]])
        for j in range(Num_params):
            # Column of the chain holding parameter j for the chosen season.
            col = idx_array[r_season_idx[j]][j]
            shuffled_samples[k, j] = samples[r_chain_idx[j], col]
            shuffled_chains[:, k, j] = chains[:, r_chain_idx[j], col]
            indices.append([r_chain_idx[j], col])

    return shuffled_chains, shuffled_samples, indices, np.array(r0_seasons)
######################################################################
# usage
from gempyor.inference import GempyorInference

gempyor_inference = GempyorInference(
    config_filepath=state_dst_config,
    run_id=run_id,
    prefix=None,
    first_sim_index=1,
    stoch_traj_flag=False,
    rng_seed=None,
    nslots=1,
    inference_filename_prefix="global/final/",  # usually for {global or chimeric}/{intermediate or final}
    inference_filepath_suffix="",  # usually for the slot_id
    out_run_id=None,  # if out_run_id is different from in_run_id, fill this
    out_prefix=None,  # if out_prefix is different from in_prefix, fill this
    # in case the data folder is on another directory
    autowrite_seir=False,
)

# generate a list of data frames from gempyor
result = gempyor_inference.simulate_proposal(shuffled_samples[0])
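
As a possible starting point for indexing by parameter label rather than by ordering, the sketch below looks values up through a label-to-position mapping. It assumes the caller supplies a list of labels in the same order as the chain columns; how those labels would be obtained from gempyor is not specified here, and the helper name `select_by_label` and the example labels are hypothetical.

```python
def select_by_label(sample, labels, wanted):
    """Return the values of `sample` for the `wanted` labels, in that order."""
    position = {label: i for i, label in enumerate(labels)}
    if len(position) != len(labels):
        raise ValueError("parameter labels must be unique")
    missing = [w for w in wanted if w not in position]
    if missing:
        raise KeyError(f"unknown parameter labels: {missing}")
    return [sample[position[w]] for w in wanted]


# Hypothetical labels, aligned with the columns of one parameter vector.
labels = ["r0_2022", "r0_2023", "gamma", "sigma"]
sample = [1.8, 2.1, 0.25, 0.5]
picked = select_by_label(sample, labels, ["gamma", "r0_2023"])
```

Looking parameters up by name means the shuffling logic would no longer depend on the column order of a particular config.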
@TimothyWillard TimothyWillard added enhancement Request for improvement or addition of new feature(s). gempyor Concerns the Python core. post-processing Concern the post-processing. medium priority Medium priority. labels Dec 9, 2024