From 188bec16797ec0c920ccdb199b449f868639ae83 Mon Sep 17 00:00:00 2001 From: Timothy Willard <9395586+TimothyWillard@users.noreply.github.com> Date: Fri, 8 Nov 2024 11:40:15 -0500 Subject: [PATCH 1/2] Minor edits to `black` docs Came from actually reviewing and using the docs with @emprzy which highlighted some short comings. --- .../gitbook/development/python-guidelines-for-developers.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/documentation/gitbook/development/python-guidelines-for-developers.md b/documentation/gitbook/development/python-guidelines-for-developers.md index 75a7e60ab..3e677e3c8 100644 --- a/documentation/gitbook/development/python-guidelines-for-developers.md +++ b/documentation/gitbook/development/python-guidelines-for-developers.md @@ -65,7 +65,7 @@ For more details on how to use `pytest` please refer to their [usage guide](http ### Formatting -We try to remain close to Python conventions and to follow the updated rules and best practices. For formatting, we use [black](https://github.com/psf/black), the _Uncompromising Code Formatter_ before submitting pull requests. It provides a consistent style, which is useful when diffing. We use a custom length of 92 characters as the baseline is short for scientific code. Here is the line to use to format your code: +We try to remain close to Python conventions and to follow the updated rules and best practices. For formatting, we use [black](https://github.com/psf/black), the _Uncompromising Code Formatter_ before submitting pull requests. It provides a consistent style, which is useful when diffing. To get started with black please refer to their [Getting Started guide](https://black.readthedocs.io/en/stable/getting_started.html). We use a custom length of 92 characters as the baseline is short for scientific code. Here is the line to use to format your code: ```bash black --line-length 92 \ @@ -73,7 +73,7 @@ black --line-length 92 \ --verbose . 
``` -For those using a Mac or Linux system for development this command is also available for use by calling `./dev/lint`. Similarly, you can take advantage of the formatting pre-commit hook found at `bin/pre-commit`. To start using it copy this file to your git hooks folder: +For those using a Mac or Linux system for development this command is also available for use by calling `./bin/lint`. Similarly, you can take advantage of the formatting pre-commit hook found at `bin/pre-commit`. To start using it copy this file to your git hooks folder: ```bash cp -f bin/pre-commit .git/hooks/ From f7d19eb56557a0311c33c523b7c6f1aa2657adaf Mon Sep 17 00:00:00 2001 From: Alison Hill <34223923+alsnhll@users.noreply.github.com> Date: Tue, 26 Nov 2024 13:22:14 -0500 Subject: [PATCH 2/2] Updated sample/tutorial files --- examples/tutorials/README.md | 2 + .../config_sample_2pop_inference.yml | 4 +- .../config_sample_2pop_interventions_test.yml | 66 -- .../config_sample_2pop_modifiers.yml | 4 +- .../config_sample_2pop_vaccine_scenarios.yml | 168 +++++ .../config_sample_2pop_2var_interventions.yml | 91 --- .../config_sample_2pop_interventions_test.yml | 66 -- examples/tutorials/inprepconfigs/pieces.yaml | 18 - .../tutorials/model_input/make_test_data.R | 176 +++-- examples/tutorials/model_output_notebook.Rmd | 6 +- .../model_output_notebook_inference.html | 682 ++++++++++++++++++ 11 files changed, 979 insertions(+), 304 deletions(-) delete mode 100644 examples/tutorials/config_sample_2pop_interventions_test.yml create mode 100644 examples/tutorials/config_sample_2pop_vaccine_scenarios.yml delete mode 100644 examples/tutorials/inprepconfigs/config_sample_2pop_2var_interventions.yml delete mode 100644 examples/tutorials/inprepconfigs/config_sample_2pop_interventions_test.yml delete mode 100644 examples/tutorials/inprepconfigs/pieces.yaml create mode 100644 examples/tutorials/model_output_notebook_inference.html diff --git a/examples/tutorials/README.md b/examples/tutorials/README.md 
index 628d49076..10d15c372 100644 --- a/examples/tutorials/README.md +++ b/examples/tutorials/README.md @@ -1 +1,3 @@ # flepimop_sample + +This repository mirrors the contents of the **examples/tutorials** folder of the flepiMoP repository ([link](https://github.com/HopkinsIDD/flepiMoP/tree/main/examples/tutorials)). It can be used to try out running simple projects using `flepimop` code and as a template for new projects. diff --git a/examples/tutorials/config_sample_2pop_inference.yml b/examples/tutorials/config_sample_2pop_inference.yml index 2a3e8814c..3b9944455 100644 --- a/examples/tutorials/config_sample_2pop_inference.yml +++ b/examples/tutorials/config_sample_2pop_inference.yml @@ -90,7 +90,7 @@ seir_modifiers: method: SinglePeriodModifier parameter: Ro period_start_date: 2020-05-01 - period_end_date: 2020-07-01 + period_end_date: 2020-08-31 subpop: "all" value: distribution: truncnorm @@ -145,7 +145,7 @@ outcome_modifiers: # assume that due to limitations in testing, initially the case detection probability was lower test_limits: method: SinglePeriodModifier - parameter: incidCase + parameter: incidCase::probability subpop: "all" period_start_date: 2020-02-01 period_end_date: 2020-06-01 diff --git a/examples/tutorials/config_sample_2pop_interventions_test.yml b/examples/tutorials/config_sample_2pop_interventions_test.yml deleted file mode 100644 index 1e12f1d2d..000000000 --- a/examples/tutorials/config_sample_2pop_interventions_test.yml +++ /dev/null @@ -1,66 +0,0 @@ -name: sample_2pop -setup_name: minimal -start_date: 2020-01-31 -end_date: 2020-05-31 -nslots: 1 - -subpop_setup: - geodata: model_input/geodata_sample_2pop.csv - mobility: model_input/mobility_sample_2pop.csv - popnodes: population - nodenames: province_name - -compartments: - infection_stage: ["S", "E", "I", "R"] - -seir: - integration: - method: rk4 - dt: 1 / 10 - parameters: - sigma: - value: - distribution: fixed - value: 1 / 4 - gamma: - value: - distribution: fixed - value: 1 / 5 - Ro: 
- value: - distribution: fixed - value: 3 - transitions: - - source: ["S"] - destination: ["E"] - rate: ["Ro * gamma"] - proportional_to: [["S"],["I"]] - proportion_exponent: ["1","1"] - - source: ["E"] - destination: ["I"] - rate: ["sigma"] - proportional_to: ["E"] - proportion_exponent: ["1"] - - source: ["I"] - destination: ["R"] - rate: ["gamma"] - proportional_to: ["I"] - proportion_exponent: ["1"] - -seeding: - method: FromFile - seeding_file: model_input/seeding_2pop.csv - -modifiers: - scenarios: - - None - settings: - None: - template: Reduce - parameter: r0 - period_start_date: 2020-04-01 - period_end_date: 2020-05-15 - value: - distribution: fixed - value: 0 - diff --git a/examples/tutorials/config_sample_2pop_modifiers.yml b/examples/tutorials/config_sample_2pop_modifiers.yml index 51237d6de..f4ea87832 100644 --- a/examples/tutorials/config_sample_2pop_modifiers.yml +++ b/examples/tutorials/config_sample_2pop_modifiers.yml @@ -61,7 +61,7 @@ seir_modifiers: method: SinglePeriodModifier parameter: Ro period_start_date: 2020-05-01 - period_end_date: 2020-07-01 + period_end_date: 2020-08-31 subpop: "all" value: 0.8 Ro_all: @@ -105,7 +105,7 @@ outcome_modifiers: # assume that due to limitations in testing, initially the case detection probability was lower test_limits: method: SinglePeriodModifier - parameter: incidCase + parameter: incidCase::probability subpop: "all" period_start_date: 2020-02-01 period_end_date: 2020-06-01 diff --git a/examples/tutorials/config_sample_2pop_vaccine_scenarios.yml b/examples/tutorials/config_sample_2pop_vaccine_scenarios.yml new file mode 100644 index 000000000..28e7e9eae --- /dev/null +++ b/examples/tutorials/config_sample_2pop_vaccine_scenarios.yml @@ -0,0 +1,168 @@ +name: sample_2pop +setup_name: minimal +start_date: 2020-02-01 +end_date: 2020-08-31 +nslots: 10 + +subpop_setup: + geodata: model_input/geodata_sample_2pop.csv + mobility: model_input/mobility_sample_2pop.csv + +initial_conditions: + method: 
SetInitialConditions + initial_conditions_file: model_input/ic_2pop.csv + allow_missing_subpops: TRUE + allow_missing_compartments: TRUE + +compartments: + infection_stage: ["S", "E", "I", "R", "V"] + +seir: + integration: + method: rk4 + dt: 1 + parameters: + sigma: + value: 1 / 4 + gamma: + value: 1 / 5 + Ro: + value: + distribution: truncnorm + mean: 2.5 + sd: 0.1 + a: 1.1 + b: 3 + omega_pess: + value: 0.02 + omega_opt: + value: 0.01 + nu_pess: + value: 0.01 + nu_opt: + value: 0.03 + transitions: + #infections + - source: ["S"] + destination: ["E"] + rate: ["Ro * gamma"] + proportional_to: [["S"],["I"]] + proportion_exponent: ["1","1"] + # progression to infectiousness + - source: ["E"] + destination: ["I"] + rate: ["sigma"] + proportional_to: ["E"] + proportion_exponent: ["1"] + # recovery + - source: ["I"] + destination: ["R"] + rate: ["gamma"] + proportional_to: ["I"] + proportion_exponent: ["1"] + #vaccination (offers complete protection) + - source: ["S"] + destination: ["V"] + rate: ["nu_pess + nu_opt"] + proportional_to: ["S"] + proportion_exponent: ["1"] + # waning of vaccine-induced protection + - source: ["V"] + destination: ["S"] + rate: ["omega_pess + omega_opt"] + proportional_to: ["V"] + proportion_exponent: ["1"] + +seir_modifiers: + scenarios: + - no_vax + - pess_vax + - opt_vax + modifiers: + pess_vax_nu: # turn off nu_opt, only nu_pess left + method: SinglePeriodModifier + parameter: nu_opt + period_start_date: 2020-02-01 + period_end_date: 2020-08-31 + subpop: "all" + value: 0 + pess_vax_wane: # turn off omega_opt, only omega_pess left + method: SinglePeriodModifier + parameter: omega_opt + period_start_date: 2020-02-01 + period_end_date: 2020-08-31 + subpop: "all" + value: 0 + pess_vax: # pessimistic scenario: only nu_pess and omega_pess left + method: StackedModifier + modifiers: ["pess_vax_nu","pess_vax_wane"] + opt_vax_nu: # turn off nu_pess, only nu_opt left + method: SinglePeriodModifier + parameter: nu_pess + period_start_date: 2020-02-01 + period_end_date: 
2020-08-31 + subpop: "all" + value: 0 + opt_vax_wane: # turn off omega_pess, only omega_opt left + method: SinglePeriodModifier + parameter: omega_pess + period_start_date: 2020-02-01 + period_end_date: 2020-08-31 + subpop: "all" + value: 0 + opt_vax: # optimistic scenario: only nu_opt and omega_opt left + method: StackedModifier + modifiers: ["opt_vax_nu","opt_vax_wane"] + no_vax: # turn off all vaccination + method: StackedModifier + modifiers: ["pess_vax","opt_vax"] + + + +outcomes: + method: delayframe + outcomes: + incidCase: #incidence of detected cases + source: + incidence: + infection_stage: "I" + probability: + value: + distribution: truncnorm + mean: 0.5 + sd: 0.05 + a: 0.3 + b: 0.7 + delay: + value: 5 + incidHosp: #incidence of hospitalizations + source: + incidence: + infection_stage: "I" + probability: + value: 0.05 + delay: + value: 7 + duration: + value: 10 + name: currHosp # will track number of current hospitalizations (ie prevalence) + incidDeath: #incidence of deaths + source: incidHosp + probability: + value: 0.2 + delay: + value: 14 + +# outcome_modifiers: +# scenarios: +# - test_limits +# modifiers: +# # assume that due to limitations in testing, initially the case detection probability was lower +# test_limits: +# method: SinglePeriodModifier +# parameter: incidCase::probability +# subpop: "all" +# period_start_date: 2020-02-01 +# period_end_date: 2020-06-01 +# value: 0.5 + diff --git a/examples/tutorials/inprepconfigs/config_sample_2pop_2var_interventions.yml b/examples/tutorials/inprepconfigs/config_sample_2pop_2var_interventions.yml deleted file mode 100644 index 4831796f8..000000000 --- a/examples/tutorials/inprepconfigs/config_sample_2pop_2var_interventions.yml +++ /dev/null @@ -1,91 +0,0 @@ -name: sample_2pop -setup_name: minimal -start_date: 2020-01-31 -end_date: 2020-08-31 -data_path: data -nslots: 1 - -spatial_setup: - geodata: geodata_sample_2pop.csv - mobility: mobility_sample_2pop.csv - -seeding: - method: FromFile - seeding_file: 
data/seeding_2pop_2var.csv - -# not being read at all, not sure why!! -inital_conditions: - method: SetInitialConditions - states_file: data/ic_2pop_2var.csv - allow_missing_nodes: TRUE - -compartments: - infection_stage: ["S", "E", "I", "R"] - strain: ["A","B"] - -seir: - integration: - method: rk4 - dt: 1 / 10 - parameters: - # 1/sigma = duration of exposed period - sigma: - value: - distribution: fixed - value: 1 / 4 - # 1/gamma = duration of infectious period - gamma: - value: - distribution: fixed - value: 1 / 5 - # B = R0 * gamma - Ro: - value: - distribution: uniform - low: 2 - high: 3 - # relative transmission advantage of variant B - fB: - value: - distribution: fixed - value: 1.5 - # immune escape efficacy of variant B (vs variant A) - eB: - value: - distribution: fixed - value: 0.5 - transitions: - # new infections of susceptible individuals - - source: [["S"],["A","B"]] - destination: [["E"],["A","B"]] - rate: [["Ro * gamma"],["1","fB"]] - proportional_to: [ - "source", - [ - [["I"]], - [["A","B"], - ["A","B"]] - ] - ] - proportion_exponent: [["1","1"],["1","1"]] - # progression of exposed infections - - source: [["E"],["A","B"]] - destination: [["I"],["A","B"]] - rate: [["sigma"],["1","1"]] - proportional_to: ["source"] - proportion_exponent: [["1","1"]] - - -interventions: - scenarios: - - None - settings: - None: - template: SinglePeriodModifier - parameter: r0 - period_start_date: 2020-04-01 - period_end_date: 2020-05-15 - value: - distribution: fixed - value: 0 - diff --git a/examples/tutorials/inprepconfigs/config_sample_2pop_interventions_test.yml b/examples/tutorials/inprepconfigs/config_sample_2pop_interventions_test.yml deleted file mode 100644 index 1e12f1d2d..000000000 --- a/examples/tutorials/inprepconfigs/config_sample_2pop_interventions_test.yml +++ /dev/null @@ -1,66 +0,0 @@ -name: sample_2pop -setup_name: minimal -start_date: 2020-01-31 -end_date: 2020-05-31 -nslots: 1 - -subpop_setup: - geodata: model_input/geodata_sample_2pop.csv - 
mobility: model_input/mobility_sample_2pop.csv - popnodes: population - nodenames: province_name - -compartments: - infection_stage: ["S", "E", "I", "R"] - -seir: - integration: - method: rk4 - dt: 1 / 10 - parameters: - sigma: - value: - distribution: fixed - value: 1 / 4 - gamma: - value: - distribution: fixed - value: 1 / 5 - Ro: - value: - distribution: fixed - value: 3 - transitions: - - source: ["S"] - destination: ["E"] - rate: ["Ro * gamma"] - proportional_to: [["S"],["I"]] - proportion_exponent: ["1","1"] - - source: ["E"] - destination: ["I"] - rate: ["sigma"] - proportional_to: ["E"] - proportion_exponent: ["1"] - - source: ["I"] - destination: ["R"] - rate: ["gamma"] - proportional_to: ["I"] - proportion_exponent: ["1"] - -seeding: - method: FromFile - seeding_file: model_input/seeding_2pop.csv - -modifiers: - scenarios: - - None - settings: - None: - template: Reduce - parameter: r0 - period_start_date: 2020-04-01 - period_end_date: 2020-05-15 - value: - distribution: fixed - value: 0 - diff --git a/examples/tutorials/inprepconfigs/pieces.yaml b/examples/tutorials/inprepconfigs/pieces.yaml deleted file mode 100644 index cacc7d33e..000000000 --- a/examples/tutorials/inprepconfigs/pieces.yaml +++ /dev/null @@ -1,18 +0,0 @@ - - - - # recovery from infection - - source: [["I"],["A","B"]] - destination: [["R"],["A","B"]] - rate: [["gamma"],["1","1"]] - proportional_to: ["source"] - proportion_exponent: ["1"] - - - - # infections of individuals previously infected with variant B with variant A - - source: [["R"],["A"]] - destination: [["E"],["B"]] - rate: [["Ro * gamma"],["fB * eB"]] - proportional_to: ["source"] - proportional_exponent: ["1"] \ No newline at end of file diff --git a/examples/tutorials/model_input/make_test_data.R b/examples/tutorials/model_input/make_test_data.R index 2599964c6..0f90287c8 100644 --- a/examples/tutorials/model_input/make_test_data.R +++ b/examples/tutorials/model_input/make_test_data.R @@ -3,88 +3,152 @@ library(dplyr) 
library(data.table) library(reticulate) library(readr) +library(stringr) gempyor <- import("gempyor") + +# INPUT FILES AND PARAMETERS ---------- + +input_config = "config_sample_2pop_modifiers.yml" # config to take output from (forward simulation) +input_inference_config = "config_sample_2pop_inference.yml" +input_seir_modifier_scenario = NULL # which SEIR modifier scenario to use. If null, will take the first. Not required if only 1 scenario. +input_outcome_modifier_scenario = NULL # which outcome modifier scenario to use. If null, will take the first. Not required if only 1 scenario. +input_run_id = NULL # which run_id to use results from. Required if there are multiple output runs; not required if only 1 output run. +input_slot = NULL # which slot to use. If null, will take the first. + + # FUNCTIONS --------------------------------------------------------------- -import_model_outputs <- function(scn_dir, outcome, global_opt, final_opt, run_id = opt$run_id, - lim_hosp = c("date", - sapply(1:length(names(config$inference$statistics)), function(i) purrr::flatten(config$inference$statistics[i])$sim_var), - "subpop")){ - # model_output/USA_inference_fake/20231016_204739CEST/hnpi/global/intermediate/000000001.000000001.000000030.20231016_204739CEST.hnpi.parquet - dir_ <- file.path(scn_dir, - paste0(config$name, "_", config$seir_modifiers$scenarios[scenario_num], "_", config$outcome_modifiers$scenarios[scenario_num]), - run_id, - outcome) - subdir_ <- paste0(dir_, "/", - "/", - global_opt, - "/", - final_opt) - subdir_list <- list.files(subdir_) +# Function to read in any model output file type for inference or non-inference run. 
Taken from model_output_notebook.Rmd +import_model_outputs <- function(scn_run_dir, inference, outcome, global_opt = NULL, final_opt = NULL){ - out_ <- NULL + if(inference){ + + if(is.null(global_opt) | is.null(final_opt)){ + stop("Inference run, must specify global_opt and final_opt") + }else{ + inference_filepath_suffix <-paste0("/",global_opt,"/",final_opt) + print(paste0('Assuming inference run with files in',inference_filepath_suffix)) + } + + }else{ # non inference run + + inference_filepath_suffix <-"" + print('Assuming non-inference run. Ignoring values of global_opt and final_opt if specified') + + } + + subdir <- paste0(scn_run_dir,"/", outcome,"/",inference_filepath_suffix, "/") + #print(subdir) + subdir_list <- list.files(subdir) + #print(subdir_list) + + out <- NULL total <- length(subdir_list) - pb <- txtProgressBar(min=0, max=total, style = 3) print(paste0("Importing ", outcome, " files (n = ", total, "):")) for (i in 1:length(subdir_list)) { - if(any(grepl("parquet", subdir_list))){ - dat <- arrow::read_parquet(paste(subdir_, subdir_list[i], sep = "/")) - } - if(outcome == "hosp"){ - dat <- arrow::read_parquet(paste(subdir_, subdir_list[i], sep = "/")) %>% - select(all_of(lim_hosp)) - } - if(any(grepl("csv", subdir_list))){ - dat <- read.csv(paste(subdir_, subdir_list[i], sep = "/")) + + # read in parquet or csv files + if (any(grepl("parquet", subdir_list))) { + dat <- + arrow::read_parquet(paste(subdir, subdir_list[i], sep = "/")) + } else if (any(grepl("csv", subdir_list))) { + dat <- read.csv(paste(subdir, subdir_list[i], sep = "/")) } - if(final_opt == "final"){ + + if(inference == TRUE & identical(final_opt,"intermediate")){ # if an 'intermediate inference run', filename prefix will include slot, (block), and iteration number + dat$slot <- as.numeric(str_sub(subdir_list[i], start = 1, end = 9)) - } - if(final_opt == "intermediate"){ + dat$block <-as.numeric(str_sub(subdir_list[i], start = 11, end = 19)) + dat$iter 
<-as.numeric(str_sub(subdir_list[i], start = 21, end = 29)) + + }else{ # if a non-inference run or a 'final' inference run, filename prefix will only contain slot #. Each file is a separate slot + dat$slot <- as.numeric(str_sub(subdir_list[i], start = 1, end = 9)) - dat$block <- as.numeric(str_sub(subdir_list[i], start = 11, end = 19)) + } - out_ <- rbind(out_, dat) - # Increase the amount the progress bar is filled by setting the value to i. - setTxtProgressBar(pb, value = i) + out <- rbind(out, dat) + } - close(pb) - return(out_) + return(out) + } -# Setup files ---------- + +# IMPORT AND PERTURB SIMULATION DATA ------------------------ -# config to take output from -config <- flepicommon::load_config("config_sample_2pop_modifiers.yml") -# config that will run inference -config_inference <- flepicommon::load_config("config_sample_2pop_inference.yml") +config <- flepicommon::load_config(input_config) +config_inference <- flepicommon::load_config(input_inference_config) -res_dir <- "model_output" +# location of output files +res_dir <- file.path(ifelse(is.null(config$model_output_dirname),"model_output", config$model_output_dirname)) +print(res_dir) -# IMPORT OUTCOMES --------------------------------------------------------- -# output location -scenario_num <- 1 +# get the directory of the results for this config + scenario: {config$name}_{seir_modifier_scenario}_{outcome_modifier_scenario} +#setup_prefix <- paste0(config$name,ifelse(is.null(config$seir_modifiers$scenarios),"",paste0("_",input_seir_modifier_scenario)),ifelse(is.null(config$outcome_modifiers$scenarios),"",paste0("_",input_outcome_modifier_scenario))) +# NEEDS TO BE FIXED setup_prefix <- paste0(config$name, - ifelse(is.null(config$seir_modifiers$scenarios),"",paste0("_",config$seir_modifiers$scenarios[scenario_num])), - ifelse(is.null(config$outcome_modifiers$scenarios),"",paste0("_",config$outcome_modifiers$scenarios[scenario_num]))) + ifelse(is.null(config$seir_modifiers$scenarios),"", + 
ifelse(length(config$seir_modifiers$scenarios)==1,paste0("_",config$seir_modifiers$scenarios), + ifelse(is.null(input_seir_modifier_scenario),paste0("_",config$seir_modifiers$scenarios[1]),paste0("_",input_seir_modifier_scenario)))), + ifelse(is.null(config$outcome_modifiers$scenarios),"", + ifelse(length(config$outcome_modifiers$scenarios)==1,paste0("_",config$outcome_modifiers$scenarios), + ifelse(is.null(input_outcome_modifier_scenario),paste0("_",config$outcome_modifiers$scenarios[1]),paste0("_",input_outcome_modifier_scenario))))) +print(setup_prefix) + +scenario_dir <-file.path(res_dir,setup_prefix) +print(scenario_dir) + +# find all unique run_ids within model_output. Must choose one only for plotting +run_ids <- list.files(scenario_dir) +print(run_ids) + +this_run_id <- ifelse(length(run_ids)==1,run_ids[1],ifelse(is.null(input_run_id),stop(paste0('There are multiple run_ids within ',scenario_dir,'/, you must specify which one to plot the results for in the notebook header using input_run_id')),input_run_id)) +print(this_run_id) + +# entire path to the directory for each type of model output +scenario_run_dir <- file.path(scenario_dir,this_run_id) + +# import outcomes +hosp_outputs <- setDT(import_model_outputs(scenario_run_dir, 0,"hosp")) + +# choose slot +choose_slot <- ifelse(is.null(input_slot),1,input_slot) + +# get outcomes that will be fit during inference, and apply desired aggregation and date range +fit_stats <- names(config_inference$inference$statistics) +outcome_vars_sim <- sapply(1:length(fit_stats), function(j) config_inference$inference$statistics[[j]]$sim_var) #name of model variables +outcome_vars_data <- sapply(1:length(fit_stats), function(j) config_inference$inference$statistics[[j]]$data_var) #name of data variable + +# This is not yet working/implemented so it's not doing this aggregation or reformatting automatically yet + +# df_data <- lapply(subpop_names, function(x) { +# purrr::flatten_df( +# inference::getStats( +# gt_data %>% 
.[subpop == x,..cols_data], +# "date", +# "data_var", +# stat_list = config$inference$statistics[i], +# start_date = config$start_date_groundtruth, +# end_date = config$end_date_groundtruth +# )) %>% dplyr::mutate(subpop = x) %>% +# mutate(data_var = as.numeric(data_var)) }) %>% dplyr::bind_rows() -res_dir <- file.path(ifelse(is.null(config$model_output_dirname),"model_output", config$model_output_dirname)) -print(res_dir) -results_filelist <- file.path(res_dir, - paste0(config$name, "_", config$seir_modifiers$scenarios[scenario_num], "_", config$outcome_modifiers$scenarios[scenario_num])) -results_filelist <- file.path(results_filelist, list.files(results_filelist)) -model_outputs <- "hosp" -# outcomes variables to choose ------- -# get hosp values -hosp_file <- list.files(file.path(results_filelist,"hosp")) -output_hosp <- setDT(arrow::read_parquet(file.path(results_filelist,"hosp",hosp_file))) +# results_filelist <- file.path(res_dir, +# paste0(config$name, "_", config$seir_modifiers$scenarios[scenario_num], "_", config$outcome_modifiers$scenarios[scenario_num])) +# results_filelist <- file.path(results_filelist, list.files(results_filelist)) +# model_outputs <- "hosp" +# +# # outcomes variables to choose ------- +# # get hosp values +# hosp_file <- list.files(file.path(results_filelist,"hosp")) +# output_hosp <- setDT(arrow::read_parquet(file.path(results_filelist,"hosp",hosp_file))) # filter these outcome variables for desired dates then aggregate to desired level ------- outcome_hosp_ <- output_hosp %>% diff --git a/examples/tutorials/model_output_notebook.Rmd b/examples/tutorials/model_output_notebook.Rmd index 4b017b41a..61ee0fe40 100644 --- a/examples/tutorials/model_output_notebook.Rmd +++ b/examples/tutorials/model_output_notebook.Rmd @@ -14,9 +14,9 @@ params: model_output_dir: model_output #usually model_output, but if results were moved, might be different #results_path: # path to the project folder within which the model_output directory lies. 
Comment out if current directory #run_id: # name of the run_id to plot results for. Required if multiple run_ids in model_output. Comment out if only one run_id in config - #seir_modifier_scenario: Ro_all + seir_modifier_scenario: Ro_all # name of the scenario to plot results for. Required if multiple scenarios in config. Comment out if no scecnarios in config - #outcome_modifier_scenario: test_limits + outcome_modifier_scenario: test_limits # name of the scenario to plot results for. Required if multiple scenarios in config. Comment out if no scecnarios in config # NOTE: Eventually would want this to be able to plot multipe scenarios or run_ids on the same graphs? continue_on_error: yes @@ -405,7 +405,7 @@ cat("\n\n") These are the outputs for the observational ("outcomes") model, stored in the `hosp` directory, which tracks the incidence and prevalence of individuals with defined observed disease outcomes over time. -## Aggregate outcomes{.tabset} +## Aggregate outcomes - by slot{.tabset} Total number of individuals with each outcome over time, aggregated across other strata (only outcomes without an "_" specifying a stratification are plotted). If more than one simulation (slot) was run, results are plotted for slot `r plot_slot` which has the highest total likelihood over all subpopulations (if inference was run) or was randomly chosen (if no inference). Incidence values are per day. If inference was run, only some of these outcomes may have been used in inference, and the outcomes may have been aggregated to a longer time period (e.g., weeks, months). Inference-specific outcomes, along with the data they were compared to, are shown in later plots. 
diff --git a/examples/tutorials/model_output_notebook_inference.html b/examples/tutorials/model_output_notebook_inference.html new file mode 100644 index 000000000..3723a18d4 --- /dev/null +++ b/examples/tutorials/model_output_notebook_inference.html @@ -0,0 +1,682 @@ + + + + + + + + + + + + + +Model Output plots + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + +
+ +
+ +

Here is a snapshot 📸 of your model outputs for run ID +20241126_125533EST, from config config_sample_2pop_inference.yml, stored +in model_output.

+
+

1 Infection timeseries: +SEIR model output

+

These are the outputs for the compartmental epidemic model, stored in +the seir directory, which track the prevalence and +incidence of individuals in each model compartment over time.

+

Incidence values are per day.

+
## [1] "Assuming inference run with files in/global/final"
+## [1] "Importing seir files (n = 10):"
+
## [1] "Assuming inference run with files in/global/final"
+## [1] "Importing llik files (n = 10):"
+
+

1.1 All +infections

+

Total number of individuals in each infection state over time +(compartments defined by infection_stage), aggregated +across other strata. Plotted for slot 4 which has the highest total +likelihood over all subpopulations (if inference was run) or was +randomly chosen (if no inference).

+
+

1.1.1 +Total

+
+

1.1.1.1 Prevalence

+

+
+
+

1.1.1.2 Cumulative +incidence

+

+
+
+
+

1.1.2 Per +capita

+
+

1.1.2.1 Prevalence

+

+
+
+

1.1.2.2 Cumulative +incidence

+

+
+
+
+
+
+

2 Outcome timeseries: +HOSP model output

+

These are the outputs for the observational (“outcomes”) model, +stored in the hosp directory, which tracks the incidence +and prevalence of individuals with defined observed disease outcomes +over time.

+
+

2.1 +Aggregate outcomes - by slot

+

Total number of individuals with each outcome over time, aggregated +across other strata (only outcomes without an “_” specifying a +stratification are plotted). If more than one simulation (slot) was run, +results are plotted for slot 4 which has the highest total likelihood +over all subpopulations (if inference was run) or was randomly chosen +(if no inference). Incidence values are per day.

+

If inference was run, only some of these outcomes may have been used +in inference, and the outcomes may have been aggregated to a longer time +period (e.g., weeks, months). Inference-specific outcomes, along with +the data they were compared to, are shown in later plots.

+

[1] “Assuming inference run with files in/global/final” [1] +“Importing hosp files (n = 10):”

+
+

2.1.1 +incidCase

+
+

2.1.1.1 Total

+

+
+
+
+

2.1.2 +incidHosp

+
+

2.1.2.1 Total

+

+
+
+
+

2.1.3 +currHosp

+
+

2.1.3.1 Total

+

+
+
+
+

2.1.4 +incidDeath

+
+

2.1.4.1 Total

+

+
+
+
+
+

2.2 +Inference-specific outcomes - by slot

+

The inference method specified that the model be fit to sum_hosp, +with aggregation over period: 1 weeks. Plotted for slot 4 which has the +highest total likelihood over all subpopulations (if inference was run) +or was randomly chosen (if no inference).

+

[1] “Assuming inference run with files in/global/final” [1] +“Importing hosp files (n = 10):”

+
+

2.2.1 +sum_hosp

+

+
+
+
+

2.3 +Inference-specific outcomes - quantiles

+

The inference method specified that the model be fit to sum_hosp, +with aggregation over period: 1 weeks. In total 10 slots ran +successfully.

+
+

2.3.1 +sum_hosp

+

+## Inference-specific outcomes - by likelihood{.tabset}

+

The inference method specified that the model be fit to sum_hosp, +with aggregation over period: 1 weeks. In total 10 slots ran +successfully.

+

This section plots the top 5 and bottom 5 log likelihoods for each +subpopulation.

+
+
+

2.3.2 +sum_hosp

+

+
+
+
+
+

3 Infection model +parameters: SNPI model output

+

These are the parameters that define time-dependent modifications to +the infection model parameters, and are stored in the snpi +directory.

+
+

3.1 Values by slot

+

If inference is run, parameters are the final values at the end of +all MCMC iterations, colored by their likelihoods in a given +subpopulation.
+

+
+
+

3.2 MCMC evolution

+

The accepted value of the parameter for each iteration of the MCMC +algorithm, colored by their likelihood in a given subpopulation. If more +than 5 slots were run, we will plot only the top 5 and bottom 5 log +likelihoods for each subpopulation.

+

+
+
+

3.3 MCMC evolution - +chimeric vs global

+

The accepted value of the parameter for each iteration of the MCMC +algorithm, for both the chimeric and global chain, in a given +subpopulation. Plotted for slot 4 which has the highest total likelihood +over all subpopulations (if inference was run) or was randomly chosen +(if no inference).

+

+
+
+
+

4 Outcome model +parameters: HNPI model output

+

This shows the parameters associated with your outcomes model, for +all subpopulations.

+
+

4.1 Values by slot

+

If inference is run, parameters are the final values at the end of +all MCMC iterations, colored by their likelihoods in a given +subpopulation.

+

+
+
+

4.2 MCMC evolution

+

The accepted value of the parameter for each iteration of the MCMC +algorithm, colored by their likelihood in a given subpopulation. If more +than 5 slots were run, we will plot only the top 5 and bottom 5 log +likelihoods for each subpopulation.

+

+
+
+

4.3 MCMC evolution - +chimeric vs global

+

The accepted value of the parameter for each iteration of the MCMC +algorithm, for both the chimeric and global chain, in a given +subpopulation. Plotted for slot 4 which has the highest total likelihood +over all subpopulations (if inference was run) or was randomly chosen +(if no inference).

+

+
+
+
+

5 Likelihood: +LLIK model output

+
+

5.1 Acceptance and +likelihood trajectories - All slots and subpopulations

+

This plot shows the binary acceptance decision for each MCMC +iteration (accept), the probability of acceptance for that +acceptance decision (accept_prob), the running average +acceptance probability (accept_avg), and the likelihood. +Chimeric values are subpopulation specific - there are +likely more acceptances as well as acceptances that can increase +subpop-specific likelihood while not changing the total likelihood. +Global acceptances occur for all subpopulations together, +and will always result in the total likelihood increasing, but could +result in decreases in the subpop-specific likelihood.

+
## [1] "Assuming inference run with files in/global/intermediate"
+## [1] "Importing llik files (n = 120):"
+
## [1] "Assuming inference run with files in/chimeric/intermediate"
+## [1] "Importing llik files (n = 120):"
+

+
+
+

5.2 Acceptance and +likelihood trajectories - Single slot

+

+
+
+ + + + +
+ + + + + + + + + + + + + + +
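
A note on the filename convention that `import_model_outputs` in `make_test_data.R` relies on: the helper reads the slot, block and iteration numbers from fixed character positions (1-9, 11-19 and 21-29) of each output filename. The sketch below reproduces that slicing in Python, assuming nine-digit zero-padded fields separated by dots, as in the sample path quoted in the removed R comment. The function name `parse_output_prefix` is hypothetical and is not part of `gempyor` or `flepicommon`.

```python
from typing import Dict


def parse_output_prefix(filename: str, intermediate: bool) -> Dict[str, int]:
    """Extract slot (and, for intermediate inference files, block and
    iteration) from the fixed-width prefix of a model output filename.

    Assumes nine-digit zero-padded fields separated by single dots;
    this layout is inferred from the R helper, not from a documented API.
    """
    # R's str_sub(x, 1, 9) is 1-indexed and inclusive, i.e. x[0:9] in Python.
    fields = {"slot": int(filename[0:9])}
    if intermediate:
        # str_sub(x, 11, 19) and str_sub(x, 21, 29) in the R code.
        fields["block"] = int(filename[10:19])
        fields["iter"] = int(filename[20:29])
    return fields


# Sample filename taken from the comment removed in the R hunk above.
example = "000000001.000000001.000000030.20231016_204739CEST.hnpi.parquet"
print(parse_output_prefix(example, intermediate=True))
print(parse_output_prefix(example, intermediate=False))
```

Fixed-width slicing matches what the R code does, but it silently returns wrong numbers if the padding width ever changes; splitting on `.` and taking the leading fields would be a more tolerant alternative.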