
Data submission instructions

This page is intended to provide teams with all the information they need to submit scenarios.

All scenarios should be submitted directly to the data-processed/ folder. Data in this directory should be added to the repository through a pull request.

Automatic validation has been running since round 13.

Because of file size limitations, a results file can be submitted in .zip or .gz format, using the same name as the .csv file.

Example

See this file for an illustration of part of a (hypothetical) submission file.

Subdirectory

Each subdirectory within the data-processed/ directory has the format

team-model

where

team is the teamname and
model is the name of your model.

Both team and model should be fewer than 15 characters each and must not include hyphens or spaces.

Within each subdirectory, there should be a metadata file, a license file (optional), and a set of scenarios.

Metadata

The metadata file name should have the following format

metadata-team-model.txt

where

team is the teamname and
model is the name of your model.

Here are details about the structure of the metadata file. An example hypothetical metadata file has been posted in the data-processed directory.

License (optional)

License information for data sharing and reuse is requested in the metadata, including a link to the license text. If you cannot link to the text of a standard license and have specific license text, include a license file named

LICENSE.txt

Model Results

Each model results file within the subdirectory should have the following name

YYYY-MM-DD-team-model-type.csv

where

YYYY is the 4 digit year,
MM is the 2 digit month,
DD is the 2 digit day,
team is the teamname,
model is the name of your model, and
type is used only for the optional simulation samples submission format ("YYYY-MM-DD-team-model-sample.csv"). Files in the other submission format (quantiles) should be named "YYYY-MM-DD-team-model.csv".

The date YYYY-MM-DD is the model_projection_date.

The team and model in this file must match the team and model in the directory this file is in. Both team and model should be less than 15 characters, alpha-numeric and underscores only, with no spaces or hyphens.
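The naming rules above can be checked mechanically. The following is a minimal sketch, not the hub's own validation code: the function name `parse_results_filename` is hypothetical, and the regular expression encodes one reading of the rules (team and model are each 1 to 14 letters, digits, or underscores).

```python
import re
from datetime import date

# Hypothetical pattern encoding the naming rules described above.
_FILENAME_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})"
    r"-(?P<team>[A-Za-z0-9_]{1,14})"
    r"-(?P<model>[A-Za-z0-9_]{1,14})"
    r"(?P<sample>-sample)?\.csv$"
)


def parse_results_filename(filename, directory=None):
    """Return the parts of a results filename, or None if it breaks a rule.

    If `directory` is given (e.g. "MyTeam-MyModel"), the team and model in
    the filename must match it.
    """
    match = _FILENAME_RE.match(filename)
    if match is None:
        return None
    try:
        # Rejects impossible dates such as 2021-02-30.
        projection_date = date.fromisoformat(match["date"])
    except ValueError:
        return None
    team, model = match["team"], match["model"]
    if directory is not None and directory != f"{team}-{model}":
        return None
    return {
        "model_projection_date": projection_date,
        "team": team,
        "model": model,
        "is_sample": match["sample"] is not None,
    }
```

Because team and model cannot contain hyphens, the hyphen-separated parts of the filename can be split unambiguously, which is why a single regular expression is enough here.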

Model results file format

The file must be a comma-separated value (csv) file with the following columns (in any order):

model_projection_date
scenario_name
scenario_id
target
target_end_date
location
type [not required for “sample” file format]
quantile [not required for “sample” file format]
sample [required for “sample” file format]
value

No additional columns are allowed.

Each row in the file is either a point or quantile scenario for a location on a particular date for a particular target.

For the "sample" format, only the "incident" targets are required, and no quantile or type information should be included in the file.
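As an illustration of the column rules, the sketch below compares a file header against the expected column set. It is not the hub's validation code; `check_columns` is a hypothetical helper, and it assumes that `sample` is not allowed in the quantile format and that `type` and `quantile` are not allowed in the sample format.

```python
# Columns required in both submission formats.
COMMON_COLUMNS = {
    "model_projection_date", "scenario_name", "scenario_id",
    "target", "target_end_date", "location", "value",
}
QUANTILE_FORMAT_COLUMNS = {"type", "quantile"}
SAMPLE_FORMAT_COLUMNS = {"sample"}  # assumption: exclusive to sample files


def check_columns(header, sample_format=False):
    """Return a list of problems found in a CSV header (empty if none)."""
    extra = SAMPLE_FORMAT_COLUMNS if sample_format else QUANTILE_FORMAT_COLUMNS
    expected = COMMON_COLUMNS | extra
    present = set(header)
    problems = []
    for name in sorted(expected - present):
        problems.append(f"missing column: {name}")
    for name in sorted(present - expected):
        problems.append(f"unexpected column: {name}")
    if len(header) != len(present):
        problems.append("duplicate column names")
    return problems
```

Column order is ignored, matching the "in any order" rule. The header can be read with `next(csv.reader(f))` from the standard library.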

If the size of the file is larger than 100MB, it can be submitted in a .zip or .gz format.

`model_projection_date`

Values in the model_projection_date column must be a date in the format

YYYY-MM-DD

Model projections will have an associated model_projection_date that corresponds to the day the projection was made. Starting with round 12, validation tests that the model_projection_date matches the date in the filename. This date should correspond to the start date for scenarios (first date of simulated transmission/outcomes).

For week-ahead model projections with a model_projection_date of Sunday or Monday of EW12, a 1 week ahead projection corresponds to EW12 and should have a target_end_date of the Saturday of EW12. For week-ahead projections with a model_projection_date of Tuesday through Saturday of EW12, a 1 week ahead projection corresponds to EW13 and should have a target_end_date of the Saturday of EW13. A week-ahead projection should represent the total number of incident deaths or hospitalizations within a given epiweek (from Sunday through Saturday, inclusive) or the cumulative number of deaths reported on the Saturday of a given epiweek. Model projection dates in the COVID-19 Scenario Modeling Hub are equivalent to the forecast dates in the COVID-19 Forecast Hub.

`scenario_name`

The standard scenario names should be used as given in the scenario description in the main Readme. Scenario names only include characters and no spaces, e.g., optimistic.

`scenario_id`

The standard scenario id should be used as given in the scenario description in the main Readme. Scenario ids consist of a capitalized letter and the date, as YYYY-MM-DD, on which the scenario was last modified by the project coordinators, e.g., A-2020-12-22.
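A small sketch of a format check for scenario ids follows. The helper name `is_valid_scenario_id` is hypothetical. It only checks the shape of the id (one capital letter, a hyphen, and a real calendar date), not whether the id is one of the standard ids listed in the main Readme.

```python
import re
from datetime import date

_SCENARIO_ID_RE = re.compile(r"^[A-Z]-(\d{4}-\d{2}-\d{2})$")


def is_valid_scenario_id(scenario_id):
    """True if scenario_id looks like 'A-2020-12-22' with a real date."""
    match = _SCENARIO_ID_RE.match(scenario_id)
    if match is None:
        return False
    try:
        date.fromisoformat(match.group(1))
    except ValueError:
        return False
    return True
```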

`target`

We are requesting model projections for a minimum of 13 and maximum of 26 weeks into the future.

The requested targets are:

weekly incident deaths
cumulative deaths
weekly incident cases
cumulative incident cases
weekly incident hospitalizations
cumulative incident hospitalizations

Optional targets:

weekly incident infections
weekly proportion of cases caused by variant X (mean only) [Round 14 and Round 15 specific]

Values in the target column must be a character (string) and be one of the following specific targets:

"N wk ahead inc death" where N is a number between 1 and 26 (or 12 or 40 or 52, depending on the round)
"N wk ahead cum death" where N is a number between 1 and 26 (or 12 or 40 or 52, depending on the round)
"N wk ahead inc case" where N is a number between 1 and 26 (or 12 or 40 or 52, depending on the round)
"N wk ahead cum case" where N is a number between 1 and 26 (or 12 or 40 or 52, depending on the round)
"N wk ahead inc hosp" where N is a number between 1 and 26 (or 12 or 40 or 52, depending on the round)
"N wk ahead cum hosp" where N is a number between 1 and 26 (or 12 or 40 or 52, depending on the round)
"N wk ahead inc inf" where N is a number between 1 and 26 (or 12 or 40 or 52, depending on the round)
"N wk ahead prop X" where N is a number between 1 and 26 (or 12 or 40 or 52, depending on the round) [Round 14 and Round 15 specific]
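The target strings listed above follow one pattern, so they can be parsed with a single regular expression. This is a sketch under assumptions: `parse_target` is a hypothetical helper, the maximum horizon is passed in as `max_horizon` because it depends on the round, and "prop X" is matched literally as written above.

```python
import re

_TARGET_RE = re.compile(
    r"^(?P<n>[1-9]\d*) wk ahead "
    r"(?P<kind>inc death|cum death|inc case|cum case|"
    r"inc hosp|cum hosp|inc inf|prop X)$"
)


def parse_target(target, max_horizon=26):
    """Return (N, kind) for a valid target string, otherwise None."""
    match = _TARGET_RE.match(target)
    if match is None:
        return None
    horizon = int(match["n"])
    if horizon > max_horizon:
        return None
    return horizon, match["kind"]
```

For example, `parse_target("3 wk ahead inc death")` yields `(3, "inc death")`, while a horizon of 27 is rejected unless `max_horizon` is raised for rounds with longer horizons.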

For week-ahead scenarios, we will use the specification of epidemiological weeks (EWs) defined by the US CDC which run Sunday through Saturday.

Standard software packages can convert between dates and epidemic weeks, e.g., MMWRweek for R, and pymmwr and epiweeks for Python.

For week-ahead scenarios with model_projection_date of Sunday or Monday of EW12, a 1 week ahead scenario corresponds to EW12 and should have target_end_date of the Saturday of EW12. For week-ahead scenarios with model_projection_date of Tuesday through Saturday of EW12, a 1 week ahead scenario corresponds to EW13 and should have target_end_date of the Saturday of EW13.
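The Sunday/Monday rule above can be written as date arithmetic using only the Python standard library. This is a sketch; the function name `week_ahead_end_date` is hypothetical.

```python
from datetime import date, timedelta


def week_ahead_end_date(model_projection_date, n_weeks):
    """Saturday that ends the epiweek targeted by an 'N wk ahead' row."""
    if n_weeks < 1:
        raise ValueError("n_weeks must be at least 1")
    # date.weekday() has Monday=0 ... Sunday=6; shift so that Sunday=0.
    days_since_sunday = (model_projection_date.weekday() + 1) % 7
    saturday_this_week = model_projection_date + timedelta(days=6 - days_since_sunday)
    if days_since_sunday <= 1:
        # Sunday or Monday: 1 wk ahead is the current epiweek.
        first_end = saturday_this_week
    else:
        # Tuesday through Saturday: 1 wk ahead is the next epiweek.
        first_end = saturday_this_week + timedelta(days=7)
    return first_end + timedelta(weeks=n_weeks - 1)
```

As a concrete case, 2021-03-21 is the Sunday that starts EW12 of 2021, so a 1 week ahead row projected on that date ends on 2021-03-27, while the same row projected on Tuesday 2021-03-23 ends on 2021-04-03.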

N wk ahead inc death

This target is the incident (weekly) number of deaths predicted by the model during the week that is N weeks after model_projection_date.

A week-ahead scenario should represent the total number of new deaths reported during a given epiweek (from Sunday through Saturday, inclusive).

Predictions for this target will be evaluated against the number of new reported deaths, as recorded by the JHU CSSE group and distributed by the COVIDcast Epidata API.

N wk ahead cum death

This target is the cumulative number of deaths predicted by the model up to and including N weeks after model_projection_date.

A week-ahead scenario should represent the cumulative number of deaths reported on the Saturday of a given epiweek.

Predictions for this target will be evaluated against the cumulative number of reported deaths, as recorded by the JHU CSSE group and distributed by the COVIDcast Epidata API.
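To illustrate how the incident and cumulative targets relate, the sketch below builds a cumulative series from weekly incident values for a single simulated trajectory. The `baseline` argument (the cumulative count already reported before the first projected week) is an assumption introduced for this example, as is the function name.

```python
def cumulative_from_incident(weekly_incident, baseline=0):
    """Running total of weekly incident values, starting from baseline."""
    totals = []
    running = baseline
    for value in weekly_incident:
        running += value
        totals.append(running)
    return totals
```

This identity holds trajectory by trajectory. It does not hold for quantiles: a quantile of the cumulative target is generally not the sum of the corresponding incident quantiles, so cumulative quantiles need to be computed from the cumulative trajectories themselves.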

N wk ahead inc case

This target is the incident (weekly) number of cases predicted by the model during the week that is N weeks after model_projection_date.

A week-ahead scenario should represent the total number of new cases reported during a given epiweek (from Sunday through Saturday, inclusive).

Predictions for this target will be evaluated against the number of new reported cases, as recorded by the JHU CSSE group and distributed by the COVIDcast Epidata API.

N wk ahead cum case

This target is the cumulative number of incident cases predicted by the model up to and including N weeks after model_projection_date.

A week-ahead scenario should represent the cumulative number of cases reported on the Saturday of a given epiweek.

Predictions for this target will be evaluated against the cumulative number of reported cases, as recorded by the JHU CSSE group and distributed by the COVIDcast Epidata API.

N wk ahead inc hosp

This target is the incident (weekly) number of hospitalized cases predicted by the model during the week that is N weeks after model_projection_date.

A week-ahead scenario should represent the total number of new hospitalized cases reported during a given epiweek (from Sunday through Saturday, inclusive).

Predictions for this target will be evaluated against the number of new hospitalized cases, as reported by the HHS and distributed by the COVIDcast Epidata API.

N wk ahead cum hosp

This target is the cumulative number of hospitalized cases predicted by the model up to and including N weeks after model_projection_date.

A week-ahead scenario should represent the cumulative number of hospitalized cases reported on the Saturday of a given epiweek.

Predictions for this target will be evaluated against the cumulative number of new hospitalized cases, as reported by the HHS and distributed by the COVIDcast Epidata API.

N wk ahead inc inf

This target is the number of incident (weekly) infections predicted by the model during the week that is N weeks after model_projection_date.

A week-ahead scenario should represent the total number of new infections occurring within a given epiweek (from Sunday through Saturday, inclusive).

Projections of infections will be used to compare outputs between models but will not be evaluated against observations.
Projections of infections are optional.

N wk ahead prop X [Round 14 and Round 15 specific]

This target is the proportion of incident (weekly) cases caused by variant X among all COVID-19 cases, as predicted by the model during the week that is N weeks after model_projection_date.

A week-ahead scenario should represent the proportion of variant X cases occurring within a given epiweek (from Sunday through Saturday, inclusive).

Projections of variant X proportion will be used to compare outputs between models but will not be evaluated against observations.
Further, we do not expect a full distribution of quantiles, only mean estimates.
Projections of the proportion of variant X are optional.

`target_end_date`

Values in the target_end_date column must be a date in the format

YYYY-MM-DD

This is the date for the scenario target. For "# day" targets, target_end_date will be # days after model_projection_date. For "# wk" targets, target_end_date will be the Saturday at the end of the target epiweek.

`location`

Values in the location column must be one of the "locations" in this FIPS numeric code file which includes numeric FIPS codes for U.S. states, counties, territories, and districts as well as "US" for national scenarios.

Please note that FIPS codes should be written as character strings to preserve any leading zeroes.
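Leading zeroes are typically lost when a tool reads the location column as a number (for example, "06" becomes 6). The sketch below restores them, relying on the fact that state-level FIPS codes have 2 digits and county-level codes have 5. The helper name `normalize_fips` is hypothetical.

```python
def normalize_fips(code):
    """Return a location code as a zero-padded string ("US" is unchanged)."""
    text = str(code).strip()
    if text == "US":
        return text
    if not text.isdigit():
        raise ValueError(f"not a FIPS code: {code!r}")
    if len(text) <= 2:
        return text.zfill(2)  # state, territory, or district
    if len(text) in (4, 5):
        return text.zfill(5)  # county
    raise ValueError(f"ambiguous FIPS code length: {code!r}")
```

When reading files with Python's standard `csv` module, values are already strings and no padding is lost. With other readers, it is usually simplest to request string parsing for the location column up front.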

`type`

Values in the type column are either

"point" or
"quantile".

This value indicates whether that row corresponds to a point scenario or a quantile scenario. Point scenarios are used in visualization while quantile scenarios are used in visualization and in ensemble construction.

Scenarios must include exactly 1 "point" scenario for every location-target pair.
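The "exactly 1 point" rule can be checked by counting rows. In this sketch, rows are dictionaries such as those produced by `csv.DictReader`. It assumes that when a file contains several scenarios the rule applies within each scenario, so `scenario_id` is part of the grouping key. That grouping and the function name are assumptions, not taken from the hub's validation code.

```python
from collections import Counter


def point_count_problems(rows):
    """Return the (scenario_id, location, target) keys without exactly one point row."""
    point_counts = Counter()
    keys_seen = set()
    for row in rows:
        key = (row["scenario_id"], row["location"], row["target"])
        keys_seen.add(key)
        if row["type"] == "point":
            point_counts[key] += 1
    return sorted(key for key in keys_seen if point_counts[key] != 1)
```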

`quantile`

Values in the quantile column are either "NA" (if type is "point") or a quantile in the format

0.###

For quantile scenarios, this value indicates the quantile for the value in this row.

Teams should provide the following 23 quantiles:

c(0.01, 0.025, seq(0.05, 0.95, by = 0.05), 0.975, 0.99)

##  [1] 0.010 0.025 0.050 0.100 0.150 0.200 0.250 0.300 0.350 0.400 0.450 0.500 0.550 0.600 0.650 0.700 0.750
## [18] 0.800 0.850 0.900 0.950 0.975 0.990
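For teams working in Python, the same 23 quantile levels can be generated as follows. This is a sketch; the names `REQUIRED_QUANTILES`, `format_quantile`, and `missing_quantiles` are illustrative. Rounding avoids floating-point artifacts such as 0.15000000000000002.

```python
# Python equivalent of c(0.01, 0.025, seq(0.05, 0.95, by = 0.05), 0.975, 0.99).
REQUIRED_QUANTILES = (
    [0.01, 0.025]
    + [round(0.05 * i, 2) for i in range(1, 20)]
    + [0.975, 0.99]
)


def format_quantile(q):
    """Format a quantile level with three decimals, e.g. 0.5 -> '0.500'."""
    return f"{q:.3f}"


def missing_quantiles(submitted):
    """Return the required levels (as strings) absent from `submitted`."""
    have = {format_quantile(float(q)) for q in submitted}
    return [
        format_quantile(q)
        for q in REQUIRED_QUANTILES
        if format_quantile(q) not in have
    ]
```

Comparing formatted strings rather than raw floats keeps the check stable when quantile values are read back from a CSV file as text.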

`sample`

For the optional simulation samples format only. Values in the sample column are numbers between 1 and 100 that identify the sample.

`value`

Values in the value column are non-negative numbers indicating the "point" or "quantile" prediction for this row. For a "point" prediction, value is simply the value of that point prediction for the target and location associated with that row. For a "quantile" prediction, value is the inverse of the cumulative distribution function (CDF) evaluated at the quantile in that row, for the target and location associated with that row.
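When the projection distribution is represented by simulated trajectories, the inverse CDF can be approximated empirically. The sketch below uses linear interpolation between sorted sample values, which is one common convention among several. It is not prescribed by these instructions, and the function name is hypothetical.

```python
import math


def empirical_quantile(samples, q):
    """Approximate the inverse CDF at level q from a list of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    ordered = sorted(samples)
    # Fractional position of the quantile within the sorted samples.
    position = q * (len(ordered) - 1)
    lower = math.floor(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])
```

Evaluating this function at each required quantile level, for each target and location, gives the values for the "quantile" rows of a submission.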

Scenario validation (development in progress)

To ensure proper data formatting, automated checks will be run on pull requests that add new data to data-processed/.

For the first round of submissions, the automated pull request checks may not work yet.

Pull request scenario validation (development in progress)

When a pull request is submitted, the data are validated by running the tests in validation.R. The intent of these tests is to validate the requirements above, which are specifically enumerated on the wiki. Please let us know if the wiki is inaccurate.

Run checks locally

To run these checks locally rather than waiting for the results from a pull request, follow these instructions (section File Checks Run).
