Merge branch 'dev' into scientific_notation_parsing
emprzy authored Dec 6, 2024
2 parents c97841d + 48c2425 commit 9c6d3dd
Showing 89 changed files with 5,113 additions and 19,806 deletions.
3 changes: 2 additions & 1 deletion .github/workflows/conda-env.yml
@@ -7,12 +7,13 @@ on:
- build/create_environment_yml.R
- flepimop/R_packages/*/DESCRIPTION
branches:
- main
- dev
pull_request:
paths:
- build/create_environment_yml.R
- flepimop/R_packages/*/DESCRIPTION
branches:
- dev
- main

jobs:
5 changes: 3 additions & 2 deletions .github/workflows/flepicommon-ci.yml
@@ -1,16 +1,17 @@
name: flepicommon-ci
name: flepicommon CI

on:
workflow_dispatch:
push:
paths:
- flepimop/R_packages/flepicommon/**/*
branches:
- main
- dev
pull_request:
paths:
- flepimop/R_packages/flepicommon/**/*
branches:
- dev
- main

jobs:
5 changes: 3 additions & 2 deletions .github/workflows/gempyor-ci.yml
@@ -1,4 +1,4 @@
name: gempyor-ci
name: gempyor CI

on:
workflow_dispatch:
@@ -7,12 +7,13 @@ on:
- examples/**/*
- flepimop/gempyor_pkg/**/*
branches:
- main
- dev
pull_request:
paths:
- examples/**/*
- flepimop/gempyor_pkg/**/*
branches:
- dev
- main

jobs:
5 changes: 3 additions & 2 deletions .github/workflows/inference-ci.yml
@@ -1,4 +1,4 @@
name: inference-ci
name: inference CI

on:
workflow_dispatch:
@@ -10,11 +10,12 @@ on:
paths:
- flepimop/R_packages/inference/**/*
branches:
- main
- dev
pull_request:
paths:
- flepimop/R_packages/inference/**/*
branches:
- dev
- main

jobs:
2 changes: 1 addition & 1 deletion .gitignore
@@ -64,7 +64,7 @@ packrat/lib*/
dist/
SEIR.egg-info/
Outcomes.egg-info/
venv/
*venv*/
.venv/

# R package manuals
27 changes: 27 additions & 0 deletions NEWS.md
@@ -0,0 +1,27 @@
# December 11th, 2024

**Bug Fixes:**

- The HPC init script no longer fails if a conda environment is already active, GH-388.
- Stdout/stderr from `flepimop-inference-slot` called by `flepimop-inference-main` are piped to a log file via `system2` instead of pipes to support MINGW64, GH-289.

**Dependencies:**

- `click` minimum is now 8.1.7 (latest as of Aug 17, 2023).
- Added missing `h5py` dependency to `gempyor` requirements and specified `dask` dependency to include `dataframe` optional dependencies, GH-391.

**Deprecates:**

- `gempyor-simulate ...` in favor of `flepimop simulate ...`.
- Soft deprecated the `-c/--config_files` option (config file(s) are now *arguments* not options).

**New Features:**

- Basic support for multiple config files
- A `patch` command that takes multiple config files and yields the merged result
- Converted `gempyor`'s `setup.cfg` to the more modern `pyproject.toml`, GH-391. No user facing changes.
- Added `flepimop modifiers` subcommand with one action, `config-plot`, for plotting the effects of modifiers on a config, GH-404.

**Removes/Modifies:**

- `gempyor-(seir|outcomes) ...` - these were already no longer supported, just pruning entry points
8 changes: 8 additions & 0 deletions batch/hpc_init.sh
@@ -52,6 +52,14 @@ if [ -z "${FLEPI_CONDA}" ]; then
fi
echo "Using '$FLEPI_CONDA' for \$FLEPI_CONDA."
fi
CURRENT_CONDA_ENV=$( conda info | grep "active environment" | awk -F ':' '{print $2}' | xargs )
if [ "$CURRENT_CONDA_ENV" = "$FLEPI_CONDA" ]; then
echo "Detected the active conda environment is already '$FLEPI_CONDA', but will refresh it."
conda deactivate
elif [ "$CURRENT_CONDA_ENV" != "None" ]; then
echo "Detected an active conda environment '$CURRENT_CONDA_ENV'. This will be deactivated and the '$FLEPI_CONDA' environment will be activated."
conda deactivate
fi
conda activate $FLEPI_CONDA

# Check the conda environment is valid
4 changes: 3 additions & 1 deletion documentation/gitbook/SUMMARY.md
@@ -48,6 +48,7 @@

* [Before any run](how-to-run/before-any-run.md)
* [Quick Start Guide](how-to-run/quick-start-guide.md)
* [Multiple Configuration Files](how-to-run/multi-configs.md)
* [Advanced run guides](how-to-run/advanced-run-guides/README.md)
* [Running with Docker locally 🛳](how-to-run/advanced-run-guides/running-with-docker-locally.md)
* [Running locally in a conda environment 🐍](how-to-run/advanced-run-guides/quick-start-guide-conda.md)
@@ -57,8 +58,9 @@
* [Useful commands](how-to-run/useful-commands.md)
* [Tips, tricks, FAQ](how-to-run/tips-tricks-faq.md)

## 🗜️ Development
## [Development](./development/README.md)

* [Git and GitHub Usage](./development/git-and-github-usage.md)
* [Guidelines for contributors](development/python-guidelines-for-developers.md)

## Deprecated pages
3 changes: 3 additions & 0 deletions documentation/gitbook/development/README.md
@@ -0,0 +1,3 @@
# Development

This section covers development/contribution guidelines, including tutorials on how to set up your environment and guides on how we use git/GitHub.
27 changes: 27 additions & 0 deletions documentation/gitbook/development/git-and-github-usage.md
@@ -0,0 +1,27 @@
# Git and GitHub Usage

We now use a modified gitflow style workflow for working with git and GitHub. For a detailed overview of this topic please refer to [Atlassian's article on Gitflow workflow](https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow).


## New Features

New features should be developed in a new branch checked out from the `dev` branch and then merged back into the `dev` branch via a PR on GitHub when ready for review. These feature branches can be deleted after merging into `dev`, unless someone from operations requests that a branch be kept around. For example, operations may want to merge the feature into their operational branch to get new functionality in advance of a release. By convention, feature branches should be prefixed with `feature/<GitHub issue>/`, e.g. `feature/99/cool-new-thing`. Feature branches should also include edits to the GitBook documentation that describe their changes.


## Hot Fixes

Hot fixes should be developed in a new branch checked out from the `main` branch and merged back into the `main` branch via a PR on GitHub when ready for review. After successfully merging into `main`, the hot fix branch should then be merged into `dev`, making appropriate adjustments to stabilize the feature. The priority for hot fixes is to correct a major issue quickly, so it is okay to delay detailed testing/documentation until merging into `dev`. By convention, hot fix branches should be prefixed with `hotfix/`, e.g. `hotfix/important-fix-to-something`, and then converted into a feature branch after merging into `main`. Hot fixes do not have to include edits to the GitBook documentation, but if a hot fix conflicts with what the GitBook documentation describes, updating the documentation is **strongly recommended**.


## Creating Releases

Periodically, releases will be created by merging the `dev` branch into `main` via a PR on GitHub and creating a new release from the `main` branch after merging. These PRs should avoid discussion of individual feature changes; those discussions should be reserved for and handled in the feature PRs. If a feature poses a significant problem in the process of creating a new release, those changes should be treated like a new feature. The main purposes of this PR are to:

1. Resolve merge conflicts generated by hot fixes,
2. Make minor edits to documentation to make it clearer or more cohesive, and
3. Update the `NEWS.md` file with a summary of the changes included in the release.


## Operations

Operational work should be developed in a new branch checked out from the `main` branch if there are modifications needed to the `flepiMoP` codebase. Pre-released features can be merged directly into this operational branch from the corresponding feature branch as needed via a git merge or rebase, not a GitHub PR. After the operational cycle is over, the operations branch **should not** be deleted; instead, it should be kept around for archival reasons. Operational work needs to move quickly and usually does not involve documenting or testing code, so it is unsuitable for merging into `dev` or `main` directly. Instead, potential features should be extracted from an operations branch into a feature branch using [git cherry-pick](https://git-scm.com/docs/git-cherry-pick) and then modified into an appropriate state for merging into `dev` like a feature branch. By convention, operations branch names should be prefixed with `operations/`, e.g. `operations/flu-SMH-2023-24`.
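
The branch conventions described on this page can be summarized in a small shell sketch. The helper names `branch_kind` and `base_branch_for` are hypothetical and exist only for illustration; they are not part of flepiMoP or git. The sketch assumes only the prefixes and base branches stated above and reports anything else as `unknown`.

```bash
# Hypothetical helpers (not part of flepiMoP or git): classify a branch name
# by the prefix conventions described on this page.
branch_kind() {
  case "$1" in
    main|dev)      echo "long-lived" ;;
    feature/*/?*)  echo "feature" ;;     # feature/<GitHub issue>/<description>
    hotfix/?*)     echo "hotfix" ;;
    operations/?*) echo "operations" ;;
    *)             echo "unknown" ;;
  esac
}

# Branch that a new branch of the given kind is checked out from.
base_branch_for() {
  case "$(branch_kind "$1")" in
    feature)           echo "dev" ;;
    hotfix|operations) echo "main" ;;
    *)                 echo "n/a" ;;
  esac
}

for b in feature/99/cool-new-thing hotfix/important-fix-to-something operations/flu-SMH-2023-24; do
  printf '%s -> %s, checked out from %s\n' "$b" "$(branch_kind "$b")" "$(base_branch_for "$b")"
done
```

A name such as `feature/no-issue` is reported as `unknown` here because the convention places the GitHub issue between the prefix and the description.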
@@ -78,58 +78,3 @@ For those using a Mac or Linux system for development this command is also avail
```bash
cp -f bin/pre-commit .git/hooks/
```

#### Structure of the main classes

The code is structured so that each of the main classes **owns** a config segment, and only this class should parse and build the related object. To access this information, other classes first need to build the object.

{% hint style="warning" %}
The rest of this page is still under construction.
{% endhint %}

The main classes are:

* `Coordinates:` this is a light class that stores all the coordinates needed by every other class (e.g the time serie
* `Parameter`
* `Compartments`
* `Modifers`
* `Seeding`,
* `InitialConditions`
* a `writeDF`
* function to plot
* (TODO: detail pipeline internal API)

### Batch folder

Here are some notes useful to improve the batch submission:

Setup site wide Rprofile.

```
export R_PROFILE=$COVID_PATH/slurm_batch/Rprofile
```

> SLURM copies your environment variables by default. You don't need to tell it to set a variable on the command line for sbatch. Just set the variable in your environment before calling sbatch.
> There are two useful environment variables that SLURM sets up when you use job arrays:
> SLURM\_ARRAY\_JOB\_ID, specifies the array's master job ID number. SLURM\_ARRAY\_TASK\_ID, specifies the job array index number. https://help.rc.ufl.edu/doc/Using\_Variables\_in\_SLURM\_Jobs
SLURM does not support using variables in the `#SBATCH` lines within a job script (for example, `#SBATCH -N=$REPS` will NOT work). Only a very limited number of variables are available in `#SBATCH` lines, such as `%j` for the job ID. However, values passed on the command line take precedence over values defined in the job script, and you can use variables on the command line. For example, the job name and output/error files can be passed on the `sbatch` command line:

```
RUNTYPE='test'
RUNNUMBER=5
sbatch --job-name=$RUNTYPE.$RUNNUMBER.run --output=$RUNTYPE.$RUNNUMBER.txt --export=A=$A,b=$b jobscript.sbatch
```

However note in this example, the output file doesn't have the job ID which is not available from the command line, only inside the sbatch shell script.

#### File descriptions

`launch_job.py` and `runner.py` for non-inference jobs

`inference_job.py` launches a slurm or aws job, where it uses

* `inference_runner.sh` and `inference_copy.sh` for aws
* `batch/inference_job.run` for slurm
@@ -191,10 +191,10 @@ where:

#### Non-inference run

Stay in the `$DATA_PATH` folder, and run a simulation directly from forward-simulation Python package `gempyor`. To do this, call `gempyor-simulate` providing the name of the configuration file you want to run (ex. `config.yml`). An example config is provided in `flepimop_sample/config_sample_2pop_interventions.yml.`
Stay in the `$DATA_PATH` folder, and run a simulation directly from the forward-simulation Python package `gempyor`. To do this, call `flepimop simulate`, providing the name of the configuration file you want to run (e.g. `config.yml`). An example config is provided in `flepimop_sample/config_sample_2pop_interventions.yml`.

```
gempyor-simulate -c config.yml
flepimop simulate config.yml
```

{% hint style="warning" %}
@@ -192,10 +192,10 @@ flepimop-inference-main -j 1 -n 1 -k 1 -c config.yml

### Non-inference run

Stay in the `$DATA_PATH` folder, and run a simulation directly from forward-simulation Python package `gempyor,`call `gempyor-simulate` providing the name of the configuration file you want to run (ex. `config.yml` ;
Stay in the `$DATA_PATH` folder, and run a simulation directly from the forward-simulation Python package `gempyor`. Call `flepimop simulate`, providing the name of the configuration file you want to run (e.g. `config.yml`):

```
gempyor-simulate -c config.yml
flepimop simulate config.yml
```

{% hint style="warning" %}
Expand All @@ -216,7 +216,7 @@ Rscript build/local_install.R
pip install --no-deps -e flepimop/gempyor_pkg/
cd $DATA_PATH
rm -rf model_output
gempyor-simulate -c config.yml
flepimop simulate config.yml
</code></pre>

## Finishing up
77 changes: 77 additions & 0 deletions documentation/gitbook/how-to-run/multi-configs.md
@@ -0,0 +1,77 @@
---
description: >-
Patching together multiple configuration files.
---

# Using Multiple Configuration Files

## 🧱 Set up

Create a sample project by copying from the examples folder:

```bash
mkdir myflepimopexample # or wherever
cd myflepimopexample
cp -r $FLEPI_PATH/examples/tutorials/* .
ls
```

You should see an assortment of yml files as a result of that `ls` command.

## Usage

If you run

```bash
flepimop simulate config_sample_2pop.yml
```

you'll get a basic forward simulation of this example model. However, you might also note there are several `*_part.yml` files, corresponding to partial configs. You can `simulate` using the combination of multiple configs with, for example:

```bash
flepimop simulate config_sample_2pop.yml config_sample_2pop_outcomes_part.yml
```

If you want to see what the combined configuration is, you can use the `patch` command:

```bash
flepimop patch config_sample_2pop.yml config_sample_2pop_outcomes_part.yml
```

You may provide an arbitrary number of separate configuration files to combine to create a complete configuration.

## Caveats

At this time, only `simulate` supports multiple configuration files. Also, the patching operation is fairly crude: configuration options override previous ones completely, though with a warning. The files provided from left to right are from lowest priority (i.e. for the first file, only options specified in no other files are used) to highest priority (i.e. for the last file, its options override any other specification).

We are expanding coverage of this capability to other flepimop actions, e.g. inference, and are exploring options for smarter patching.

However, there are currently pitfalls. For example, given:

```yaml
# config1
seir_modifiers:
  scenarios: ["one", "two"]
  one:
    # ...
  two:
    # ...
```

```yaml
# config2
seir_modifiers:
  scenarios: ["one", "three"]
  one:
    # ...
  three:
    # ...
```

Then you might expect

```bash
flepimop simulate config1.yml config2.yml
```

...to override seir scenario one and add scenario three, but what actually happens is that the entire `seir_modifiers` from config1 is overridden by config2. Specifying the configuration files in the reverse order would lead to a different outcome (the config1 `seir_modifiers` overrides config2 settings). If you're doing complex combinations of configuration files, you should use `flepimop patch ...` to ensure you're getting what you expect.
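
The override behaviour described above can be illustrated with a toy model. The sketch below is not how `flepimop patch` is implemented; it only assumes the rule stated in the caveats, namely that a later file replaces an earlier file's top-level option wholesale. Each "config" is reduced to one `key=value` line per top-level key, and the function name `merge_top_level` and the sample values are hypothetical.

```bash
# Toy model only: each "config" is one key=value line per top-level key.
# This is NOT flepimop's implementation, just the "last file wins" rule.
merge_top_level() {
  # Keep the last value seen for each key, preserving first-seen key order.
  awk -F'=' '
    !($1 in value) { order[++n] = $1 }
    { value[$1] = $2 }
    END { for (i = 1; i <= n; i++) print order[i] "=" value[order[i]] }
  '
}

config1='name=sample
seir_modifiers=scenarios:[one,two]'
config2='seir_modifiers=scenarios:[one,three]'

# Lowest priority first, highest priority last.
printf '%s\n' "$config1" "$config2" | merge_top_level
```

In this toy model the merged `seir_modifiers` line comes entirely from the second input, so scenario `two` disappears, while `name` survives because no later input sets it. Swapping the two inputs keeps `[one,two]` instead.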
10 changes: 5 additions & 5 deletions documentation/gitbook/how-to-run/quick-start-guide.md
@@ -170,10 +170,10 @@ To get started, let's start with just running a forward simulation (non-inferenc

### Non-inference run

Stay in the `PROJECT_PATH` folder, and run a simulation directly from forward-simulation Python package gempyor. Call `gempyor-simulate` providing the name of the configuration file you want to run. For example here, we use `config_sample_2pop.yml` from _flepimop\_sample_.
Stay in the `PROJECT_PATH` folder, and run a simulation directly from the forward-simulation Python package `gempyor`. Call `flepimop simulate`, providing the name of the configuration file you want to run. For example, here we use `config_sample_2pop.yml` from _flepimop\_sample_.

```
gempyor-simulate -c config_sample_2pop.yml
flepimop simulate config_sample_2pop.yml
```

This will produce a `model_output` folder, which you can examine using the provided post-processing functions and scripts.
@@ -189,14 +189,14 @@ cd $FLEPI_PATH
pip install --no-deps -e flepimop/gempyor_pkg/
cd $PROJECT_PATH
rm -rf model_output
gempyor-simulate -c config.yml
flepimop simulate config.yml
```

Note that you only have to re-run the installation steps once each time you update any of the files in the flepimop repository (either by pulling changes made by the developers and stored on GitHub, or by changing them yourself). If you're just running the same or a different configuration file, just repeat the final steps:

```
rm -rf model_output
gempyor-simulate -c new_config.yml
flepimop simulate new_config.yml
```

### Inference run
@@ -257,7 +257,7 @@ Rscript $FLEPI_PATH/flepimop/main_scripts/inference_main.R -c config_inference_n

## 📈 Examining model output

If your run is successful, you should see your output files in the model\_output folder. The structure of the files in this folder is described in the [Model Output](../gempyor/output-files.md) section. By default, all the output files are .parquet format (a compressed format which can be imported as dataframes using R's arrow package `arrow::read_parquet` or using the free desktop application [Tad ](https://www.tadviewer.com/)for quick viewing). However, you can add the option `--write-csv` to the end of the commands to run the code (e.g., `> gempyor-simulate -c config.yml --write-csv)` to have everything saved as .csv files instead ;
If your run is successful, you should see your output files in the model\_output folder. The structure of the files in this folder is described in the [Model Output](../gempyor/output-files.md) section. By default, all the output files are .parquet format (a compressed format which can be imported as dataframes using R's arrow package `arrow::read_parquet` or using the free desktop application [Tad](https://www.tadviewer.com/) for quick viewing). However, you can add the option `--write-csv` to the run command (e.g., `flepimop simulate --write-csv config.yml`) to have everything saved as .csv files instead.

## 🪜 Next steps
