Merge branch 'dev' into scientific_notation_parsing

HopkinsIDD · Dec 6, 2024 · 9c6d3dd
2 parents c97841d + 48c2425
commit 9c6d3dd

Showing 89 changed files with 5,113 additions and 19,806 deletions.
diff --git a/.github/workflows/conda-env.yml b/.github/workflows/conda-env.yml
@@ -7,12 +7,13 @@ on:
       - build/create_environment_yml.R
       - flepimop/R_packages/*/DESCRIPTION
     branches:
-      - main
+      - dev
   pull_request:
     paths:
       - build/create_environment_yml.R
       - flepimop/R_packages/*/DESCRIPTION
     branches:
+      - dev
       - main
 
 jobs:

diff --git a/.github/workflows/flepicommon-ci.yml b/.github/workflows/flepicommon-ci.yml
@@ -1,16 +1,17 @@
-name: flepicommon-ci
+name: flepicommon CI
 
 on:
   workflow_dispatch:
   push:
     paths:
       - flepimop/R_packages/flepicommon/**/*
     branches:
-      - main
+      - dev
   pull_request:
     paths:
       - flepimop/R_packages/flepicommon/**/*
     branches:
+      - dev
       - main
 
 jobs:

diff --git a/.github/workflows/gempyor-ci.yml b/.github/workflows/gempyor-ci.yml
@@ -1,4 +1,4 @@
-name: gempyor-ci
+name: gempyor CI
 
 on:
   workflow_dispatch:
@@ -7,12 +7,13 @@ on:
       - examples/**/*
       - flepimop/gempyor_pkg/**/*
     branches:
-      - main
+      - dev
   pull_request:
     paths:
       - examples/**/*
       - flepimop/gempyor_pkg/**/*
     branches:
+      - dev
       - main
 
 jobs:

diff --git a/.github/workflows/inference-ci.yml b/.github/workflows/inference-ci.yml
@@ -1,4 +1,4 @@
-name: inference-ci
+name: inference CI
 
 on:
   workflow_dispatch:
@@ -10,11 +10,12 @@ on:
     paths:
       - flepimop/R_packages/inference/**/*
     branches:
-      - main
+      - dev
   pull_request:
     paths:
       - flepimop/R_packages/inference/**/*
     branches:
+      - dev
       - main
 
 jobs:

diff --git a/.gitignore b/.gitignore
@@ -64,7 +64,7 @@ packrat/lib*/
 dist/
 SEIR.egg-info/
 Outcomes.egg-info/
-venv/
+*venv*/
 .venv/
 
 # R package manuals

diff --git a/NEWS.md b/NEWS.md
@@ -0,0 +1,27 @@
+# December 11th, 2024
+
+**Bug Fixes:**
+
+- The HPC init script no longer fails if a conda environment is already active, GH-388.
+- Stdout/stderr from `flepimop-inference-slot` called by `flepimop-inference-main` are now redirected to a log file via `system2` rather than passed through pipes, to support MINGW64, GH-289.
+
+**Dependencies:**
+
+- `click` minimum is now 8.1.7 (latest as of Aug 17, 2023).
+- Added missing `h5py` dependency to `gempyor` requirements and specified `dask` dependency to include `dataframe` optional dependencies, GH-391.
+
+**Deprecates:**
+
+- `gempyor-simulate ...` in favor of `flepimop simulate ...`.
+- Soft deprecated the `-c/--config_files` option (config file(s) are now *arguments* not options).
+
+**New Features:**
+
+- Basic support for multiple config files
+- A `patch` command that takes multiple config files and yields the merged result
+- Converted `gempyor`'s `setup.cfg` to the more modern `pyproject.toml`, GH-391. No user facing changes.
+- Added `flepimop modifiers` subcommand with one action, `config-plot`, for plotting the effects of modifiers on a config, GH-404.
+
+**Removes/Modifies:**
+
+- `gempyor-(seir|outcomes) ...` - these were already no longer supported, just pruning entry points
diff --git a/batch/hpc_init.sh b/batch/hpc_init.sh
@@ -52,6 +52,14 @@ if [ -z "${FLEPI_CONDA}" ]; then
     fi
     echo "Using '$FLEPI_CONDA' for \$FLEPI_CONDA."
 fi
+CURRENT_CONDA_ENV=$( conda info | grep "active environment" | awk -F ':' '{print $2}' | xargs )
+if [ "$CURRENT_CONDA_ENV" = "$FLEPI_CONDA" ]; then
+    echo "Detected the active conda environment is '$FLEPI_CONDA' already, but will refresh."
+    conda deactivate
+elif [ "$CURRENT_CONDA_ENV" != "None" ]; then
+    echo "Detected an active conda environment '$CURRENT_CONDA_ENV'. This will be deactivated and the '$FLEPI_CONDA' environment will be activated."
+    conda deactivate
+fi
 conda activate $FLEPI_CONDA
 
 # Check the conda environment is valid
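
The decision made by the new lines in the hunk above can be sketched in Python. This is only an illustration of the logic, not part of `batch/hpc_init.sh`: the function names, the sample `conda info` text and the environment name `flepimop-env` are hypothetical, and the sketch assumes `conda info` prints a line of the form `active environment : <name>`, which is what the shell pipeline also relies on.

```python
def active_conda_env(conda_info: str) -> str:
    """Approximate: grep "active environment" | awk -F ':' '{print $2}' | xargs."""
    for line in conda_info.splitlines():
        if "active environment" in line:
            fields = line.split(":")
            # awk's $2 is the second colon-separated field; xargs trims whitespace.
            return fields[1].strip() if len(fields) > 1 else ""
    return ""


def should_deactivate(current_env: str, target_env: str) -> bool:
    """Mirror the if/elif: deactivate when the target or any other env is active."""
    if current_env == target_env:
        return True  # already active, but the script refreshes it
    return current_env != "None"  # conda reports "None" when nothing is active


# Hypothetical sample of the relevant part of `conda info` output.
sample_info = """
     active environment : base
    active env location : /opt/conda
"""

current = active_conda_env(sample_info)
print(current, should_deactivate(current, "flepimop-env"))
```

In every case the script then runs `conda activate $FLEPI_CONDA`; the check only decides whether a `conda deactivate` happens first.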

diff --git a/documentation/gitbook/SUMMARY.md b/documentation/gitbook/SUMMARY.md
@@ -48,6 +48,7 @@
 
 * [Before any run](how-to-run/before-any-run.md)
 * [Quick Start Guide](how-to-run/quick-start-guide.md)
+* [Multiple Configuration Files](how-to-run/multi-configs.md)
 * [Advanced run guides](how-to-run/advanced-run-guides/README.md)
   * [Running with Docker locally 🛳](how-to-run/advanced-run-guides/running-with-docker-locally.md)
   * [Running locally in a conda environment 🐍](how-to-run/advanced-run-guides/quick-start-guide-conda.md)
@@ -57,8 +58,9 @@
 * [Useful commands](how-to-run/useful-commands.md)
 * [Tips, tricks, FAQ](how-to-run/tips-tricks-faq.md)
 
-## 🗜️ Development
+## [Development](./development/README.md)
 
+* [Git and GitHub Usage](./development/git-and-github-usage.md)
 * [Guidelines for contributors](development/python-guidelines-for-developers.md)
 
 ## Deprecated pages

diff --git a/documentation/gitbook/development/README.md b/documentation/gitbook/development/README.md
@@ -0,0 +1,3 @@
+# Development
+
+This section covers development/contribution guidelines, including tutorials on how to set up your environment and guides on how we use git/GitHub.
diff --git a/documentation/gitbook/development/git-and-github-usage.md b/documentation/gitbook/development/git-and-github-usage.md
@@ -0,0 +1,27 @@
+# Git and GitHub Usage
+
+We now use a modified Gitflow-style workflow for working with git and GitHub. For a detailed overview of this topic, please refer to [Atlassian's article on Gitflow workflow](https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow).
+
+
+## New Features
+
+New features should be developed in a new branch checked out from the `dev` branch and merged back into `dev` via a PR on GitHub when ready for review. These feature branches can be deleted after merging into `dev`, unless someone from operations requests that one be kept around. For example, operations may want to merge the feature into their operational branch to get new functionality in advance of a release. By convention, feature branches should be prefixed with `feature/<GitHub issue>/`, e.g. `feature/99/cool-new-thing`. Feature branches should also include edits to the GitBook documentation that describe their changes.
+
+
+## Hot Fixes
+
+Hot fixes should be developed in a new branch checked out from the `main` branch and merged back into `main` via a PR on GitHub when ready for review. After successfully merging into `main`, the hot fix branch should then be merged into `dev`, making appropriate adjustments to stabilize the feature. The priority for hot fixes is to correct a major issue quickly, so it is okay to delay detailed testing/documentation until merging into `dev`. By convention, hot fix branches should be prefixed with `hotfix/`, e.g. `hotfix/important-fix-to-something`, and then converted into a feature branch after merging into `main`. Hot fixes do not have to include edits to the GitBook documentation, but doing so is **strongly recommended** if the hot fix conflicts with what the GitBook documentation describes.
+
+
+## Creating Releases
+
+Periodically, releases will be created by merging the `dev` branch into `main` via a PR on GitHub and creating a new release from the `main` branch after merging. These PRs should avoid discussion of individual feature changes; those discussions should be reserved for and handled in the feature PRs. If a feature poses a significant problem in the process of creating a new release, the required changes should be treated like a new feature. The main purposes of this PR are to:
+
+1. Resolve merge conflicts generated by hot fixes,
+2. Make minor edits to documentation to make it clearer or more cohesive, and
+3. Update the `NEWS.md` file with a summary of the changes included in the release.
+
+
+## Operations
+
+Operational work should be developed in a new branch checked out from the `main` branch if there are modifications needed to the `flepiMoP` codebase. Pre-released features can be merged directly into this operational branch from the corresponding feature branch as needed via a git merge or rebase, not a GitHub PR. After the operational cycle is over, the operations branch **should not** be deleted; it should instead be kept around for archival reasons. Operational work needs to move quickly and usually does not involve documenting or testing code, so it is unsuitable for merging into `dev` or `main` directly. Instead, potential features should be extracted from an operations branch into a feature branch using [git cherry-pick](https://git-scm.com/docs/git-cherry-pick) and then modified into an appropriate state for merging into `dev` like a feature branch. By convention, operations branch names should be prefixed with `operations/`, e.g. `operations/flu-SMH-2023-24`.
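
The branch conventions added in `git-and-github-usage.md` can be summarized in a small Python sketch that maps a branch name to the branch it should be checked out from. The helper `base_branch_for` is hypothetical and not part of flepiMoP, and treating `<GitHub issue>` as a numeric issue number is an assumption drawn from the `feature/99/cool-new-thing` example.

```python
import re

# Branch prefix -> branch to check out from, per the conventions in this file.
BASE_BRANCH = {"feature": "dev", "hotfix": "main", "operations": "main"}


def base_branch_for(branch: str) -> str:
    """Return the base branch for a conventionally named branch (hypothetical helper)."""
    kind, _, rest = branch.partition("/")
    if kind not in BASE_BRANCH or not rest:
        raise ValueError(f"'{branch}' does not use a known prefix: {sorted(BASE_BRANCH)}")
    # Assumption: the issue segment of a feature branch is a numeric issue number.
    if kind == "feature" and not re.match(r"\d+/.+", rest):
        raise ValueError("feature branches should look like feature/<GitHub issue>/<name>")
    return BASE_BRANCH[kind]


for name in ("feature/99/cool-new-thing",
             "hotfix/important-fix-to-something",
             "operations/flu-SMH-2023-24"):
    print(name, "->", base_branch_for(name))
```

A check like this could run in a pre-push hook or CI job, although the documentation in this commit does not require one.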
diff --git a/documentation/gitbook/development/python-guidelines-for-developers.md b/documentation/gitbook/development/python-guidelines-for-developers.md
@@ -78,58 +78,3 @@ For those using a Mac or Linux system for development this command is also avail
 ```bash
 cp -f bin/pre-commit .git/hooks/
 ```
-
-#### Structure of the main classes
-
-The code is structured so that each of the main classes **owns** a config segment, and only this class should parse and build the related object. To access this information, other classes first need to build the object.
-
-{% hint style="warning" %}
-Below, this page is still underconstruction
-{% endhint %}
-
-The main classes are:
-
-* `Coordinates:` this is a light class that stores all the coordinates needed by every other class (e.g the time serie
-* `Parameter`
-* `Compartments`
-* `Modifers`
-* `Seeding`,
-* `InitialConditions`
-* a `writeDF`
-* function to plot
-* (TODO: detail pipeline internal API)
-
-### Batch folder
-
-Here are some notes useful to improve the batch submission:
-
-Setup site wide Rprofile.
-
-```
-export R_PROFILE=$COVID_PATH/slurm_batch/Rprofile
-```
-
-> SLURM copies your environment variables by default. You don't need to tell it to set a variable on the command line for sbatch. Just set the variable in your environment before calling sbatch.
-
-> There are two useful environment variables that SLURM sets up when you use job arrays:
-
-> SLURM\_ARRAY\_JOB\_ID, specifies the array's master job ID number. SLURM\_ARRAY\_TASK\_ID, specifies the job array index number. https://help.rc.ufl.edu/doc/Using\_Variables\_in\_SLURM\_Jobs
-
-SLURM does not support using variables in the #SBATCH lines within a job script (for example, #SBATCH -N=$REPS will NOT work). A very limited number of variables are available in the #SBATCH just as %j for JOB ID. However, values passed from the command line have precedence over values defined in the job script. and you could use variables in the command line. For example, you could set the job name and output/error files can be passed on the sbatch command line:
-
-```
-RUNTYPE='test'
-RUNNUMBER=5
-sbatch --job-name=$RUNTYPE.$RUNNUMBER.run --output=$RUNTYPE.$RUNUMBER.txt --export=A=$A,b=$b jobscript.sbatch
-```
-
-However note in this example, the output file doesn't have the job ID which is not available from the command line, only inside the sbatch shell script.
-
-#### File descriptions
-
-launch\_job.py and runner.py for non inference job
-
-inference\_job.py launch a slurm or aws job, where it uses
-
-* \`inference\_runner.sh\` and inference\_copy.sh for aws
-* ;batch/inference\_job.run for slurm
diff --git a/documentation/gitbook/how-to-run/advanced-run-guides/quick-start-guide-conda.md b/documentation/gitbook/how-to-run/advanced-run-guides/quick-start-guide-conda.md
@@ -191,10 +191,10 @@ where:
 
 #### Non-inference run
 
-Stay in the `$DATA_PATH` folder, and run a simulation directly from forward-simulation Python package `gempyor`. To do this, call `gempyor-simulate` providing the name of the configuration file you want to run (ex. `config.yml`). An example config is provided in `flepimop_sample/config_sample_2pop_interventions.yml.`
+Stay in the `$DATA_PATH` folder, and run a simulation directly from the forward-simulation Python package `gempyor`. To do this, call `flepimop simulate`, providing the name of the configuration file you want to run (e.g. `config.yml`). An example config is provided in `flepimop_sample/config_sample_2pop_interventions.yml`.
 
 ```
-gempyor-simulate -c config.yml
+flepimop simulate config.yml
 ```
 
 {% hint style="warning" %}

diff --git a/...mentation/gitbook/how-to-run/advanced-run-guides/running-with-docker-locally.md b/...mentation/gitbook/how-to-run/advanced-run-guides/running-with-docker-locally.md
@@ -192,10 +192,10 @@ flepimop-inference-main -j 1 -n 1 -k 1 -c config.yml
 
 ### Non-inference run
 
-Stay in the `$DATA_PATH` folder, and run a simulation directly from forward-simulation Python package `gempyor,`call `gempyor-simulate` providing the name of the configuration file you want to run (ex. `config.yml` ;
+Stay in the `$DATA_PATH` folder, and run a simulation directly from the forward-simulation Python package `gempyor`. To do this, call `flepimop simulate`, providing the name of the configuration file you want to run (e.g. `config.yml`):
 
 ```
-gempyor-simulate -c config.yml
+flepimop simulate config.yml
 ```
 
 {% hint style="warning" %}
@@ -216,7 +216,7 @@ Rscript build/local_install.R
 pip install --no-deps -e flepimop/gempyor_pkg/
 cd $DATA_PATH
 rm -rf model_output
-gempyor-simulate -c config.yml
+flepimop simulate config.yml
 </code></pre>
 
 ## Finishing up

diff --git a/documentation/gitbook/how-to-run/multi-configs.md b/documentation/gitbook/how-to-run/multi-configs.md
@@ -0,0 +1,77 @@
+---
+description: >-
+  Patching together multiple configuration files.
+---
+
+# Using Multiple Configuration Files
+
+## 🧱 Set up
+
+Create a sample project by copying from the examples folder:
+
+```bash
+mkdir myflepimopexample # or wherever
+cd myflepimopexample
+cp -r $FLEPI_PATH/examples/tutorials/* .
+ls
+```
+
+You should see an assortment of yml files as a result of that `ls` command.
+
+## Usage
+
+If you run
+
+```bash
+flepimop simulate config_sample_2pop.yml
+```
+
+you'll get a basic forward simulation of this example model. However, you might also note there are several `*_part.yml` files, corresponding to partial configs. You can `simulate` using the combination of multiple configs with, for example:
+
+```bash
+flepimop simulate config_sample_2pop.yml config_sample_2pop_outcomes_part.yml
+```
+
+If you want to see what the combined configuration is, you can use the `patch` command:
+
+```bash
+flepimop patch config_sample_2pop.yml config_sample_2pop_outcomes_part.yml
+```
+
+You may provide an arbitrary number of separate configuration files to combine to create a complete configuration.
+
+## Caveats
+
+At this time, only `simulate` supports multiple configuration files. Also, the patching operation is fairly crude: configuration options override previous ones completely, though with a warning. The files provided from left to right are from lowest priority (i.e. for the first file, only options specified in no other files are used) to highest priority (i.e. for the last file, its options override any other specification).
+
+We are expanding coverage of this capability to other flepimop actions, e.g. inference, and are exploring options for smarter patching.
+
+However, there are currently pitfalls. For example, given the following two configs:
+
+```yaml
+# config1
+seir_modifiers:
+  scenarios: ["one", "two"]
+  one:
+    # ...
+  two:
+    # ...
+```
+
+```yaml
+# config2
+seir_modifiers:
+  scenarios: ["one", "three"]
+  one:
+    # ...
+  three:
+    # ...
+```
+
+Then you might expect
+
+```bash
+flepimop simulate config1.yml config2.yml
+```
+
+...to override seir scenario one and add scenario three, but what actually happens is that the entire `seir_modifiers` section from config1 is overridden by config2. Specifying the configuration files in the reverse order would lead to a different outcome (the config1 `seir_modifiers` section overrides the config2 settings). If you're doing complex combinations of configuration files, you should use `flepimop patch ...` to ensure you're getting what you expect.
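
The override behavior described in the caveats of `multi-configs.md` can be illustrated with a minimal Python sketch. It assumes patching replaces whole top-level keys from left to right, as the text states; `patch_configs` is a hypothetical stand-in, not the function `flepimop patch` actually uses, and plain dictionaries stand in for the parsed YAML files.

```python
from typing import Any


def patch_configs(*configs: dict[str, Any]) -> dict[str, Any]:
    """Merge left to right; a later top-level key replaces an earlier one wholesale."""
    merged: dict[str, Any] = {}
    for config in configs:
        for key, value in config.items():
            if key in merged:
                print(f"warning: '{key}' is overridden by a later config")
            merged[key] = value  # no recursive merge of nested sections
    return merged


# Dictionary equivalents of config1 and config2 from the caveats section.
config1 = {"seir_modifiers": {"scenarios": ["one", "two"], "one": {}, "two": {}}}
config2 = {"seir_modifiers": {"scenarios": ["one", "three"], "one": {}, "three": {}}}

merged = patch_configs(config1, config2)
print(sorted(merged["seir_modifiers"]))  # ['one', 'scenarios', 'three']
```

Scenario `two` disappears because the whole `seir_modifiers` section from the first file is discarded. Swapping the argument order keeps `two` and drops `three` instead.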
diff --git a/documentation/gitbook/how-to-run/quick-start-guide.md b/documentation/gitbook/how-to-run/quick-start-guide.md
@@ -170,10 +170,10 @@ To get started, let's start with just running a forward simulation (non-inferenc
 
 ### Non-inference run
 
-Stay in the `PROJECT_PATH` folder, and run a simulation directly from forward-simulation Python package gempyor. Call `gempyor-simulate` providing the name of the configuration file you want to run. For example here, we use `config_sample_2pop.yml` from _flepimop\_sample_.
+Stay in the `PROJECT_PATH` folder, and run a simulation directly from the forward-simulation Python package gempyor. Call `flepimop simulate`, providing the name of the configuration file you want to run. For example, here we use `config_sample_2pop.yml` from _flepimop\_sample_.
 
 ```
-gempyor-simulate -c config_sample_2pop.yml
+flepimop simulate config_sample_2pop.yml
 ```
 
 This will produce a `model_output` folder, which you can look using provided post-processing functions and scripts.
@@ -189,14 +189,14 @@ cd $FLEPI_PATH
 pip install --no-deps -e flepimop/gempyor_pkg/
 cd $PROJECT_PATH
 rm -rf model_output
-gempyor-simulate -c config.yml
+flepimop simulate config.yml
 ```
 
 Note that you only have to re-run the installation steps once each time you update any of the files in the flepimop repository (either by pulling changes made by the developers and stored on Github, or by changing them yourself). If you're just running the same or different configuration file, just repeat the final steps
 
 ```
 rm -rf model_output
-gempyor-simulate -c new_config.yml
+flepimop simulate new_config.yml
 ```
 
 ### Inference run
@@ -257,7 +257,7 @@ Rscript $FLEPI_PATH/flepimop/main_scripts/inference_main.R -c config_inference_n
 
 ## 📈 Examining model output
 
-If your run is successful, you should see your output files in the model\_output folder. The structure of the files in this folder is described in the [Model Output](../gempyor/output-files.md) section. By default, all the output files are .parquet format (a compressed format which can be imported as dataframes using R's arrow package `arrow::read_parquet` or using the free desktop application [Tad ](https://www.tadviewer.com/)for quick viewing). However, you can add the option `--write-csv` to the end of the commands to run the code (e.g., `> gempyor-simulate -c config.yml --write-csv)` to have everything saved as .csv files instead ;
+If your run is successful, you should see your output files in the model\_output folder. The structure of the files in this folder is described in the [Model Output](../gempyor/output-files.md) section. By default, all the output files are in .parquet format (a compressed format which can be imported as dataframes using R's arrow package `arrow::read_parquet` or using the free desktop application [Tad](https://www.tadviewer.com/) for quick viewing). However, you can add the option `--write-csv` to the commands that run the code (e.g., `flepimop simulate --write-csv config.yml`) to have everything saved as .csv files instead.
 
 ## 🪜 Next steps