Merge pull request #156 from TeaganKing/cleanup_2_v2

Cleanup PR #2 (v2)
NCAR · Nov 22, 2024 · ee4ac42
2 parents 71c4ef8 + 6aba58d

Showing 14 changed files with 239 additions and 120 deletions.
diff --git a/NCARtips.md b/NCARtips.md
@@ -8,25 +8,25 @@ There are two ways to request multiple cores on either casper or derecho.
 Both cases are requesting 12 cores and 120 GB of memory.
 
 
-The recommended approach releases the cores immediately after `cupid-run` finishes:
+The recommended approach releases the cores immediately after `cupid-diagnostics` finishes:
 
 ```
-[login-node] $ conda activate cupid-dev
-(cupid-dev) [login-node] $ qcmd -l select=1:ncpus=12:mem=120GB -- cupid-run
+[login-node] $ conda activate cupid-infrastructure
+(cupid-infrastructure) [login-node] $ qcmd -l select=1:ncpus=12:mem=120GB -- cupid-diagnostics
 ```
 
-Alternatively, you can start an interactive session and remain on the compute nodes after `cupid-run` completes:
+Alternatively, you can start an interactive session and remain on the compute nodes after `cupid-diagnostics` completes:
 
 ```
 [login-node] $ qinteractive -l select=1:ncpus=12:mem=120GB
-[compute-node] $ conda activate cupid-dev
-(cupid-dev) [compute-node] $ cupid-run
+[compute-node] $ conda activate cupid-infrastructure
+(cupid-infrastructure) [compute-node] $ cupid-diagnostics
 ```
 
 Notes:
 1. If you chose to run on derecho, specify the `develop` queue by adding the option `-q develop` to either `qcmd` or `qinteractive`
    (the `develop` queue is a shared resource and you are charged by the core hour rather than the node hour).
-1. `cupid-build` is not computationally expensive, and can be run on a login node for either machine.
+1. `cupid-webpage` is not computationally expensive, and can be run on a login node for either machine.
 
 ## Looking at Output
 

diff --git a/README.md b/README.md
@@ -24,9 +24,9 @@ Then `cd` into the `CUPiD` directory and build the necessary conda environments
 
 ``` bash
 $ cd CUPiD
-$ mamba env create -f environments/dev-environment.yml
-$ conda activate cupid-dev
-$ which cupid-run
+$ mamba env create -f environments/cupid-infrastructure.yml
+$ conda activate cupid-infrastructure
+$ which cupid-diagnostics
 $ mamba env create -f environments/cupid-analysis.yml
 ```
 
@@ -38,14 +38,14 @@ If you do not have `mamba` installed, you can still use `conda`... it will just
 (To see what version of conda you have installed, run `conda --version`.)
 1. If the subdirectories in `externals/` are all empty, run `git submodule update --init` to clone the submodules.
 1. For existing users who cloned `CUPiD` prior to the switch from manage externals to git submodule, we recommend removing `externals/` before checking out main, running `git submodule update --init`, and removing `manage_externals` (if it is still present after `git submodule update --init`).
-1. If `which cupid-run` returned the error `which: no cupid-run in ($PATH)`, then please run the following:
+1. If `which cupid-diagnostics` returned the error `which: no cupid-diagnostics in ($PATH)`, then please run the following:
 
    ``` bash
-   $ conda activate cupid-dev
+   $ conda activate cupid-infrastructure
    $ pip install -e .  # installs cupid
    ```
 
-1. In the `cupid-dev` environment, run `pre-commit install` to configure `git` to automatically run `pre-commit` checks when you try to commit changes from the `cupid-dev` environment; the commit will only proceed if all checks pass. Note that CUPiD uses `pre-commit` to ensure code formatting guidelines are followed, and pull requests will not be accepted if they fail the `pre-commit`-based Github Action.
+1. In the `cupid-infrastructure` environment, run `pre-commit install` to configure `git` to automatically run `pre-commit` checks when you try to commit changes from the `cupid-infrastructure` environment; the commit will only proceed if all checks pass. Note that CUPiD uses `pre-commit` to ensure code formatting guidelines are followed, and pull requests will not be accepted if they fail the `pre-commit`-based Github Action.
 1. If you plan on contributing code to CUPiD,
 whether developing CUPiD itself or providing notebooks for CUPiD to run,
 please see the [Contributor's Guide](https://ncar.github.io/CUPiD/contributors_guide.html).
@@ -56,11 +56,11 @@ CUPiD currently provides an example for generating diagnostics.
 To test the package out, try to run `examples/key-metrics`:
 
 ``` bash
-$ conda activate cupid-dev
+$ conda activate cupid-infrastructure
 $ cd examples/key_metrics
 $ # machine-dependent: request multiple compute cores
-$ cupid-run
-$ cupid-build  # Will build HTML from Jupyter Book
+$ cupid-diagnostics
+$ cupid-webpage  # Will build HTML from Jupyter Book
 ```
 
 After the last step is finished, you can use Jupyter to view generated notebooks in `${CUPID_ROOT}/examples/key-metrics/computed_notebooks`
@@ -74,7 +74,7 @@ Notes:
    (cupid-analysis) $ python -m ipykernel install --user --name=cupid-analysis
    ```
 
-Furthermore, to clear the `computed_notebooks` folder which was generated by the `cupid-run` and `cupid-build` commands, you can run the following command:
+Furthermore, to clear the `computed_notebooks` folder which was generated by the `cupid-diagnostics` and `cupid-webpage` commands, you can run the following command:
 
 ``` bash
 $ cupid-clear
@@ -87,8 +87,8 @@ This will clear the `computed_notebooks` folder which is at the location pointed
 Most of CUPiD's configuration is done via the `config.yml` file, but there are a few command line options as well:
 
 ```bash
-(cupid-dev) $ cupid-run -h
-Usage: cupid-run [OPTIONS] CONFIG_PATH
+(cupid-infrastructure) $ cupid-diagnostics -h
+Usage: cupid-diagnostics [OPTIONS] CONFIG_PATH
 
   Main engine to set up running all the notebooks.
 
@@ -122,8 +122,8 @@ client
 
 #### Specifying components
 
-If no component flags are provided, all component diagnostics listed in `config.yml` will be executed by default. Multiple flags can be used together to select a group of components, for example: `cupid-run -ocn -ice`.
+If no component flags are provided, all component diagnostics listed in `config.yml` will be executed by default. Multiple flags can be used together to select a group of components, for example: `cupid-diagnostics -ocn -ice`.
 
 
 ### Timeseries File Generation
-CUPiD also has the capability to generate single variable timeseries files from history files for all components. To run timeseries, edit the `config.yml` file's timeseries section to fit your preferences, and then run `cupid-run -ts`.
+CUPiD also has the capability to generate single variable timeseries files from history files for all components. To run timeseries, edit the `config.yml` file's timeseries section to fit your preferences, and then run `cupid-timeseries`.
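
The README text above says that all components run when no component flag is given, and that several flags can be combined. A minimal sketch of that selection rule follows. The helper name `select_components` and the short keys in the example dictionary are hypothetical and are not taken from the CUPiD source; the real command collects its flags through `click` options.

```python
def select_components(flags):
    """Return the component names to run.

    `flags` maps a component name to the boolean value of its command-line
    flag. If no flag is set, every component is selected (the documented
    default); otherwise only the flagged components are selected.
    """
    if not any(flags.values()):
        return list(flags)
    return [name for name, enabled in flags.items() if enabled]


if __name__ == "__main__":
    # Hypothetical equivalent of `cupid-diagnostics -ocn -ice`
    print(select_components({"atm": False, "ocn": True, "lnd": False, "ice": True}))
```

The sketch keeps dictionary insertion order, so the selected components come back in the order the flags were declared.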
diff --git a/cupid/clear.py b/cupid/clear.py
@@ -57,3 +57,7 @@ def clear(config_path):
     # Delete the "computed_notebooks" folder and all the contents inside of it
     shutil.rmtree(run_dir)
     logger.info(f"All contents in {run_dir} have been cleared.")
+
+
+if __name__ == "__main__":
+    clear()
diff --git a/cupid/build.py b/cupid/cupid_webpage.py
@@ -27,7 +27,7 @@
 @click.argument("config_path", default="config.yml")
 def build(config_path):
     """
-    Build a Jupyter book based on the TOC in CONFIG_PATH. Called by `cupid-build`.
+    Build a Jupyter book based on the TOC in CONFIG_PATH. Called by `cupid-webpage`.
 
     Args:
         CONFIG_PATH: str, path to configuration file (default config.yml)

diff --git a/cupid/run.py b/cupid/run_diagnostics.py
@@ -5,13 +5,12 @@
 This script sets up and runs all the specified notebooks and scripts according to the configurations
 provided in the specified YAML configuration file.
 
-Usage: cupid-run [OPTIONS]
+Usage: cupid-diagnostics [OPTIONS]
 
   Main engine to set up running all the notebooks.
 
 Options:
   -s, --serial        Do not use LocalCluster objects
-  -ts, --time-series  Run time series generation scripts prior to diagnostics
   -atm, --atmosphere  Run atmosphere component diagnostics
   -ocn, --ocean       Run ocean component diagnostics
   -lnd, --land        Run land component diagnostics
@@ -29,7 +28,6 @@
 import intake
 import ploomber
 
-import cupid.timeseries
 import cupid.util
 
 CONTEXT_SETTINGS = dict(help_option_names=["-h", "--help"])
@@ -40,7 +38,6 @@
 
 @click.command(context_settings=CONTEXT_SETTINGS)
 @click.option("--serial", "-s", is_flag=True, help="Do not use LocalCluster objects")
-@click.option("--time-series", "-ts", is_flag=True, help="Run time series generation scripts prior to diagnostics")
 # Options to turn components on or off
 @click.option("--atmosphere", "-atm", is_flag=True, help="Run atmosphere component diagnostics")
 @click.option("--ocean", "-ocn", is_flag=True, help="Run ocean component diagnostics")
@@ -49,10 +46,9 @@
 @click.option("--landice", "-glc", is_flag=True, help="Run land ice component diagnostics")
 @click.option("--river-runoff", "-rof", is_flag=True, help="Run river runoff component diagnostics")
 @click.argument("config_path", default="config.yml")
-def run(
+def run_diagnostics(
     config_path,
     serial=False,
-    time_series=False,
     all=False,
     atmosphere=False,
     ocean=False,
@@ -106,89 +102,6 @@ def run(
 
     ####################################################################
 
-    if time_series:
-        timeseries_params = control["timeseries"]
-
-        # general timeseries arguments for all components
-        num_procs = timeseries_params["num_procs"]
-
-        for component, comp_bool in component_options.items():
-            if comp_bool:
-
-                # set time series input and output directory:
-                # -----
-                if isinstance(timeseries_params["case_name"], list):
-                    ts_input_dirs = []
-                    for cname in timeseries_params["case_name"]:
-                        ts_input_dirs.append(global_params["CESM_output_dir"]+"/"+cname+f"/{component}/hist/")
-                else:
-                    ts_input_dirs = [
-                        global_params["CESM_output_dir"] + "/" +
-                        timeseries_params["case_name"] + f"/{component}/hist/",
-                    ]
-
-                if "ts_output_dir" in timeseries_params:
-                    if isinstance(timeseries_params["ts_output_dir"], list):
-                        ts_output_dirs = []
-                        for ts_outdir in timeseries_params["ts_output_dir"]:
-                            ts_output_dirs.append([
-                                os.path.join(
-                                        ts_outdir,
-                                        f"{component}", "proc", "tseries",
-                                ),
-                            ])
-                    else:
-                        ts_output_dirs = [
-                            os.path.join(
-                                    timeseries_params["ts_output_dir"],
-                                    f"{component}", "proc", "tseries",
-                            ),
-                        ]
-                else:
-                    if isinstance(timeseries_params["case_name"], list):
-                        ts_output_dirs = []
-                        for cname in timeseries_params["case_name"]:
-                            ts_output_dirs.append(
-                                os.path.join(
-                                        global_params["CESM_output_dir"],
-                                        cname,
-                                        f"{component}", "proc", "tseries",
-                                ),
-                            )
-                    else:
-                        ts_output_dirs = [
-                            os.path.join(
-                                    global_params["CESM_output_dir"],
-                                    timeseries_params["case_name"],
-                                    f"{component}", "proc", "tseries",
-                            ),
-                        ]
-                # -----
-
-                # fmt: off
-                # pylint: disable=line-too-long
-                cupid.timeseries.create_time_series(
-                    component,
-                    timeseries_params[component]["vars"],
-                    timeseries_params[component]["derive_vars"],
-                    timeseries_params["case_name"],
-                    timeseries_params[component]["hist_str"],
-                    ts_input_dirs,
-                    ts_output_dirs,
-                    # Note that timeseries output will eventually go in
-                    #   /glade/derecho/scratch/${USER}/archive/${CASE}/${component}/proc/tseries/
-                    timeseries_params["ts_done"],
-                    timeseries_params["overwrite_ts"],
-                    timeseries_params[component]["start_years"],
-                    timeseries_params[component]["end_years"],
-                    timeseries_params[component]["level"],
-                    num_procs,
-                    serial,
-                    logger,
-                )
-                # fmt: on
-                # pylint: enable=line-too-long
-
     # Grab paths
 
     run_dir = os.path.realpath(os.path.expanduser(control["data_sources"]["run_dir"]))
@@ -326,3 +239,7 @@ def run(
     dag.build()
 
     return None
+
+
+if __name__ == "__main__":
+    run_diagnostics()
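
The block removed from `run_diagnostics.py` resolves the timeseries input and output directories through four nearly identical branches, depending on whether `case_name` and `ts_output_dir` are lists or single strings. The same rule can be sketched more compactly by normalizing both values to lists first. The function names below are hypothetical, and `ts_output_dir=None` stands in for the key being absent from the config. One branch of the removed code wraps each output path in its own list while the others append plain strings; the diff does not show whether that nesting is intentional, so this sketch assumes flat lists throughout.

```python
import os


def _as_list(value):
    """Wrap a single value in a list; copy lists and tuples into a list."""
    return list(value) if isinstance(value, (list, tuple)) else [value]


def resolve_timeseries_dirs(cesm_output_dir, case_name, component, ts_output_dir=None):
    """Return (input_dirs, output_dirs) for one component.

    Input directories are <cesm_output_dir>/<case>/<component>/hist/.
    Output directories are <base>/<component>/proc/tseries, where <base> is
    each entry of `ts_output_dir` if given, else <cesm_output_dir>/<case>.
    """
    case_names = _as_list(case_name)
    input_dirs = [
        cesm_output_dir + "/" + cname + f"/{component}/hist/" for cname in case_names
    ]
    if ts_output_dir is not None:
        bases = _as_list(ts_output_dir)
    else:
        bases = [os.path.join(cesm_output_dir, cname) for cname in case_names]
    output_dirs = [os.path.join(base, component, "proc", "tseries") for base in bases]
    return input_dirs, output_dirs


if __name__ == "__main__":
    # Hypothetical archive root and case names, for illustration only
    ins, outs = resolve_timeseries_dirs("/archive", ["case_a", "case_b"], "ocn")
    print(ins)
    print(outs)
```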