Add global inversion capability + OH optimization (#157)
* Add initial modifications for global inversion capability

(1) Added a sample config file in envs/Harvard-Cannon/ for doing a global inversion. The main differences from the standard config file include setting NestedGrid to false and defining lat/lon bounds.
(2) Modified make_state_vector_file.py to account for global grids. This involved passing a logical to determine if a regional or global grid is used. If using a global grid, the check for a compatible subdomain and the calculation of the buffer zone can be skipped.
(3) Added a check for nested grid when determining the land cover file name in both src/utilities/util.py and setup_imi.sh. When using global grids, the two-character region ID and period aren't needed.

Signed-off-by: Melissa Sulprizio <[email protected]>

* Replace "nested" terminology with "regional" throughout code and config files

To avoid confusion for IMI users who are not familiar with GEOS-Chem "nested" grids, we now use the term "regional" in the IMI. This makes it clearer to users that they can run regional or global IMIs.

Signed-off-by: Melissa Sulprizio <[email protected]>

* Update inversion scripts for GEOS-Chem 14.1.0 output

In 14.1.0, the SpeciesConc_ variable name changed to SpeciesConcVV_. This has now been updated in src/inversion_scripts/.

Signed-off-by: Melissa Sulprizio <[email protected]>

* add bug fixes for 2x2.5 global inversion

* Continuous Integration Manifest Generation

* bug fixes to work with 2x2.5, parallelization of jacobian and faster calc_sensi

* Continuous Integration Manifest Generation

* bug fixes for 2x2.5, parallelize jacobian and faster calc_sensi

* Add updates to support 4x5 and 0.5x0.625 resolutions and MERRA-2 meteorology

Also allow users to specify the number of simultaneous jobs to run in the SLURM job array. Setting MaxSimultaneousRuns: -1 will submit all runs simultaneously.

Signed-off-by: Melissa Sulprizio <[email protected]>

* Add fixes following expanded resolution support updates

- Fix path to GEOS-Chem environment files to look within the InversionPath. Print error and exit if the env file can't be found.
- Fix logical statements for isRegional in setup.sh and template.sh.
- Fix resolution strings in template.sh.
- Make config.yml (standard and Harvard-Cannon) files consistent.

Signed-off-by: Melissa Sulprizio <[email protected]>

* Add minor fixes for global inversion

- Add fixes for coarse resolution options to imi_preview.py
- Add verbose output to clarify what IMI is doing in certain places
- Do not call s3_upload.py if S3Upload is false
- Fix default settings in config.yml for Harvard global inversions

Signed-off-by: Melissa Sulprizio <[email protected]>

* Fix minor conflicts and typos brought in by earlier commits

Signed-off-by: Melissa Sulprizio <[email protected]>

* Add fixes and updates for global inversion capability

- Add flexibility to setup.sh and template.sh that allows users to specify Met in upper or lower case.
- Add error check in jacobian.sh to ensure MaxSimultaneousRuns is less than the total number of runs (nElements).
- Move loading of GEOS-Chem environment to run_imi.sh instead of repeating in other components of the IMI.
- Fix white space inconsistency in statevector.sh.

Signed-off-by: Melissa Sulprizio <[email protected]>

* Update global inversion configuration file for recent updates

Signed-off-by: Melissa Sulprizio <[email protected]>

* Update documentation for global IMI capability

In imi-config-file.rst:
- Added descriptions for new variables added for the global IMI option
- Modified some existing descriptions to include expanded options for coarse resolutions
- Added placeholders for missing descriptions (e.g. Kalman filter options)

In custom-region.rst:
- Replaced uses of "nested" with "regional"

Signed-off-by: Melissa Sulprizio <[email protected]>

* Update Harvard-Cannon environment files

Signed-off-by: Melissa Sulprizio <[email protected]>

* Add more fixes for global inversion capability

Signed-off-by: Melissa Sulprizio <[email protected]>

* Add option for OH optimization to IMI

Additional options to optimize for OH have been added to config.yml. This includes the switch for turning this option on (OptimizeOH), PriorErrorOH, and PerturbValueOH. When this option is enabled, an additional run is added to the jacobian runs. Code has also been added to the inversion scripts for including OH optimization.

Signed-off-by: Melissa Sulprizio <[email protected]>

* Global branch working changes

* Add bug fix in make_state_vector_file.py

Code to assign boundary condition grid boxes to the state vector needed to be indented following the addition of the "if is_regional:" block.

Signed-off-by: Melissa Sulprizio <[email protected]>

* add imi config file details to documentation

* edits for OH test

* Edits for OH

* fixed OH scaling

* save results

* remove large ice fraction from native state vector before file is saved

* remove land cover notebook

* merge massive cluster bugfix for global state vector

* check number of elements in native SV

* add ice threshold based on JE edits

* move calc superobservation error to utils

* Use superobservations to estimate sensitivities in the preview

* use number of obs per grid cell to calculate superobservation error

* Add fixes for global inversion

- Remove hardcoded paths and modified filenames.
- Uncommented code blocks that are needed. They were likely commented out for testing.
- Make sure Harvard config files are up to date and consistent
- Update run_imi.sh to use sapphire partition on Cannon
- Remove "if np.abs(lc.lon.values - hd.lon.values).max() != 0" check from make_state_vector_file.py since it causes the IMI to crash for regional grids.

Signed-off-by: Melissa Sulprizio <[email protected]>

* Add fix to properly create nested grid template run directory

The run directories were created for global domains by default, causing the nested grid simulation options in geoschem_config.rc of the template run directory (and all subsequent GEOS-Chem run directories) to be incorrectly set. We now update template.sh to specify nested NA domains if isRegional is true and further modify the lat/lon values from there.

Also here, run_imi.sh has been updated to (1) send an email when the IMI has completed (when submitting via slurm) and (2) move the imi_output.log file instead of copying it to the IMI output directory.

Signed-off-by: Melissa Sulprizio <[email protected]>

* Revert speedup code in calc_sensi.py to avoid dask errors

Removed updates to speed up calc_sensi.py originally introduced in 2d54070 because they caused the following error with the existing IMI python environment:

```
Traceback (most recent call last):
  File "/n/holyscratch01/jacob_lab/msulprizio/Test_IMI_globalMerge_noOH/inversion/calc_sensi.py", line 187, in <module>
    calc_sensi(
  File "/n/holyscratch01/jacob_lab/msulprizio/Test_IMI_globalMerge_noOH/inversion/calc_sensi.py", line 123, in calc_sensi
    pert_data = xr.open_dataset(
  File "/n/home05/msulprizio/python/mamba/envs/imi_env/lib/python3.9/site-packages/xarray/backends/api.py", line 575, in open_dataset
    ds = maybe_decode_store(store, chunks)
  File "/n/home05/msulprizio/python/mamba/envs/imi_env/lib/python3.9/site-packages/xarray/backends/api.py", line 485, in maybe_decode_store
    from dask.base import tokenize
ModuleNotFoundError: No module named 'dask'
```

Signed-off-by: Melissa Sulprizio <[email protected]>

* Add several fixes to invert.py

Here we revert many changes introduced in commit ce1298e for fixing OH scaling in the inversion, including:
(1) Removed addition of 1 to n_elements for OH perturbations. This is already accounted for in the value of n_elements.
(2) Remove hardcoded path to output data from jacobian.py
(3) Restore original code for K, KT, and delta_y
(4) Revert default value of gamma to 0.5
(5) Set perturb_oh prior_err_oh to 0.0 by default and overwrite only when OH optimization is activated

Signed-off-by: Melissa Sulprizio <[email protected]>

* Remove land cover threshold as default in make_state_vector_file.py

A threshold of land > 1% and ice < 10% was added when determining land cover for generation of the state vector file. This was added for the global inversions, but we have disabled it as the default option here. If users want to impose a threshold, they can do so on their own, or we could add this as a future option. Specifying the threshold and/or code setting the state vector to nan or 0 resulted in a netCDF file that could not be properly read by HEMCO, so those updates have also been reverted.

Signed-off-by: Melissa Sulprizio <[email protected]>

* Add fixes following review comments

1. Fix comments in config files. Also make default switches consistent throughout the config files.
2. Remove unnecessary #SBATCH lines from posterior.sh and run_inversion.sh.
3. Fix indentation of if statements in run_inversion.sh.
4. Remove .data from ds prompt in make_gridded_posterior.py as it may be prone to errors.
5. Remove src/inversion_scripts/invert_calcJ.py. It is not used.

Signed-off-by: Melissa Sulprizio <[email protected]>

* Additional fixes for review comments

1. In run_imi.sh, switch back to copying the imi_output.log file to the InversionPath, and add a check at the beginning of the run to see if an old imi_output.log file exists and, if so, remove it before starting the IMI.
2. Fixed indexing of Sa_diag and xhat in invert.py to properly account for OH and BC optimization options. Also removed bc_idx and OH_idx and simply use scale_factor_idx to denote which elements to apply the scale factor to.
3. Fix scaling weighting of Sa for OH to properly account for the number of elements instead of being hardcoded to 1000. Also include Maasakkers et al. (2019) citation.

Signed-off-by: Melissa Sulprizio <[email protected]>

---------

Signed-off-by: Melissa Sulprizio <[email protected]>
Co-authored-by: Megan He <[email protected]>
Co-authored-by: laestrada <[email protected]>
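
Note on the 14.1.0 variable rename: for post-processing code that may need to read output written both before and after the SpeciesConc_ to SpeciesConcVV_ change, the lookup could be sketched roughly as below. This is only an illustration under stated assumptions: species_conc_name is a hypothetical helper, not a function in src/inversion_scripts/, and it assumes the species suffix (CH4 here) is unaffected by the rename.

```python
def species_conc_name(available, species="CH4"):
    """Return the concentration variable name present in `available`.

    `available` is any container of variable names (for example, the
    data_vars of an xarray Dataset). Hypothetical helper, for illustration.
    """
    # Prefer the 14.1.0+ name, then fall back to the pre-14.1.0 name.
    for prefix in ("SpeciesConcVV_", "SpeciesConc_"):
        name = prefix + species
        if name in available:
            return name
    raise KeyError(f"no SpeciesConc variable found for {species}")


# Stand-ins for the variable names of newer and older output files.
new_output = {"SpeciesConcVV_CH4": None}
old_output = {"SpeciesConc_CH4": None}
print(species_conc_name(new_output))
print(species_conc_name(old_output))
```

Checking the newer name first means that a file containing both names resolves to the 14.1.0 convention.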