doc

usgs · Apr 16, 2024 · 765bb6c · 765bb6c
1 parent b3bf0fe
commit 765bb6c
Show file tree

Hide file tree

Showing 2 changed files with 12 additions and 9 deletions.
diff --git a/documentation/pestpp_users_guide_v5.2.9.docx → ...mentation/pestpp_users_guide_v5.2.10.docx b/documentation/pestpp_users_guide_v5.2.9.docx → ...mentation/pestpp_users_guide_v5.2.10.docx
diff --git a/documentation/pestpp_users_manual.md b/documentation/pestpp_users_manual.md
@@ -1,13 +1,13 @@
 
  <img src="./media/image1.png" style="width:6.26806in;height:1.68194in" alt="A close up of a purple sign Description automatically generated" />
 
-# <a id='s1' />Version 5.2.9
+# <a id='s1' />Version 5.2.10
 
 <img src="./media/image2.png" style="width:6.26806in;height:3.05972in" />
 
 PEST++ Development Team
 
-Jan 2024
+April 2024
 
 # <a id='s2' />Acknowledgements
 
@@ -70,7 +70,7 @@ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLI
 
 # Table of Contents
 
-- [Version 5.2.9](#s1)
+- [Version 5.2.10](#s1)
 - [Acknowledgements](#s2)
 - [Preface](#s3)
 - [License](#s4)
@@ -3475,6 +3475,8 @@ Optionally, users can forego the use of measurement noise realizations through t
 
 Still another option to cope with the need to define weights and noise that are non-commensurate is to employ the *external* control file section format for observation data (e.g., *\* observation data external*) and in the external observation data file(s), supply the column *standard_deviation* as values of expected noise and supply the *weight* column in the external files as the values necessary to form a balanced composite objective function. In this way, PESTPP-IES will generate realizations of observation noise using the covariance matrix implied by the *standard_deviation* entries and will use the weights during the upgrade calculation process and during objective function calculations.
 
+Given all the issues that have been encountered by users with the way noise and weights interact, starting inversion 5.2.10, if a user does not indicate to use noise relizations either explicitly (through the *ies_no_noise* option) or implicitly (through the ies_observation_ensemble option) or by providing an observation noise representation (through either an *obscov* or by supplying a *standard_deviation* for at least one non-zero weighted observations), PESTPP-IES will automatically default to a no-noise setting; this default choice will be echoed to the rec and stdout at runtime.
+
 ### <a id='s13-1-7' />9.1.7 Regularization
 
 The term “regularization” generally describes numerical devices that are employed to achieve uniqueness of ill-posed inverse problems. Regularization is thus fundamental to the notion of calibration. If regularization is properly implemented, the calibrated parameter field is one of minimized error variance. In most calibration contexts, regularization eliminates from the calibrated parameter field any heterogeneity that is not supported by direct measurements of hydraulic properties or by the calibration dataset.
@@ -3489,7 +3491,7 @@ As with PEST(\_HP) and PESTPP-GLM, the SVD truncation controls (i.e., MAXSING an
 
 ### <a id='s13-1-8' />9.1.8 Base Realization
 
-Optionally (and by default), the parameter ensemble used by PESTPP-IES can include a “base realization”. Parameter values that comprise this realization are those which are listed in the “parameter data” section of the PEST control file which PESTPP-IES reads on commencement of execution. Ideally, this set of parameter values are those of minimum pre-calibration error variance; that is, they comprise the expected values of parameters from an expert knowledge point of view. PESTPP-IES pairs this realization with an observation dataset that has no “manufactured” measurement noise associated with it; this dataset is comprised of measurements that appear in the “observation data” section of the PEST control file.
+Optionally (and by default), the parameter ensemble used by PESTPP-IES can include a “base realization”. Parameter values that comprise this realization are those which are listed in the “parameter data” section of the PEST control file which PESTPP-IES reads on commencement of execution. Ideally, this set of parameter values are those of minimum pre-calibration error variance; that is, they comprise the expected values of parameters from an expert knowledge point of view. PESTPP-IES pairs this realization with an observation dataset that has no “manufactured” measurement noise associated with it; this dataset is comprised of measurements that appear in the “observation data” section of the PEST control file. After completion of PESTPP-IES, this “base” realization is the approximate minimum error variance parameter set and is as close a result to what standard regularized least-squares parameter estimation will yield. If a user is interested in using only one parameter set from a PESTPP-IES analysis, they should always and only ever select the base realization for additional analysis.
 
 By monitoring the fate of the parameter set comprising this base realization, a user can witness the effect that the ensemble smoother process has on a non-random parameter field. Any bias that is introduced to this parameter field, or any incredulous heterogeneity that is introduced to this parameter field, is also presumably introduced to the other random parameter fields which comprise the ensemble. Inspection of this field can aid a modeller in assessing whether, in his/her opinion, the parameter field ensemble that emerges from the ensemble smoother process comprises a legitimate sample of the posterior parameter probability distribution.
 
@@ -3719,7 +3721,7 @@ Where a single parameter field is being estimated in a standard model calibratio
 
 The situation is different, however, when many parameter fields comprising an ensemble are adjusted, such as is done by PESTPP-IES. For an ensemble smoother, the testing of parameter upgrades and the filling of a Jacobian matrix are the same operation. The cost of this operation would become prohibitive if a trial-and-error lambda search procedure accompanied the adjustment of each parameter realization comprising an ensemble. Hence a different strategy must be adopted. This strategy is to undertake lambda testing for only a limited number of parameter fields, and to then use the best lambda that emerges from that testing process when upgrading the rest of them. This limited number of realizations used to evaluate objective function values during upgrade testing is referred to as the “subset”. Options for how to pick the subset are available through the *ies_subset_how()* argument; valid choices for *ies_subset_how()* are “first” (the first *ies_subset_size()* realizations), “last” (the last *ies_subset_size()* realizations), “random” (randomly select *ies_subset_size()* realizations for each iteration), or “phi_based” (select *ies_subset_size()* realizations across the previous composite objective function distribution). Note that if the “base” parameter realization is present, it is always included in the selected subset as the objective function calculated for this realization has special significance.
 
-The number of realizations that comprise the ensemble subset used for lambda testing is set by the value of the *ies_subset_size()* control variable. During each iteration of the ensemble smoother process, values of the Marquardt lambda used for testing realization upgrades are determined by applying a set of multipliers to the best lambda found during the previous iteration. These multipliers are provided through the *ies_lambda_mults()* control variable. A comma separated list of multipliers should be supplied by the user as arguments to this keyword; at least one of these multipliers should be less than 1.0 while, or course, one of them should be greater than 1.0. Line search factors (otherwise known as scale factors) that are applied to each of these lambdas can also be supplied. If so, this is done through the *lambda_scale_fac()* control variable, the same variable that is used by PESTPP-GLM. As for *ies_lambda_mults()*, scale factors should be supplied as a comma-separated list of numbers spanning a range from below 1.0 to greater than 1.0. The total number of model runs required to test parameter upgrades during a given iteration is thus *ies_subset_size()* times the number of multipliers supplied with the *ies_lambda_mults()* control variable times the number of factors supplied with the *lambda_scale_fac()* control variable.
+The number of realizations that comprise the ensemble subset used for lambda testing is set by the value of the *ies_subset_size()* control variable, which is given a default value of -10, which indicates to use 10% of the current number of realizations for subset testing; experience has shown this is a pretty good default value in most cases. During each iteration of the ensemble smoother process, values of the Marquardt lambda used for testing realization upgrades are determined by applying a set of multipliers to the best lambda found during the previous iteration. These multipliers are provided through the *ies_lambda_mults()* control variable. A comma separated list of multipliers should be supplied by the user as arguments to this keyword; at least one of these multipliers should be less than 1.0 while, or course, one of them should be greater than 1.0. Line search factors (otherwise known as scale factors) that are applied to each of these lambdas can also be supplied. If so, this is done through the *lambda_scale_fac()* control variable, the same variable that is used by PESTPP-GLM. As for *ies_lambda_mults()*, scale factors should be supplied as a comma-separated list of numbers spanning a range from below 1.0 to greater than 1.0. The total number of model runs required to test parameter upgrades during a given iteration is thus *ies_subset_size()* times the number of multipliers supplied with the *ies_lambda_mults()* control variable times the number of factors supplied with the *lambda_scale_fac()* control variable.
 
 The value of the Marquardt lambda to use during the first iteration of the ensemble smoother process can be supplied through the *ies_initial_lambda()* control variable. Lambda multipliers supplied through *ies_lambda_mults()* are applied to this value during the first iteration of this process. The PESTPP-IES default value for *ies_initial_lambda()* is $10^{floor\left( \log_{10}\frac{\mu_{Փ}}{2n} \right)}$ where *μ*<sub>Փ</sub> is the mean of objective functions achieved using realizations comprising the initial ensemble, and *n* is the number of non-zero-weighted observations featured in the “observation data” section of the PEST control file.
 
@@ -4165,13 +4167,13 @@ On most occasions of PESTPP-SWP usage, model runs are conducted in parallel. Use
 
 As usual, variables which control how PESTPP-SWP operates must be placed in a PEST control file whose name is supplied on its command line; these variables should appear on lines that begin with the “++” character string.
 
-PESTPP-SWP is directed to a CSV or JCO/JCB input file through the value supplied for its *sweep_parameter_csv_file()* control variable; the type of file is recognized by its extension. A CSV file must have as many columns as there are parameters featured in the PEST control file, plus one extra column on the left. The first column is reserved for the user-supplied realization name. Parameter and realizations names are provided in a JCB or JCO file according to the respective protocols of these files.
+PESTPP-SWP is directed to a CSV, JCO/JCB, or BIN (dense binary) input file through the value supplied for its *sweep_parameter file()* control variable; the type of file is recognized by its extension. A CSV file must have as many columns as there are parameters featured in the PEST control file, plus one extra column on the left. The first column is reserved for the user-supplied realization name. Parameter and realizations names are provided in a JCB or JCO file according to the respective protocols of these files. A dense binary file can be written by pyEMU and is an efficient binary format that allows for chunks of runs to be read (compared to the jacobian formats that require reading the entire file at once).
 
 The number of realizations contained in a user-prepared PESTPP-SWP input file depends on the number of parameter sets for which model runs are required. These realizations can be named according to the user’s taste. PESTPP-SWP carries out one model run for each realization.
 
-Parameter names provided in a CSV of JCO/JCB file must correspond to those that are featured in a PEST control file.
+Parameter names provided in the parameter file must correspond to those that are featured in a PEST control file.
 
-If the *sweep_parameter_csv_file()* control variable does not appear in the PEST control file that is cited on the PESTPP-SWP command line, PESTPP-SWP assumes an input filename of *sweep_in.csv*.
+If the *sweep_parameter_file()* control variable does not appear in the PEST control file that is cited on the PESTPP-SWP command line, PESTPP-SWP assumes an input filename of *sweep_in.csv*.
 
 Note the following.
 
@@ -4183,7 +4185,7 @@ Note the following.
 
 PESTPP-SWP can fill in values for fixed and tied parameters if these are missing from its input file. Actually, it can provide values for other missing parameters as well if the *sweep_forgive()* control variable is set to *true*. These missing values are taken from the PEST control file which is read by PESTPP-SWP.
 
-PESTPP-SWP writes a model output file whose name is provided through the *sweep_output_csv_file()* control variable. If this variable is not provided, PESTPP-SWP employs the name *sweep_out.csv* for its output file. This fiile contains input and output run identifiers, objective function information, and the simulated value for each control file “observation” for each run. Note that, by default, PESTPP-SWP will not calculate the contribution of prior information equations to the total objective function reported in the sweep output file. This can be overridden with the *sweep_include_regul_phi* option.
+PESTPP-SWP writes a model output file whose name is provided through the *sweep_output_file()* control variable. If this variable is not provided, PESTPP-SWP employs the name *sweep_out.csv* for its output file, again the file extension matters: CSV, JCO/JCB, or BIN. This file contains input and output run identifiers, objective function information, and the simulated value for each control file “observation” for each run. Note that, by default, PESTPP-SWP will not calculate the contribution of prior information equations to the total objective function reported in the sweep output file. This can be overridden with the *sweep_include_regul_phi* option.
 
 The control variable *sweep_chunk()* pertains to parallelization of model runs. Runs are done in bundles of size *N*, where *N* is the value supplied for this variable. (A chunk of 500 is the default). This number should be chosen wisely. It should be a multiple of the number of agents that PESTPP-SWP can use for carrying out model runs.
 
@@ -5042,6 +5044,7 @@ Variables discussed in section 5.3.6 that control parallel run management are no
 | *Mou_simplex_factors(0.5,.0.7,0.8)* | double   | Backtracking points to test along each reflected simplex individual.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
 | *Mou_simplex_mutation(false)*       | boolean  | Flag to add guassian mutation to the reflected simplex individuals. Default is false                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
 | *Mou_multigen_population*           | boolean  | Flag to retain and reuse all members across all generations when evaluating dominance and feasibility. This can result in re-enforcing the preference for feasible solutions, which can help with highly nonlinearly constrained problems. However, with a large population and many generations, this option can slow down the execution time of PESTPP-MOU since it must dominance sort a much large number of members. Default is false. Note the option activate automatically when using chances with “all’ chance points when chances are reused across generations. |
+| *Mou_chance_schedule()*             | text     | A two column ascii file that defines when to re-evaluate chances. Generations not listed are set to false. This can be useful to have more granular control regarding chance evaluation.                                                                                                                                                                                                                                                                                                                                                                                   |
 
 Table 13.2. PESTPP-MOU specific control arguments. PESTPP-MOU shares many other control arguments with PESTPP-OPT