DAPPER is a set of templates for benchmarking the performance of data assimilation (DA) methods. The tests provide experimental support and guidance for new developments in DA. The screenshot below illustrates the default diagnostics.
The typical set-up is a "twin experiment", where you
- specify a
  - dynamic model*
  - observational model*
- use these to generate a synthetic
  - "truth"
  - and observations thereof*
- assess how different DA methods perform in estimating the truth, given the above starred (*) items.
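The twin-experiment steps listed above can be sketched in plain numpy. This is a minimal illustration and does not use DAPPER's API: the rotation model `M`, the observation operator `H`, the noise covariances, and the helper names `free_run`, `kalman_filter`, and `rmse` are all hypothetical choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(42)

# Starred items: hypothetical toy models, not taken from DAPPER.
theta = 0.1
M = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # dynamic model*: a rotation
H = np.array([[1.0, 0.0]])                       # observation model*: observe x[0]
Q = 0.01 * np.eye(2)  # model noise covariance
R = 0.10 * np.eye(1)  # observation noise covariance
K = 200               # number of assimilation cycles
x0_mean, P0 = np.array([1.0, 0.0]), np.eye(2)

# Generate the synthetic truth and the observations thereof*.
xx = np.zeros((K + 1, 2))
yy = np.zeros((K + 1, 1))  # yy[0] is unused: no observation at the initial time
xx[0] = x0_mean + np.linalg.cholesky(P0) @ rng.standard_normal(2)
for k in range(1, K + 1):
    xx[k] = M @ xx[k - 1] + np.linalg.cholesky(Q) @ rng.standard_normal(2)
    yy[k] = H @ xx[k] + np.linalg.cholesky(R) @ rng.standard_normal(1)

def free_run(K):
    """Baseline: propagate the initial mean without assimilating anything."""
    est = np.zeros((K + 1, 2))
    est[0] = x0_mean
    for k in range(1, K + 1):
        est[k] = M @ est[k - 1]
    return est

def kalman_filter(yy):
    """Kalman filter, given only the starred items (models and observations)."""
    x, P = x0_mean.copy(), P0.copy()
    est = np.zeros((len(yy), 2))
    est[0] = x
    for k in range(1, len(yy)):
        x, P = M @ x, M @ P @ M.T + Q                 # forecast
        G = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)  # Kalman gain
        x = x + G @ (yy[k] - H @ x)                   # analysis mean
        P = (np.eye(2) - G @ H) @ P                   # analysis covariance
        est[k] = x
    return est

def rmse(est):
    """Root-mean-square error against the truth, over time and state components."""
    return np.sqrt(np.mean((est - xx) ** 2))

rmse_free = rmse(free_run(K))
rmse_kf = rmse(kalman_filter(yy))
print(f"free run: {rmse_free:.3f}, Kalman filter: {rmse_kf:.3f}")
```

The truth `xx` is used only to score the estimates, never by the methods themselves, which is what makes the comparison a fair benchmark.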
DAPPER enables the numerical investigation of DA methods through its variety of typical test cases and statistics. It reproduces numerical results (benchmarks) reported in the literature, and facilitates comparative studies, thus promoting the reliability and relevance of the results. DAPPER is open source, written in Python, and focuses on readability; this promotes the reproduction and dissemination of the underlying science, and makes it easy to adapt and extend. In summary, it is well suited for teaching and fundamental DA research.
In a trade-off with the above advantages, DAPPER makes some sacrifices of efficiency and flexibility (generality). I.e. it is not designed for the assimilation of real data in operational models.
A good place to start is with the scripts `example_1/2.py`.
Alternatively, see the `tutorials` folder for an intro to DA.
Prerequisite: python3.5+ with `scipy`, `matplotlib`, `pandas`, and (optionally) `tqdm`, `tabulate`, and `seaborn`.
These can be obtained (e.g.) from anaconda.
Download, extract, and `cd` to DAPPER. To test it, run:
python -i example_1.py
For the tutorial notebooks, you will also need `jupyter` and the `markdown` package.
References provided at bottom
Method name | Literature RMSE results reproduced |
---|---|
EnKF 1 | Sakov and Oke (2008) |
EnKF-N | Bocquet (2012), (2015) |
EnKS, EnRTS | Raanes (2016a) |
iEnKS (and -N) | Sakov (2012), Bocquet (2012), (2014) |
LETKF, local & serial EAKF | Bocquet (2011) |
Sqrt. model noise methods | Raanes (2015) |
Particle filter (bootstrap) 2 | Bocquet (2010) |
Optimal/implicit Particle filter 2 | " |
NETF | Tödter (2015), Wiljes (2017) |
Extended KF | Raanes (2016b) |
Optimal interpolation | " |
Climatology | " |
3D-Var | |
1: Stochastic, DEnKF (i.e. half-update), ETKF (i.e. sym. sqrt.).
Tuned with inflation and "random, orthogonal rotations".
2: Resampling: multinomial (including systematic/universal and residual).
The particle filter is tuned with "effective-N monitoring", "regularization/jittering" strength, and more.
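As an illustration of the "effective-N" monitoring and systematic resampling mentioned in footnote 2, here is a minimal numpy sketch. It is not DAPPER's implementation: the function names and the `threshold=0.5` default are assumptions made for the example. The effective ensemble size is computed with the standard formula 1/Σ wᵢ² for normalized weights.

```python
import numpy as np

def effective_N(w):
    """Effective sample size 1/sum(w_i^2) of the (normalized) weights."""
    w = np.asarray(w, dtype=float)
    w = w / w.sum()
    return 1.0 / np.sum(w ** 2)

def systematic_resample(w, rng):
    """Indices of the resampled particles, using systematic (universal) resampling."""
    w = np.asarray(w, dtype=float)
    w = w / w.sum()
    N = len(w)
    # A single uniform draw, shifted to N evenly spaced points in [0, 1).
    points = (rng.random() + np.arange(N)) / N
    cdf = np.cumsum(w)
    cdf[-1] = 1.0  # guard against round-off in the cumulative sum
    return np.searchsorted(cdf, points, side="right")

def resample_if_degenerate(particles, w, rng, threshold=0.5):
    """Resample only when effective-N drops below threshold * N (hypothetical rule)."""
    N = len(w)
    if effective_N(w) < threshold * N:
        idx = systematic_resample(w, rng)
        return particles[idx], np.full(N, 1.0 / N)
    return particles, np.asarray(w, dtype=float)
```

Monitoring effective-N in this way avoids resampling at every step, which would add sampling noise even when the weights are still well spread.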
Model | Linear? | Phys.dim. | State len | # Lyap≥0 | Thanks to |
---|---|---|---|---|---|
Lin. Advect. | Yes | 1D | 1000 * | 51 | Evensen/Raanes |
Lorenz63 | No | 0D | 3 | 2 | Lorenz/Sakov |
Lorenz84 | No | 0D | 3 | 2 | Lorenz/Raanes |
Lorenz95 | No | 1D | 40 * | 13 | Lorenz/Raanes |
LorenzXY | No | 2x 1D | 256 + 8 * | ≈13 | Lorenz/Raanes |
MAOOAM | No | 2x 1D | 36 | ? | Tondeur/Vannitsem |
Quasi-Geost | No | 2D | 129²≈17k | ≈135 | Sakov |
Barotropic | No | 2D | 256²≈60k | ? | J.Penn/Raanes |
*: straightforward to vary.
Many
- Visualizations
- Diagnostics
- Tools to manage and display experimental settings and stats
Also has:
- Live plotting with on/off toggle
- Confidence intervals on time series (e.g. RMSE) with
  - automatic correction for autocorrelation
  - significant digits printing
- CovMat class (input flexibility/overloading, lazy eval) that facilitates the use of non-diagonal covariance matrices (whether sparse or full)
- Intelligent defaults (e.g. plot duration estimated from autocorrelation, axis limits estimated from percentiles)
- Chronology/Ticker with consistency checks
- Gentle failure system to allow execution to continue if experiment fails.
- Progressbar
- Multivariate random variables: Gaussian, Student-t, Laplace, Uniform, ..., as well as support for custom sampling functions.
- X-platform random number generator (for debugging across platforms)
- Parallelisation options
  - Forecast parallelisation is possible since the (user-implemented) model has access to the full ensemble (see `mods/QG/core.py`).
  - A light-weight alternative (see e.g. `mods/Lorenz95/core.py`): native numpy vectorization (again by having access to full ensemble).
  - (Independent) experiments can also run in parallel. Auto-config provided by `utils.py:parallelize()`.
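To illustrate the confidence interval with autocorrelation correction mentioned in the list above, here is a minimal sketch. It is not DAPPER's implementation: it assumes the series behaves roughly like an AR(1) process and uses the common approximation N_eff = N (1 − ρ) / (1 + ρ), where ρ is the lag-1 autocorrelation. The function name and the `z=1.96` default (roughly a 95% interval under Gaussian assumptions) are choices made for the example.

```python
import numpy as np

def mean_with_confidence(series, z=1.96):
    """Return (mean, half-width) of a confidence interval for the time-mean.

    The sample size is deflated for autocorrelation with an AR(1)
    approximation, so the interval widens for slowly varying series.
    """
    x = np.asarray(series, dtype=float)
    N = len(x)
    mu = x.mean()
    d = x - mu
    # Lag-1 autocorrelation estimate, clipped so that N_eff stays in (0, N].
    rho = np.sum(d[:-1] * d[1:]) / np.sum(d ** 2)
    rho = np.clip(rho, 0.0, 0.99)
    N_eff = N * (1 - rho) / (1 + rho)
    return mu, z * np.sqrt(x.var(ddof=1) / N_eff)

# Usage: a strongly autocorrelated series gets a wider interval than
# the naive formula z * std / sqrt(N) would suggest.
rng = np.random.default_rng(0)
noise = rng.standard_normal(5000)
series = np.zeros(5000)
for k in range(1, 5000):
    series[k] = 0.9 * series[k - 1] + noise[k]
mu, half_width = mean_with_confidence(series)
print(f"time-mean: {mu:.2f} ± {half_width:.2f}")
```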
- Do highly efficient DA on very big models (see discussion in introduction).
- Run different DA methods concurrently (i.e. step-by-step) allowing for live/online (graphic or text) comparison.
- Time-dependent error covariances and changes in lengths of state/obs (but f and h may otherwise depend on time).
- Non-uniform time sequences only partially supported.
DAPPER is like a set of templates (not a framework); do not hesitate to make your own scripts and functions (instead of squeezing everything into standardized configuration files).
Just add it to `da_methods.py`, using the others in there as templates.
- Make a new dir: `DAPPER/mods/your_mod`
- Add the empty file `__init__.py`
- See other examples, e.g. `DAPPER/mods/Lorenz63/sak12.py`
- Make sure that the model (and obs operator) supports 2D-array (i.e. ensemble) and 1D-array (single realization) input. See `Lorenz63` and `Lorenz95` for typical implementation.
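The requirement of supporting both 2D-array and 1D-array input can be met with numpy's ellipsis indexing, as in the sketch below. It is not copied from DAPPER's `Lorenz63` module: the function names are hypothetical, the ensemble is assumed to be stored with one member per row (shape `(N, 3)`), and the parameters are the classical Lorenz-63 values.

```python
import numpy as np

def lorenz63_dxdt(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Lorenz-63 tendencies for x of shape (3,) or (N, 3); output has the same shape."""
    x = np.asarray(x, dtype=float)
    # Indexing the last axis works for a single state and for an ensemble alike.
    X, Y, Z = x[..., 0], x[..., 1], x[..., 2]
    d = np.empty_like(x)
    d[..., 0] = sigma * (Y - X)
    d[..., 1] = X * (rho - Z) - Y
    d[..., 2] = X * Y - beta * Z
    return d

def rk4_step(f, x, dt):
    """One 4th-order Runge-Kutta step; shape-agnostic because f is."""
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6

# Usage: the same function advances a single realization or a whole ensemble.
x_single = np.array([1.0, 2.0, 3.0])
E = np.array([[1.0, 2.0, 3.0], [-4.0, 5.0, 20.0]])
print(rk4_step(lorenz63_dxdt, x_single, 0.01).shape)  # (3,)
print(rk4_step(lorenz63_dxdt, E, 0.01).shape)         # (2, 3)
```

Because the ensemble is advanced in a single vectorized call, this is also an instance of the native numpy vectorization that the full-ensemble access makes possible.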
Sorted by approximate project size. DAPPER may be situated somewhere in the middle.
Name | Developers | Purpose (vs. DAPPER) |
---|---|---|
DART | NCAR | Operational and real-world DA |
ERT* | Statoil | Operational (petroleum) history matching |
OpenDA | TU Delft | Operational and real-world DA |
EMPIRE | Reading (Met) | Operational and real-world DA |
SANGOMA | Conglomerate** | Unified code repository for researchers |
Verdandi | INRIA | Real-world biophysical DA |
PDAF | Nerger | Real-world and example DA |
PyOSSE | Edinburgh, Reading | Real-world earth-observation DA |
MIKE | DHI | Real-world oceanographic DA. Commercial? |
OAK | Liège | Real-world oceanographic DA |
Siroco | OMP | Real-world oceanographic DA |
FilterPy | R. Labbe | Engineering, general intro to Kalman filter |
DASoftware | Yue Li, Stanford | Matlab, large-scale |
Pomp | U of Michigan | R, general state-estimation |
PyIT | CIPR | Real-world petroleum DA (?) |
Datum* | Raanes | Matlab, personal publications |
EnKF-Matlab* | Sakov | Matlab, personal publications and intro |
EnKF-C | Sakov | C, light-weight EnKF, off-line |
IEnKS code* | Bocquet | Python, personal publications |
pyda | Hickman | Python, personal publications |
*: Has been inspirational in the development of DAPPER.
**: Liege/CNRS/NERSC/Reading/Delft
- Complete QG, LorenzXY
- Reorg file structure
- Make tutorial
Sakov (2008) : Sakov and Oke. "A deterministic formulation of the ensemble Kalman filter: an alternative to ensemble square root filters".
Bocquet (2010) : Bocquet, Pires, and Wu. "Beyond Gaussian statistical modeling in geophysical data assimilation".
Bocquet (2011) : Bocquet. "Ensemble Kalman filtering without the intrinsic need for inflation".
Sakov (2012) : Sakov, Oliver, and Bertino. "An iterative EnKF for strongly nonlinear systems".
Bocquet (2012) : Bocquet and Sakov. "Combining inflation-free and iterative ensemble Kalman filters for strongly nonlinear systems".
Bocquet (2014) : Bocquet and Sakov. "An iterative ensemble Kalman smoother".
Bocquet (2015) : Bocquet, Raanes, and Hannart. "Expanding the validity of the ensemble Kalman filter without the intrinsic need for inflation".
Tödter (2015) : Tödter and Ahrens. "A second-order exact ensemble square root filter for nonlinear data assimilation".
Raanes (2015) : Raanes, Carrassi, and Bertino. "Extending the square root method to account for model noise in the ensemble Kalman filter".
Raanes (2016a) : Raanes. "On the ensemble Rauch-Tung-Striebel smoother and its equivalence to the ensemble Kalman smoother".
Raanes (2016b) : Raanes. "Improvements to Ensemble Methods for Data Assimilation in the Geosciences".
Wiljes (2017) : Acevedo, de Wiljes, and Reich. "Second-order accurate ensemble transform particle filters".
Further references are provided in the algorithm codes.
patrick. n. raanes AT gmail