Documentation
msainsburydale committed Dec 10, 2023
1 parent 386b91e commit bc4cd6d
Showing 6 changed files with 88 additions and 18 deletions.
10 changes: 8 additions & 2 deletions docs/src/API/core.md
@@ -22,7 +22,7 @@ The data should be stored as a `Vector{A}`, where each element of the vector is
## Types of estimators

See also [Architectures and activations functions](@ref) that are often used
when constructing neural estimators, and the convenience constructor [`initialise_estimator`](@ref).

```@docs
NeuralEstimator
@@ -59,9 +59,15 @@ assess
Assessment
risk
bias
rmse
```

-## Bootstrapping
+## Uncertainty quantification

Uncertainty quantification often proceeds through the bootstrap distribution, which is essentially available "for free" when bootstrap data sets can be quickly generated. Alternatively, one may approximate a set of low and high marginal posterior quantiles using a specially constructed neural Bayes estimator, which can then be used to construct credible intervals: see [`IntervalEstimator`](@ref) and its variants.
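As a rough sketch of the bootstrap-based workflow (here `θ̂` denotes a trained neural point estimator, while `simulate`, `Z`, and `m` are hypothetical placeholders rather than part of the package API):

```julia
using Statistics: quantile

θ̃ = θ̂(Z)                                   # point estimate from the observed data Z
B = 400                                     # number of parametric-bootstrap data sets
θ_boot = reduce(hcat, [θ̂(simulate(θ̃, m)) for _ in 1:B])  # p × B matrix of estimates
lower = [quantile(θ_boot[i, :], 0.025) for i in 1:size(θ_boot, 1)]
upper = [quantile(θ_boot[i, :], 0.975) for i in 1:size(θ_boot, 1)]  # 95% intervals
```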

```@docs
bootstrap
2 changes: 1 addition & 1 deletion src/Architectures.jl
@@ -55,7 +55,7 @@ element of the vector is associated with one parameter vector (i.e., one set of
independent replicates), and where `A` depends on the form of the data and the
chosen architecture for `ψ`. As a rule of thumb, when the data are stored as an
array, the replicates are stored in the final dimension of the array. (This is
-usually the 'batch' dimension, but batching with `DeepSets` is done at the set
+usually the 'batch' dimension, but batching with `DeepSet`s is done at the set
level, i.e., sets of replicates are batched together.) For example, with
gridded spatial data and `ψ` a CNN, `A` should be
a 4-dimensional array, with the replicates stored in the 4ᵗʰ dimension.
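For instance, such data might be constructed as follows (a sketch; the grid size and the numbers of parameter vectors and replicates are illustrative):

```julia
K, m = 100, 30   # number of parameter vectors, and number of replicates per vector
# one 4-dimensional array per parameter vector: width × height × channels × replicates
Z = [rand(Float32, 16, 16, 1, m) for _ in 1:K]   # Vector{Array{Float32, 4}}
```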
8 changes: 4 additions & 4 deletions src/Estimators.jl
@@ -108,7 +108,7 @@ where
- ``g(⋅)`` is a logistic function that maps its input to the prior support.
Note that, in addition to ensuring that the interval remains in the prior support,
-this constructions also ensures that the intervals are valid (i.e., it prevents
+this construction also ensures that the intervals are valid (i.e., it prevents
quantile crossing, in the sense that the upper bound is always greater than the
lower bound).
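The mechanism can be illustrated generically (a sketch, with illustrative names `l` and `u` for the unconstrained network outputs and the prior support taken as (0, 1) for simplicity; this is not the package's exact internal definition):

```julia
g(x) = 1 / (1 + exp(-x))           # logistic: monotone map into the support (0, 1)
lower(Z) = g.(l(Z))                # lower bound, mapped into the prior support
upper(Z) = g.(l(Z) .+ exp.(u(Z)))  # exp(⋅) > 0 and g increasing ⇒ upper > lower
```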
@@ -188,7 +188,7 @@ end
PointIntervalEstimator(arch_point, arch_lower, arch_upper)
PointIntervalEstimator(arch_point, arch_bound)
PointIntervalEstimator(arch)
-A neural estimator that jointly produces point estimates, θ̂(Z), where θ̂(Z) is a
+A neural estimator that jointly produces point estimates, ``θ̂(Z)``, where ``θ̂(⋅)`` is a
neural point estimator with architecture `arch_point`, and credible intervals constructed as,
```math
@@ -363,8 +363,8 @@ they see fit using arbitrary `Flux` code; see
[here](https://fluxml.ai/Flux.jl/stable/models/layers/) for `Flux`'s API reference.
# Keyword arguments
-- `architecture::String`: for unstructured data, one may use a densely-connected neural network (`"DNN"`); for data collected over a grid, a convolutional neural network (`"CNN"`); and for graphical or irregular spatial data, a graphical neural network (`"GNN"`).
-- `d::Integer = 1`: dimension of the response variable (e.g., `d = 1` for univariate processes).
+- `architecture::String`: for unstructured multivariate data, one may use a densely-connected neural network (`"DNN"`); for data collected over a grid, a convolutional neural network (`"CNN"`); and for graphical or irregular spatial data, a graphical neural network (`"GNN"`).
+- `d::Integer = 1`: for unstructured multivariate data (i.e., when `architecture = "DNN"`), the dimension of the data (e.g., `d = 3` for trivariate data); otherwise, if `architecture ∈ ["CNN", "GNN"]`, the argument `d` controls the number of input channels (e.g., `d = 1` for univariate spatial processes).
- `estimator_type::String = "point"`: the type of estimator; either `"point"` or `"interval"`.
- `depth = 3`: the number of hidden layers; either a single integer or an integer vector of length two specifying the depth of the inner (summary) and outer (inference) network of the DeepSets framework.
- `width = 32`: a single integer or an integer vector of length `sum(depth)` specifying the width (or number of convolutional filters/channels) in each hidden layer.
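A call might then look as follows (a sketch: the positional argument `p` and the defaults shown are assumptions based on the keyword list above, not a confirmed signature):

```julia
# Point estimator for p = 3 parameters from trivariate i.i.d. replicates (assumed API)
θ̂ = initialise_estimator(3; architecture = "DNN", d = 3, depth = 3, width = 32)
```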
2 changes: 1 addition & 1 deletion src/NeuralEstimators.jl
@@ -57,7 +57,7 @@ include("densities.jl")
export train, trainx, subsetdata
include("train.jl")

-export assess, Assessment, merge, risk
+export assess, Assessment, merge, risk, bias, rmse
include("assess.jl")

export plotrisk, plotdistribution
73 changes: 63 additions & 10 deletions src/assess.jl
@@ -33,32 +33,31 @@ function merge(assessment::Assessment, assessments::Assessment...)
end

@doc raw"""
-risk(assessment::Assessment; loss = (x, y) -> abs(x - y), average_over_parameters = true)
+risk(assessment::Assessment; ...)
-Computes a Monte Carlo approximation of the Bayes risk,
+Computes a Monte Carlo approximation of an estimator's Bayes risk,
```math
-r_{\Omega}(\hat{\boldsymbol{\theta}}(\cdot))
+r(\hat{\boldsymbol{\theta}}(\cdot))
\approx
-\frac{1}{K} \sum_{\boldsymbol{\theta} \in \vartheta} \frac{1}{J} \sum_{\boldsymbol{Z} \in \mathcal{Z}_{\boldsymbol{\theta}}} L(\boldsymbol{\theta}, \hat{\boldsymbol{\theta}}(\boldsymbol{Z})).
+\frac{1}{K} \sum_{k=1}^K L(\boldsymbol{\theta}^{(k)}, \hat{\boldsymbol{\theta}}(\boldsymbol{Z}^{(k)})),
```
-where ``\vartheta`` denotes a set of ``K`` parameter vectors sampled from the
-prior ``\Omega(\cdot)`` and, for each ``\boldsymbol{\theta} \in \vartheta``, we
-have ``J`` sets of ``m`` mutually independent realisations from the model
-collected in ``\mathcal{Z}_{\boldsymbol{\theta}}``.
+where ``\{\boldsymbol{\theta}^{(k)} : k = 1, \dots, K\}`` denotes a set of ``K`` parameter vectors sampled from the
+prior and, for each ``k``, data ``\boldsymbol{Z}^{(k)}`` are simulated from the statistical model conditional on ``\boldsymbol{\theta}^{(k)}``.
# Keyword arguments
- `loss = (x, y) -> abs(x - y)`: a binary operator defining the loss function (default absolute-error loss).
- `average_over_parameters::Bool = true`: if true (default), the loss is averaged over all parameters; otherwise, the loss is averaged over each parameter separately.
- `average_over_sample_sizes::Bool = true`: if true (default), the loss is averaged over all sample sizes ``m``; otherwise, the loss is averaged over each sample size separately.
"""
-function risk(assessment::Assessment;
+risk(assessment::Assessment; args...) = risk(assessment.df; args...)
+
+function risk(df::DataFrame;
	loss = (x, y) -> abs(x - y),
	average_over_parameters::Bool = true,
	average_over_sample_sizes::Bool = true)

-	df = assessment.df
grouping_variables = [:estimator]
if !average_over_parameters push!(grouping_variables, :parameter) end
if !average_over_sample_sizes push!(grouping_variables, :m) end
@@ -69,6 +68,60 @@ function risk(assessment::Assessment;
return df
end
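Given an `Assessment` object produced by `assess` (as exercised in the tests below), the approximation can then be computed under different losses and groupings, e.g.:

```julia
r_abs = risk(assessment)                                   # absolute-error loss (default)
r_sq  = risk(assessment; loss = (x, y) -> (x - y)^2)       # squared-error loss
r_par = risk(assessment; average_over_parameters = false)  # one risk value per parameter
```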

@doc raw"""
bias(assessment::Assessment; ...)
Computes a Monte Carlo approximation of an estimator's bias,
```math
{\rm{bias}}(\hat{\boldsymbol{\theta}}(\cdot))
\approx
\frac{1}{K} \sum_{k=1}^K \left\{\hat{\boldsymbol{\theta}}(\boldsymbol{Z}^{(k)}) - \boldsymbol{\theta}^{(k)}\right\},
```
where ``\{\boldsymbol{\theta}^{(k)} : k = 1, \dots, K\}`` denotes a set of ``K`` parameter vectors sampled from the
prior and, for each ``k``, data ``\boldsymbol{Z}^{(k)}`` are simulated from the statistical model conditional on ``\boldsymbol{\theta}^{(k)}``.
This function inherits the keyword arguments of [`risk`](@ref) (excluding the argument `loss`).
"""
bias(assessment::Assessment; args...) = bias(assessment.df; args...)

function bias(df::DataFrame; args...)

df = risk(df; loss = (x, y) -> x - y, args...)

rename!(df, :risk => :bias)

return df
end

@doc raw"""
rmse(assessment::Assessment; ...)
Computes a Monte Carlo approximation of an estimator's root-mean-squared error,
```math
{\rm{rmse}}(\hat{\boldsymbol{\theta}}(\cdot))
\approx
\sqrt{\frac{1}{K} \sum_{k=1}^K (\hat{\boldsymbol{\theta}}(\boldsymbol{Z}^{(k)}) - \boldsymbol{\theta}^{(k)})^2},
```
where ``\{\boldsymbol{\theta}^{(k)} : k = 1, \dots, K\}`` denotes a set of ``K`` parameter vectors sampled from the
prior and, for each ``k``, data ``\boldsymbol{Z}^{(k)}`` are simulated from the statistical model conditional on ``\boldsymbol{\theta}^{(k)}``.
This function inherits the keyword arguments of [`risk`](@ref) (excluding the argument `loss`).
"""
rmse(assessment::Assessment; args...) = rmse(assessment.df; args...)

function rmse(df::DataFrame; args...)

df = risk(df; loss = (x, y) -> (x - y)^2, args...)

df[:, :risk] = sqrt.(df[:, :risk])
rename!(df, :risk => :rmse)

return df
end
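Both helpers simply reuse `risk` with a different loss, so they accept the same grouping keywords; given an existing `assessment`, for example:

```julia
b = bias(assessment)   # mean error; negative entries indicate underestimation
r = rmse(assessment)   # root-mean-squared error from the same Monte Carlo draws
```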

# ---- assess() ----

11 changes: 11 additions & 0 deletions test/runtests.jl
@@ -534,10 +534,21 @@ m = 10 # default sample size

@test typeof(merge(assessment, assessment)) == Assessment
risk(assessment)
risk(assessment, loss = (x, y) -> (x - y)^2)
risk(assessment; average_over_parameters = false)
risk(assessment; average_over_sample_sizes = false)
risk(assessment; average_over_parameters = false, average_over_sample_sizes = false)

bias(assessment)
bias(assessment; average_over_parameters = false)
bias(assessment; average_over_sample_sizes = false)
bias(assessment; average_over_parameters = false, average_over_sample_sizes = false)

rmse(assessment)
rmse(assessment; average_over_parameters = false)
rmse(assessment; average_over_sample_sizes = false)
rmse(assessment; average_over_parameters = false, average_over_sample_sizes = false)

# J == 5 > 1
Z_test = simulator(parameters, m, 5)
assessment = assess([θ̂], parameters, Z_test, use_gpu = use_gpu, verbose = verbose)
