# Bayesian Bernoulli GLM {#bern-glm}
A Bernoulli distribution is a discrete distribution for dealing with data with only two possible outcomes, such as success and failure or presence and absence. The Bernoulli GLM is for strictly binary data and is sometimes called a logistic GLM (or just "logistic regression"). In ecology, a Bernoulli GLM is a particularly useful statistical tool for modelling presence/absence data.
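Formally, if $y_i \in \{0, 1\}$ denotes the outcome for observation $i$ (e.g. presence rather than absence) and $\pi_i$ is the probability of a 'success', the Bernoulli probability mass function is

$$\Pr(Y_i = y_i) = \pi_i^{\,y_i}(1 - \pi_i)^{1 - y_i},$$

so that $\Pr(Y_i = 1) = \pi_i$ and $\Pr(Y_i = 0) = 1 - \pi_i$.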
## Common cuckoo parasitism of great reed warbler nests {#bern-cc}
The common cuckoo (_Cuculus canorus_) is an avian brood parasite, laying its eggs in the nests of its host species which subsequently rear the parasitic cuckoo chick to independence. The common cuckoo exploits a range of hosts, with each individual female cuckoo a host specialist, exploiting a single or small group of host species. An important host is the great reed warbler (_Acrocephalus arundinaceus_), which nests in reedbeds associated with waterbodies across lowland Europe. Because brood parasitism imposes a severe fitness cost on hosts, hosts have evolved defensive adaptations, including cryptic nesting behaviour as well as aggression towards cuckoos; reed warblers are capable of killing cuckoos that attempt to lay in their nests [@_ulc_2020].
Here we fit a Bernoulli GLM to data on the occurrence of cuckoo parasitism in reed warbler nests in reedbeds associated with a marshland area in the South Moravian region of the Czech Republic. Data were collected for 18 nests during a single breeding season. Each nest was monitored over the entire breeding season, with a record made of whether or not the nest suffered cuckoo brood parasitism. Data were also collected for each nest on the distance to the nearest tree; trees serve as vantage points from which female cuckoos observe the location of reed warbler nests. An additional habitat variable measured was the distance of each nest to open water, which provides a measure of how inaccessible a nest was in a given reedbed. Finally, the aggressive response of each pair of reed warblers to a dummy cuckoo placed 1 m from their nest was measured, with the response scored as either low-level aggression (no reaction or distress calling) or high-level aggression (mobbing and attacking).
In addition to these data, a set of pilot data, comprising just 7 records, were collected in the year preceding the main study using identical methods. These data were used to assign priors to the main model.
*__Import data__*
```{r ch7-libraries, echo=FALSE, warning=FALSE, message=FALSE}
library(lattice)
library(ggplot2)
library(GGally)
library(tidyverse)
library(mgcv)
library(lme4)
library(car)
library(devtools)
library(ggpubr)
library(qqplotr)
library(geiger)
library(gridExtra)
library(rlang)
library(INLA)
library(brinla)
library(inlatools)
```
Data for cuckoo parasitism are saved in a comma-separated values (CSV) file `cuckoo.csv` and are imported into a dataframe in R using:
`cc <- read_csv("cuckoo.csv")`
```{r ch7-csv-cc, echo=FALSE, warning=FALSE, message=FALSE}
cc <- read_csv("cuckoo.csv")
```
Start by inspecting the dataframe:
`str(cc)`
```{r ch7-str-cc, comment = "", echo=FALSE, warning=FALSE, message=FALSE}
str(cc, vec.len=2)
```
The dataframe comprises `r nrow(cc)` observations of `r ncol(cc)` variables. Each row in the dataframe represents a unique reed warbler nest (`nest`). There are two continuous ecological variables: `water` is the distance (in m) of each nest to open water, which represents the edge of the reedbed, and `tree` is the distance (in m) to the nearest tree that could act as a vantage point for female cuckoos to observe the reed warbler nest. There is a single categorical variable, `aggress`, with two levels: `high` and `low`, representing the level of the aggressive response of each pair of reed warblers to a dummy cuckoo. The binary variable `egg` is the response variable and indicates whether the nest escaped cuckoo parasitism (0) or received at least one cuckoo egg (1).
## Steps in fitting a Bayesian GLM {#bern-glm-steps}
We will follow the 9 steps to fitting a Bayesian GLM detailed in Chapter \@ref(fit-steps).
_1. State the question_
_2. Perform data exploration_
_3. Select a statistical model_
_4. Specify and justify a prior distribution on parameters_
_5. Fit the model_
_6. Obtain the posterior distribution_
_7. Conduct model checks_
_8. Interpret and present model output_
_9. Visualise the results_
### State the question {#cc-question}
The aim of the study was to identify whether the three explanatory variables (`water`, `tree`, and `aggress`) contributed to the probability that a great reed warbler nest would be parasitised by common cuckoos. The prediction was that the probability of parasitism would be negatively associated with the distance of a nest to open water and to the nearest tree. A greater likelihood of parasitism was predicted where the pair of nesting reed warblers showed a low-level aggressive response to a dummy cuckoo.
### Data exploration {#cc-eda}
A data exploration is critical to identifying any potential problems or unusual patterns in the data. First check for missing data.
`colSums(is.na(cc))`
```{r ch7-nas, comment = "", echo=FALSE, warning=FALSE, message=FALSE}
colSums(is.na(cc))
```
There are no missing data.
#### Outliers {#cc-outliers}
Outliers in the data can be identified visually using Cleveland dotplots; R code is available in the R script associated with this chapter:
(ref:ch7-dotplot) **Dotplots of distance to open water (`water`) and to the nearest tree (`tree`) for reed warbler nests. Data are arranged in the order they appear in the dataframe.**
```{r ch7-dotplot, fig.cap='(ref:ch7-dotplot)', fig.align='center', fig.dim=c(6, 4), cache = TRUE, message = FALSE, echo=FALSE, warning=FALSE}
cc <- cc %>%
mutate(order = seq(1:nrow(cc)))
# Set preferred theme
My_theme <- theme(axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.ticks.x=element_blank(),
panel.background = element_blank(),
panel.border = element_rect(fill = NA, size = 1),
strip.background = element_rect(fill = "white",
color = "white", size = 1),
text = element_text(size = 14),
panel.grid.major = element_line(colour = "white", size = 0.1),
panel.grid.minor = element_line(colour = "white", size = 0.1))
#Write function
multi_dotplot <- function(filename, Xvar, Yvar){
filename %>%
ggplot(aes(x = {{Xvar}})) +
geom_point(aes(y = {{Yvar}})) +
theme_bw() +
My_theme +
coord_flip() +
labs(x = "Order of Data")
}
#CHOOSE VARIABLE FOR EACH PLOT AND APPLY FUNCTION
p1 <- multi_dotplot(cc, order, water)
p2 <- multi_dotplot(cc, order, tree)
#CREATE GRID
grid.arrange(p1, p2, nrow = 1)
```
There are no obvious outliers in Fig. \@ref(fig:ch7-dotplot).
#### Distribution of the dependent variable {#bern-dist}
The dependent variable is binary, comprising 0s and 1s. We can examine the balance of the response variable:
`table(cc$egg)`
```{r ch7-egg-bal, comment = "", echo=FALSE, warning=FALSE, message=FALSE}
table(cc$egg)
```
This shows that a total of 8 reed warbler nests escaped parasitism, while 10 received at least one cuckoo egg.
#### Balance of categorical variables {#bern-balance}
The balance of the categorical variable `aggress` (the aggressive response of each pair of reed warblers to a dummy cuckoo) is:
`table(cc$aggress)`
```{r ch7-aggress-bal, comment = "", echo=FALSE, warning=FALSE, message=FALSE}
table(cc$aggress)
```
The balance of this variable is perfect.
#### Multicollinearity among covariates {#bern-collin}
If covariates in a model are correlated, then there is a risk that the model may produce unstable parameter estimates with inflated standard errors. We visualise the relationships among model covariates using the `ggpairs` command from the `GGally` library:
`cc %>% ggpairs(columns = c("water", "tree", "aggress"), ggplot2::aes(colour=aggress, alpha = 0.8))`
(ref:ch7-ggpairs) **Plot matrix of covariates showing frequency plots, boxplots, frequency histograms, and frequency polygons.**
```{r ch7-ggpairs, fig.cap='(ref:ch7-ggpairs)', fig.align='center', fig.dim=c(6, 4), cache = TRUE, message = FALSE, echo=FALSE, warning=FALSE}
cc %>%
ggpairs(columns = c("water", "tree", "aggress"), aes(colour=aggress, alpha = 0.8), lower = list(combo = wrap("facethist", binwidth = 15))) + My_theme
```
The plot matrix in Fig. \@ref(fig:ch7-ggpairs) indicates no evidence of collinearity.
#### Zeros in the response variable {#bern-zeros}
The proportion of zeros in the response variable can be calculated with:
`round((sum(cc$egg == 0) / nrow(cc))*100,0)`
`r round((sum(cc$egg == 0) / nrow(cc))*100,0)`
Unsurprisingly, the response variable contains a high proportion of zeros (44%). Given that `egg` is a binary variable, this number of zeros is not problematic.
#### Relationships among dependent and independent variables {#bern-rels}
Visual inspection of the data using plots helps illustrate the nature of relationships among covariates (code for the plot is available in the R script associated with this chapter):
(ref:ch7-scatter) **Multipanel scatterplot of the probability of cuckoo parasitism against distance of nests to: A. open water; and B. the nearest tree, for high and low aggression pairs of reed warblers.**
```{r ch7-scatter, fig.cap='(ref:ch7-scatter)', fig.align='center', fig.dim=c(6, 4), cache = TRUE, message = FALSE, echo=FALSE, warning=FALSE}
label_agg <- c("low" = "Low aggression",
"high" = "High aggression")
water_plot <- ggplot() +
geom_jitter(data = cc, position = position_jitter
(width = 0.05, height = 0.05),
aes(y = egg, x = water, size = 1, alpha = 0.8)) +
xlab("Distance to open water (m)") +
ylab("Probability of cuckoo egg") +
ylim(-0.1,1.1) +
theme(text = element_text(size=11)) +
theme(panel.background = element_blank()) +
theme(panel.border = element_rect(fill = NA,
colour = "black", size = 1)) +
theme(strip.background = element_rect
(fill = "white", color = "white", size = 1)) +
facet_grid(. ~ aggress,
scales = "fixed", space = "fixed",
labeller=labeller (aggress = label_agg)) +
theme(strip.text = element_text(size = 12, face="italic")) +
theme(legend.position = "none")
tree_plot <- ggplot() +
geom_jitter(data = cc, position = position_jitter
(width = 0.05, height = 0.05),
aes(y = egg, x = tree, size = 1, alpha = 0.8)) +
xlab("Distance to tree (m)") +
ylab("") +
ylim(-0.1,1.1) +
theme(text = element_text(size=11)) +
theme(panel.background = element_blank()) +
theme(panel.border = element_rect(fill = NA,
colour = "black", size = 1)) +
theme(strip.background = element_rect
(fill = "white", color = "white", size = 1)) +
facet_grid(. ~ aggress,
scales = "fixed", space = "fixed",
labeller=labeller (aggress = label_agg)) +
theme(strip.text = element_text(size = 12, face="italic")) +
theme(legend.position = "none")
# Combine plots
ggarrange(water_plot, tree_plot,
labels = c("A", "B"),
ncol = 2, nrow = 1)
```
The plots in Fig. \@ref(fig:ch7-scatter) suggest a greater risk of parasitism in the case of a low aggression response from reed warbler pairs. There is also some indication of a negative relationship between tree distance and the probability of parasitism. There is no clear effect of distance to open water or of an interaction among variables.
#### Independence of response variable {#bern-depend}
An assumption in fitting a GLM is that each observation is independent of all others. In the dataframe here, each row represents a separate nest guarded by a unique pair of reed warblers. This species is territorial and so, while pairs may have interacted on the edges of their territories, there was a minimum distance between nests and we will treat the data as independent, while recognising that some level of dependency is typical of much ecological data.
### Selection of a statistical model {#bern-select}
The aim of this study is to identify the contribution of three ecological and behavioural variables to the probability of cuckoo brood parasitism. The dependent variable is binomial with a single observation per ‘trial’; i.e. a nest was either parasitised or not during the breeding season. Strictly then, this is not a binomial distribution but a Bernoulli distribution and will be modelled with a logit link function. The logit link function ensures that fitted model probabilities fall between 0 and 1.
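In terms of the variables in the dataframe, the model can be written as

$$egg_i \sim \text{Bernoulli}(\pi_i), \qquad \text{logit}(\pi_i) = \log\!\left(\frac{\pi_i}{1 - \pi_i}\right) = \beta_{intercept} + \beta_{water} \times water_i + \beta_{tree} \times tree_i + \beta_{aggressionlow} \times aggresslow_i,$$

where $\pi_i$ is the probability that nest $i$ receives a cuckoo egg and $aggresslow_i$ is 1 for a pair showing a low-level aggressive response and 0 otherwise.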
### Specification of priors {#bern-prior-spec}
Informative priors for the model were based on data from a pilot study conducted in the year before the main study at the same location.
#### Pilot study
A small-scale trial in which just 7 great reed warbler nests were monitored for cuckoo parasitism was conducted in the year preceding the main trial. All reed warblers in the pilot and main study were ringed and none used in the pilot were subsequently monitored in the main study.
Table 7.1: **Pilot data to investigate predictors of cuckoo parasitism of great reed warbler nests. The distance of each nest to open water and to the nearest tree, and the aggressive response of the reed warbler pair to a dummy cuckoo, were measured. The incidence of cuckoo brood parasitism of great reed warbler nests was monitored throughout the breeding season (mid-May to mid-July).**
|Nest|Open water (m)|Tree (m)|Pair aggression|Cuckoo egg|
|:--:|:------------:|:------:|:-------------:|:--------:|
|a |100 |42 |high |1 |
|b |136 |97 |high |0 |
|c |175 |61 |low |0 |
|d |69 |54 |low |1 |
|e |66 |43 |high |0 |
|f |84 |21 |low |1 |
|g |133 |31 |low |1 |
*__Import pilot data__*
Pilot data are saved in the file `ccpilot.csv` and are imported into a dataframe in R using:
`ccpilot <- read_csv("ccpilot.csv")`
```{r ch7-csv-pilot, echo=FALSE, warning=FALSE, message=FALSE}
ccpilot <- read_csv("ccpilot.csv")
```
Start by inspecting the dataframe:
```{r ch7-str-pilot, comment = "", cache = TRUE, message = FALSE, echo=FALSE, warning=FALSE}
str(ccpilot)
```
The dataframe comprises `r nrow(ccpilot)` observations of `r ncol(ccpilot)` variables, which are identical to those in the main study.
#### Model pilot data
We proceed by fitting a Bernoulli GLM to obtain parameter estimates to use as priors. We first formulate the model:
`f01 <- egg ~ water + tree + aggress`
Then fit the model:
`P01 <- inla(f01, family = "binomial", Ntrials = 1, data = ccpilot)`
The numerical output is obtained using:
`P01Betas <- P01$summary.fixed[,c("mean", "sd", "0.025quant", "0.975quant")]`
`round(P01Betas, digits = 2)`
```{r ch7-betas-pilot, comment = "", cache = TRUE, message = FALSE, echo=FALSE, warning=FALSE}
# Formulate model
f01 <- egg ~ water + tree + aggress
# Run with INLA
P01 <- inla(f01,
family = "binomial",
Ntrials = 1,
data = ccpilot)
# Obtain summary of fixed effects
P01Betas <- P01$summary.fixed[,c("mean", "sd",
"0.025quant",
"0.975quant")]
round(P01Betas, digits = 2)
```
The output provides parameter estimates that can be incorporated into our Bayesian model as informative priors for the full analysis.
#### Priors on the fixed effects {#bern-priors-fixed}
Non-informative (default) priors were placed on the fixed effects for model `M01` (written here as _N_(mean, variance)):
$\beta_{intercept}$ ~ _N_(0, $\infty$) ($\tau$ = 0)
$\beta_{water}$ ~ _N_(0, 1000) ($\tau$ = 0.001)
$\beta_{tree}$ ~ _N_(0, 1000) ($\tau$ = 0.001)
$\beta_{aggressionlow}$ ~ _N_(0, 1000) ($\tau$ = 0.001)
The findings from the pilot data can be specified in model `I01` as informative priors on the fixed effects:
$\beta_{intercept}$ ~ _N_(1.68, 3.53) ($\tau$ = 0.28)
$\beta_{water}$ ~ _N_(-0.01, 0.0004) ($\tau$ = 2500)
$\beta_{tree}$ ~ _N_(-0.03, 0.0009) ($\tau$ = 1111)
$\beta_{aggressionlow}$ ~ _N_(1.83, 2.56) ($\tau$ = 0.39)
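The precisions supplied to INLA in the next section are simply the reciprocals of the squared standard deviations taken from the pilot model ($\tau = 1/\text{sd}^2$). A minimal sketch of that conversion, using the standard deviations that appear in the `control.fixed` argument below:

```{r ch7-sd-to-prec, eval=FALSE}
# Convert pilot posterior standard deviations to precisions (tau = 1/sd^2)
pilot.sd <- c(intercept = 1.88, water = 0.02, tree = 0.03, aggresslow = 1.60)
round(1 / pilot.sd^2, 2)
#  intercept      water       tree aggresslow
#       0.28    2500.00    1111.11       0.39
```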
### Fit the models {#bern-fit-models}
We fit the Bayesian Bernoulli GLM using INLA with default priors (`M01`), and informative priors on the fixed effects (`I01`).
The model formula (`f01`) is:
`f01 <- egg ~ water + tree + aggress`
We fit the default model `M01`, specifying `control.compute = list(dic = TRUE)` to enable model comparison:
`M01 <- inla(f01, family = "binomial", Ntrials = 1, data = cc, control.compute = list(dic = TRUE))`
Then fit the model with informative priors:
`I01 <- inla(f01, family = "binomial", Ntrials = 1, data = cc, control.compute = list(dic = TRUE), control.fixed = list(mean.intercept = 1.68, prec.intercept = 1.88^(-2), mean = list(water = -0.01, tree = -0.03, aggresslow = 1.83), prec = list(water = 0.02^(-2), tree = 0.03^(-2), aggresslow = 1.6^(-2))))`
```{r ch7-fit-models, comment = "", cache = TRUE, message = FALSE, echo=FALSE, warning=FALSE}
# Specify model formula
f01 <- egg ~ water + tree + aggress
# Model M01 with default priors
M01 <- inla(f01, family = "binomial",
Ntrials = 1, data = cc,
control.compute = list(dic = TRUE))
# Models I01 with informative priors (based on pilot)
I01 <- inla(f01, family = "binomial", Ntrials = 1, data = cc,
control.compute = list(dic = TRUE),
control.fixed = list(mean.intercept = 1.68,
prec.intercept = 1.88^(-2),
mean = list(water = -0.01,
tree = -0.03,
aggresslow = 1.83),
prec = list(water = 0.02^(-2),
tree = 0.03^(-2),
aggresslow = 1.6^(-2))))
```
### Obtain the posterior distribution {#bern-post-dist}
#### Model with default priors
Output for the fixed effects of M01 can be obtained with:
`M01Betas <- M01$summary.fixed[,c("mean", "mode", "sd", "0.025quant", "0.975quant")]`
`round(M01Betas, digits = 2)`
```{r ch7-def-post, comment = "", cache = TRUE, message = FALSE, echo=FALSE, warning=FALSE}
M01Betas <- M01$summary.fixed[,c("mean", "mode", "sd",
"0.025quant",
"0.975quant")]
round(M01Betas, digits = 2)
```
This reports the posterior mean, mode, standard deviation, and 95% credible interval bounds for the model intercept and covariates.
For the slope of the variable `water` the posterior mean is `r round(M01Betas$mean[2], 2)`, with a lower 95% credible interval of `r round(M01Betas$'0.025quant'[2], 2)` and an upper 95% credible interval of `r round(M01Betas$'0.975quant'[2], 2)`. This indicates that we can be 95% certain that the regression parameter for the slope of `water` lies within this interval; because zero falls outside the interval, `water` is statistically important in the model with uninformative priors.
We can similarly conclude that there is a statistically important negative effect of distance to the nearest tree and a positive effect of parental reed warbler aggressive response on the probability of cuckoo parasitism.
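The posterior marginals can also be interrogated directly. As a sketch, the posterior probability that a slope is positive (or negative) can be obtained with `inla.pmarginal()`, the same function used for the posterior predictive checks later in this chapter:

```{r ch7-pmarginal-sketch, eval=FALSE}
# Posterior probability that the slope for water is greater than zero
1 - inla.pmarginal(0, M01$marginals.fixed$water)

# Posterior probability that the slope for tree is less than zero
inla.pmarginal(0, M01$marginals.fixed$tree)
```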
The posterior distributions of the fixed effects can be visualized using `ggplot2`. See the R script associated with this chapter.
(ref:ch7-M01-betas) **Posterior and prior distributions for fixed parameters of a Bayesian Bernoulli GLM to model the probability of common cuckoo brood parasitism of great reed warbler nests. The model is fitted with default (non-informative) priors. Distributions for: A. model intercept; B. slope for distance to open water; C. slope for distance to nearest tree; D. effect of parental reed warbler aggressive response. The solid black line is the posterior distribution, the solid gray line is the prior distribution, the gray shaded area encompasses the 95% credible intervals, the vertical dashed line is the posterior mode of the parameter, the vertical dotted line indicates zero. For parameters where zero (indicated by dotted line) falls outside the range of the 95% credible intervals (gray shaded area), the parameter is statistically important.**
```{r ch7-M01-betas, fig.cap='(ref:ch7-M01-betas)', fig.align='center', fig.dim=c(6, 4), cache = TRUE, message = FALSE, echo=FALSE, warning=FALSE}
PosteriorBeta1.M01 <- as.data.frame(M01$marginals.fixed$`(Intercept)`)
PriorBeta1.M01 <- data.frame(x = PosteriorBeta1.M01[,"x"],
y = dnorm(PosteriorBeta1.M01[,"x"],0,0))
Beta1mean.M01 <- M01Betas["(Intercept)", "mode"]
Beta1lo.M01 <- M01Betas["(Intercept)", "0.025quant"]
Beta1up.M01 <- M01Betas["(Intercept)", "0.975quant"]
beta1 <- ggplot() +
annotate("rect", xmin = Beta1lo.M01, xmax = Beta1up.M01,
ymin = 0, ymax = 0.011, fill = "gray88") +
geom_line(data = PosteriorBeta1.M01,
aes(y = y, x = x), lwd = 1.2) +
geom_line(data = PriorBeta1.M01,
aes(y = y, x = x), color = "gray55", lwd = 1.2) +
xlab("Intercept") + ylab("Density") +
xlim(0,410) + ylim(0,0.011) +
geom_vline(xintercept = 0, linetype = "dotted") +
geom_vline(xintercept = Beta1mean.M01, linetype = "dashed") +
theme(text = element_text(size=13)) +
theme(panel.background = element_blank()) +
theme(panel.border = element_rect(fill = NA,
colour = "black", size = 1)) +
theme(strip.background = element_rect
(fill = "white", color = "white", size = 1))
# beta1
# water (Beta2)
PosteriorBeta2.M01 <- as.data.frame(M01$marginals.fixed$`water`)
PriorBeta2.M01 <- data.frame(x = PosteriorBeta2.M01[,"x"],
y = dnorm(PosteriorBeta2.M01[,"x"],0,0.001))
Beta2mean.M01 <- M01Betas["water", "mode"]
Beta2lo.M01 <- M01Betas["water", "0.025quant"]
Beta2up.M01 <- M01Betas["water", "0.975quant"]
beta2 <- ggplot() +
annotate("rect", xmin = Beta2lo.M01, xmax = Beta2up.M01,
ymin = 0, ymax = 2.5, fill = "gray88") +
geom_line(data = PosteriorBeta2.M01,
aes(y = y, x = x), lwd = 1.2) +
geom_line(data = PriorBeta2.M01,
aes(y = y, x = x), color = "gray55", lwd = 1.2) +
xlab("Slope for water") + ylab("Density") +
xlim(-0.5,1.5) + ylim(0,2.5) +
geom_vline(xintercept = 0, linetype = "dotted") +
geom_vline(xintercept = Beta2mean.M01, linetype = "dashed") +
theme(text = element_text(size=13)) +
theme(panel.background = element_blank()) +
theme(panel.border = element_rect(fill = NA,
colour = "black", size = 1)) +
theme(strip.background = element_rect
(fill = "white", color = "white", size = 1))
# beta2
# tree (Beta3)
PosteriorBeta3.M01 <- as.data.frame(M01$marginals.fixed$`tree`)
PriorBeta3.M01 <- data.frame(x = PosteriorBeta3.M01[,"x"],
y = dnorm(PosteriorBeta3.M01[,"x"],0,0.001))
Beta3mean.M01 <- M01Betas["tree", "mode"]
Beta3lo.M01 <- M01Betas["tree", "0.025quant"]
Beta3up.M01 <- M01Betas["tree", "0.975quant"]
beta3 <- ggplot() +
annotate("rect", xmin = Beta3lo.M01, xmax = Beta3up.M01,
ymin = 0, ymax = 0.35, fill = "gray88") +
geom_line(data = PosteriorBeta3.M01,
aes(y = y, x = x), lwd = 1.2) +
geom_line(data = PriorBeta3.M01,
aes(y = y, x = x), color = "gray55", lwd = 1.2) +
xlab("Slope for tree") + ylab("Density") +
xlim(-12,2) + ylim(0,0.35) +
geom_vline(xintercept = 0, linetype = "dotted") +
geom_vline(xintercept = Beta3mean.M01, linetype = "dashed") +
theme(text = element_text(size=13)) +
theme(panel.background = element_blank()) +
theme(panel.border = element_rect(fill = NA,
colour = "black", size = 1)) +
theme(strip.background = element_rect
(fill = "white", color = "white", size = 1))
# beta3
# aggresslow (beta4)
PosteriorBeta4.M01 <- as.data.frame(M01$marginals.fixed$`aggresslow`)
PriorBeta4.M01 <- data.frame(x = PosteriorBeta4.M01[,"x"],
y = dnorm(PosteriorBeta4.M01[,"x"],0,0))
Beta4mean.M01 <- M01Betas["aggresslow", "mode"]
Beta4lo.M01 <- M01Betas["aggresslow", "0.025quant"]
Beta4up.M01 <- M01Betas["aggresslow", "0.975quant"]
beta4 <- ggplot() +
annotate("rect", xmin = Beta4lo.M01, xmax = Beta4up.M01,
ymin = 0, ymax = 0.038, fill = "gray88") +
geom_line(data = PosteriorBeta4.M01,
aes(y = y, x = x), lwd = 1.2) +
geom_line(data = PriorBeta4.M01,
aes(y = y, x = x), color = "gray55", lwd = 1.2) +
xlab("Aggression") + ylab("Density") +
xlim(40,160) + ylim(0,0.038) +
geom_vline(xintercept = 0, linetype = "dotted") +
geom_vline(xintercept = Beta4mean.M01, linetype = "dashed") +
theme(text = element_text(size=13)) +
theme(panel.background = element_blank()) +
theme(panel.border = element_rect(fill = NA,
colour = "black", size = 1)) +
theme(strip.background = element_rect
(fill = "white", color = "white", size = 1))
# beta4
# Combine plots (Fig 7.4)
ggarrange(beta1, beta2, beta3, beta4,
labels = c("A", "B", "C", "D"),
ncol = 2, nrow = 2)
```
Figure \@ref(fig:ch7-M01-betas) provides a visual summary of the model fixed effects, and indicates that for model M01 the betas are all statistically important. Note that the posterior distributions are asymmetric; this feature reflects a problem with the model that is addressed in Section \@ref(bern-freq-comp).
#### Model with informative priors {#bern-inf-priors}
The output for the model with informative priors is obtained with:
`I01Betas <- I01$summary.fixed[,c("mean", "mode", "sd", "0.025quant", "0.975quant")]`
`round(I01Betas, digits = 2)`
```{r ch7-Inf-betas, comment = "", cache = TRUE, message = FALSE, echo=FALSE, warning=FALSE}
I01Betas <- I01$summary.fixed[,c("mean", "mode", "sd", "0.025quant", "0.975quant")]
round(I01Betas, digits = 2)
```
The results for the informative model differ both qualitatively and quantitatively from the model with default priors. Parameter estimates are strikingly different and distance to open water is not statistically important in the model.
We can calculate the posterior odds ratio for a model parameter by exponentiating its coefficient. For `tree` the odds ratio is _exp_(-0.063) = 0.94; thus we predict an average 6% (= 1 - 0.94) decrease in the odds of a reed warbler nest receiving a cuckoo egg for each 1 m increase in distance from the nearest tree, with 95% certainty that this decrease falls between 2% and 10%. For comparison, the prior mean for `tree` (-0.03) corresponds to an odds ratio of _exp_(-0.03) = 0.97, i.e. a 3% decrease.
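A sketch of this calculation from the model object: because the exponential function is monotonic, the 95% interval for the odds ratio can be obtained by exponentiating the posterior quantiles of the coefficient; alternatively, the full marginal can be transformed with `inla.tmarginal()` and summarised with `inla.zmarginal()`:

```{r ch7-oddsratio-sketch, eval=FALSE}
# Odds ratio for tree with 95% credible interval
# (note: exp(mean) approximates the odds ratio; the quantiles transform exactly)
round(exp(I01$summary.fixed["tree", c("mean", "0.025quant", "0.975quant")]), 2)

# Or transform the full posterior marginal to the odds-ratio scale
or.tree <- inla.tmarginal(exp, I01$marginals.fixed$tree)
inla.zmarginal(or.tree)
```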
The posterior distributions can be visualized with `ggplot2`: see the R script associated with this chapter.
(ref:ch7-I01-betas) **Posterior and prior distributions for fixed parameters of a Bayesian Bernoulli GLM to model the probability of common cuckoo brood parasitism of great reed warbler nests fitted with informative priors. Distributions for: A. model intercept; B. slope for distance to open water; C. slope for distance to nearest tree; D. effect of parental reed warbler aggressive response. The solid black line is the posterior distribution, the solid gray line is the prior distribution, the gray shaded area encompasses the 95% credible intervals, the vertical dashed line is the posterior mode of the parameter, the vertical dotted line indicates zero. For parameters where zero (indicated by dotted line) falls outside the range of the 95% credible intervals (gray shaded area), the parameter is statistically important.**
```{r ch7-I01-betas, fig.cap='(ref:ch7-I01-betas)', fig.align='center', fig.dim=c(6, 4), cache = TRUE, message = FALSE, echo=FALSE, warning=FALSE}
I01Betas <- I01$summary.fixed[,c("mean", "mode", "sd",
"0.025quant",
"0.975quant")]
# Plot posterior distributions for fixed parameters
# Model intercept (Beta1)
PosteriorBeta1.I01 <- as.data.frame(I01$marginals.fixed$`(Intercept)`)
PriorBeta1.I01 <- data.frame(x = PosteriorBeta1.I01[,"x"],
y = dnorm(PosteriorBeta1.I01[,"x"],1.68,1.88))
Beta1mean.I01 <- I01Betas["(Intercept)", "mode"]
Beta1lo.I01 <- I01Betas["(Intercept)", "0.025quant"]
Beta1up.I01 <- I01Betas["(Intercept)", "0.975quant"]
Ibeta1 <- ggplot() +
annotate("rect", xmin = Beta1lo.I01, xmax = Beta1up.I01,
ymin = 0, ymax = 0.38, fill = "gray88") +
geom_line(data = PosteriorBeta1.I01,
aes(y = y, x = x), lwd = 1.2) +
geom_line(data = PriorBeta1.I01,
aes(y = y, x = x), color = "gray55", lwd = 1.2) +
xlab("Intercept") + ylab("Density") +
xlim(-3,7.5) + ylim(0,0.38) +
geom_vline(xintercept = 0, linetype = "dotted") +
geom_vline(xintercept = Beta1mean.I01, linetype = "dashed") +
theme(text = element_text(size=13)) +
theme(panel.background = element_blank()) +
theme(panel.border = element_rect(fill = NA,
colour = "black", size = 1)) +
theme(strip.background = element_rect
(fill = "white", color = "white", size = 1))
# Ibeta1
# water (Beta2)
PosteriorBeta2.I01 <- as.data.frame(I01$marginals.fixed$`water`)
PriorBeta2.I01 <- data.frame(x = PosteriorBeta2.I01[,"x"],
y = dnorm(PosteriorBeta2.I01[,"x"], -0.01, 0.02))
Beta2mean.I01 <- I01Betas["water", "mode"]
Beta2lo.I01 <- I01Betas["water", "0.025quant"]
Beta2up.I01 <- I01Betas["water", "0.975quant"]
Ibeta2 <- ggplot() +
annotate("rect", xmin = Beta2lo.I01, xmax = Beta2up.I01,
ymin = 0, ymax = 45, fill = "gray88") +
geom_line(data = PosteriorBeta2.I01,
aes(y = y, x = x), lwd = 1.2) +
geom_line(data = PriorBeta2.I01,
aes(y = y, x = x), color = "gray55", lwd = 1.2) +
xlab("Slope for water") + ylab("Density") +
xlim(-0.06,0.05) + ylim(0,45) +
geom_vline(xintercept = 0, linetype = "dotted") +
geom_vline(xintercept = Beta2mean.I01, linetype = "dashed") +
theme(text = element_text(size=13)) +
theme(panel.background = element_blank()) +
theme(panel.border = element_rect(fill = NA,
colour = "black", size = 1)) +
theme(strip.background = element_rect
(fill = "white", color = "white", size = 1))
# Ibeta2
# tree (Beta3)
PosteriorBeta3.I01 <- as.data.frame(I01$marginals.fixed$`tree`)
PriorBeta3.I01 <- data.frame(x = PosteriorBeta3.I01[,"x"],
y = dnorm(PosteriorBeta3.I01[,"x"],-0.03, 0.03))
Beta3mean.I01 <- I01Betas["tree", "mode"]
Beta3lo.I01 <- I01Betas["tree", "0.025quant"]
Beta3up.I01 <- I01Betas["tree", "0.975quant"]
Ibeta3 <- ggplot() +
annotate("rect", xmin = Beta3lo.I01, xmax = Beta3up.I01,
ymin = 0, ymax = 22, fill = "gray88") +
geom_line(data = PosteriorBeta3.I01,
aes(y = y, x = x), lwd = 1.2) +
geom_line(data = PriorBeta3.I01,
aes(y = y, x = x), color = "gray55", lwd = 1.2) +
xlab("Slope for tree") + ylab("Density") +
xlim(-0.15,0.05) + ylim(0,22) +
geom_vline(xintercept = 0, linetype = "dotted") +
geom_vline(xintercept = Beta3mean.I01, linetype = "dashed") +
theme(text = element_text(size=13)) +
theme(panel.background = element_blank()) +
theme(panel.border = element_rect(fill = NA,
colour = "black", size = 1)) +
theme(strip.background = element_rect
(fill = "white", color = "white", size = 1))
# Ibeta3
# aggresslow (beta4)
PosteriorBeta4.I01 <- as.data.frame(I01$marginals.fixed$`aggresslow`)
PriorBeta4.I01 <- data.frame(x = PosteriorBeta4.I01[,"x"],
y = dnorm(PosteriorBeta4.I01[,"x"],1.83, 1.6))
Beta4mean.I01 <- I01Betas["aggresslow", "mode"]
Beta4lo.I01 <- I01Betas["aggresslow", "0.025quant"]
Beta4up.I01 <- I01Betas["aggresslow", "0.975quant"]
Ibeta4 <- ggplot() +
annotate("rect", xmin = Beta4lo.I01, xmax = Beta4up.I01,
ymin = 0, ymax = 0.42, fill = "gray88") +
geom_line(data = PosteriorBeta4.I01,
aes(y = y, x = x), lwd = 1.2) +
geom_line(data = PriorBeta4.I01,
aes(y = y, x = x), color = "gray55", lwd = 1.2) +
xlab("Aggression") + ylab("Density") +
xlim(-2.5,7.5) + ylim(0,0.42) +
geom_vline(xintercept = 0, linetype = "dotted") +
geom_vline(xintercept = Beta4mean.I01, linetype = "dashed") +
theme(text = element_text(size=13)) +
theme(panel.background = element_blank()) +
theme(panel.border = element_rect(fill = NA,
colour = "black", size = 1)) +
theme(strip.background = element_rect
(fill = "white", color = "white", size = 1))
# Ibeta4
# Combine plots
ggarrange(Ibeta1, Ibeta2, Ibeta3, Ibeta4,
labels = c("A", "B", "C", "D"),
ncol = 2, nrow = 2)
```
Figure \@ref(fig:ch7-I01-betas) illustrates that for model I01 the betas for model intercept, tree and aggression all differ from zero and are statistically important. In contrast, the slope for water does not differ from zero. Note also that the posterior distributions are normally distributed, in contrast to those for the model with non-informative priors (Fig. \@ref(fig:ch7-M01-betas)).
#### Comparison of models with uninformative and informative priors {#bern-prior-comp}
We can compare the results of the Bernoulli GLMs with uninformative and informative priors using the DIC:
First extract DICs:
`InfDIC <- c(M01$dic$dic, I01$dic$dic)`
Add weighting:
`InfDIC.weights <- aicw(InfDIC)`
Add names:
`rownames(InfDIC.weights) <- c("default","informative")`
Print DICs:
`dprint.inf <- print (InfDIC.weights, abbrev.names = FALSE)`
Order DICs by fit:
`round(dprint.inf[order(dprint.inf$fit),],2)`
```{r ch7-DIC, comment = "", cache = TRUE, message = FALSE, echo=FALSE, warning=FALSE}
# Extract DICs
InfDIC <- c(M01$dic$dic, I01$dic$dic)
# Add weighting
InfDIC.weights <- aicw(InfDIC)
# Add names
rownames(InfDIC.weights) <- c("default","informative")
# Print DICs
dprint.inf <- print (InfDIC.weights,
abbrev.names = FALSE)
# Order DICs by fit
round(dprint.inf[order(dprint.inf$fit),],2)
```
The DIC score for the informative model is substantially lower than that for the model with uninformative priors. Model weight ($\omega$) is the probability that a given model is the best model (based on DIC), given the data and the alternative models. In this case the model with informative priors is the most probable of the two.
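The weights reported by `aicw()` follow the standard Akaike-weight formula, here applied to DIC; a minimal sketch of the calculation, assuming that formula:

```{r ch7-dicw-sketch, eval=FALSE}
# DIC weights by hand: w_i = exp(-0.5 * delta_i) / sum_j exp(-0.5 * delta_j)
dic   <- c(default = M01$dic$dic, informative = I01$dic$dic)
delta <- dic - min(dic)
round(exp(-0.5 * delta) / sum(exp(-0.5 * delta)), 2)
```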
#### Comparison with frequentist Bernoulli GLM {#bern-freq-comp}
We can compare the results of the Bayesian Bernoulli GLMs with the same model fitted in a frequentist setting. The model can be fitted in a frequentist framework with:
`Freq <- glm(egg ~ water + tree + aggress, family = binomial (link = "logit"), data = cc)`
`round(summary(Freq)$coef[,1:4],2)`
```{r ch7-freq_comp, comment = "", cache = TRUE, message = FALSE, echo=FALSE, warning=TRUE}
Freq <- suppressWarnings(glm(egg ~ water + tree + aggress,
family = binomial(link = "logit"),
data = cc))
round(summary(Freq)$coef[,1:4],2)
```
`Warning:`
`glm.fit: algorithm did not converge`
`Warning:`
`glm.fit: fitted probabilities numerically 0 or 1 occurred`
Note the warning messages: these arise because some of the fitted probabilities are extremely close to zero or one, a symptom of separation in the data. The standard errors are also unrealistically large, making the model estimates unreliable.
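As a quick confirmation of this, the fitted probabilities from the frequentist model can be inspected directly (a sketch using the `Freq` object fitted above):

```{r ch7-sep-check, eval=FALSE}
# Fitted probabilities pushed to the 0/1 boundary indicate (quasi-)separation
summary(fitted(Freq))
```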
We can compare this output with the results for the two Bayesian models:
Table 7.2: **Comparison of parameter estimates (with standard errors or posterior standard deviations in parentheses) for the frequentist Bernoulli GLM and the Bayesian Bernoulli GLMs with non-informative and informative priors, fitted to investigate common cuckoo brood parasitism of great reed warblers in South Moravia.**
|Model |Intercept |water |tree |aggression |
|:-----------------|:---------:|:---------:|:---------:|:---------:|
|Frequentist |185(480365)|0.39(3689) |-5.71(13125)|82.4(346541)|
|Bayesian (default)|196(45.0) |0.37(0.18) |-5.41(1.41)|98.6(13.3)|
|Bayesian (inform) |2.90(1.23) |-0.01(0.01)|-0.06(0.02)|2.47(1.08)|
Parameter estimates for the frequentist model and the Bayesian model with non-informative priors are somewhat similar, though their uncertainty estimates are decidedly different. Results for the Bayesian model with informative priors are starkly different from those of the other two models.
How should we view these contrasting results? The failure of the frequentist model to converge highlights a problem with the model that is not resolved by fitting a Bayesian model with default priors. It is noteworthy that putting informative priors on model parameters had the effect of stabilising estimates, and the inclusion of at least weakly informative priors in this situation is a recommended solution to this problem [@Gelman_2008]. As a consequence, the model with informative priors is the one on which we can place greater reliance and with which we will conduct model checks.
### Conduct model checks
After model fitting and obtaining the posterior distributions, a next step is validation of the model through model checks. However, model validation of Bernoulli GLMs is rather difficult, given the binary nature of the response variable.
#### Model selection using the Deviance Information Criterion (DIC) {#bern-dic}
We can perform a simple model selection by removing model parameters and comparing models using the DIC. Note, however, that model selection is not an obligatory step, and leaving the model as originally formulated is perfectly acceptable; see [@Smith_etal_2020] for a fuller discussion of model selection. We start by formulating a set of alternative models:
`f01 <- egg ~ water + tree + aggress`
`f02 <- egg ~ water + tree`
`f03 <- egg ~ water + aggress`
`f04 <- egg ~ tree + aggress`
To use DIC we must re-run the model with informative priors and specify `control.compute = list(dic = TRUE)`. See the R script associated with this chapter.
Compare models with the DIC:
```{r ch7-bern-dic, comment = "", cache = TRUE, message = FALSE, echo=FALSE, warning=FALSE}
f01 <- egg ~ water + tree + aggress
f02 <- egg ~ water + tree
f03 <- egg ~ water + aggress
f04 <- egg ~ tree + aggress
# To use DIC we must re-run the model and specify its calculation using
# 'control.compute'
# Compare models with informative priors
I01.full <- inla(f01, family = "binomial", Ntrials = 1, data = cc,
control.compute = list(dic = TRUE),
control.fixed = list(mean.intercept = 1.68,
prec.intercept = 1.88^(-2),
mean = list(water = -0.01,
tree = -0.03,
aggresslow = 1.83),
prec = list(water = 0.02^(-2),
tree = 0.03^(-2),
aggresslow = 1.6^(-2))))
I01.1 <- inla(f02, family = "binomial", Ntrials = 1, data = cc,
control.compute = list(dic = TRUE),
control.fixed = list(mean.intercept = 1.68,
prec.intercept = 1.88^(-2),
mean = list(water = -0.01,
tree = -0.03,
aggresslow = 1.83),
prec = list(water = 0.02^(-2),
tree = 0.03^(-2),
aggresslow = 1.6^(-2))))
I01.2 <- inla(f03, family = "binomial", Ntrials = 1, data = cc,
control.compute = list(dic = TRUE),
control.fixed = list(mean.intercept = 1.68,
prec.intercept = 1.88^(-2),
mean = list(water = -0.01,
tree = -0.03,
aggresslow = 1.83),
prec = list(water = 0.02^(-2),
tree = 0.03^(-2),
aggresslow = 1.6^(-2))))
I01.3 <- inla(f04, family = "binomial", Ntrials = 1, data = cc,
control.compute = list(dic = TRUE),
control.fixed = list(mean.intercept = 1.68,
prec.intercept = 1.88^(-2),
mean = list(water = -0.01,
tree = -0.03,
aggresslow = 1.83),
prec = list(water = 0.02^(-2),
tree = 0.03^(-2),
aggresslow = 1.6^(-2))))
# Compare models with the DIC
I01dic <- c(I01.full$dic$dic, I01.1$dic$dic,
I01.2$dic$dic, I01.3$dic$dic)
DIC <- cbind(I01dic)
rownames(DIC) <- c("water + tree + aggress",
"water + tree",
"water + aggress",
"tree + aggress")
round(DIC,1)
```
The full model (`water` + `tree` + `aggress`) and the model with water dropped (`tree` + `aggress`) provide the best fit, but the two are essentially indistinguishable. In this case we will choose the simpler of the two (i.e. `tree` + `aggress`).
We can repeat this process to further refine the model (see R script associated with this chapter), though this process demonstrates that dropping further variables does not improve model fit based on the DIC.
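As a sketch of that further refinement (the object names `f05`, `f06`, `I01.4`, and `I01.5` are used here for illustration only), the two single-covariate models could be fitted with the same informative priors and compared by DIC against the retained model:

```{r ch7-dic-further, eval=FALSE}
# Single-covariate models with the same informative priors as above
f05 <- egg ~ tree
f06 <- egg ~ aggress

I01.4 <- inla(f05, family = "binomial", Ntrials = 1, data = cc,
              control.compute = list(dic = TRUE),
              control.fixed = list(mean.intercept = 1.68,
                                   prec.intercept = 1.88^(-2),
                                   mean = list(tree = -0.03),
                                   prec = list(tree = 0.03^(-2))))

I01.5 <- inla(f06, family = "binomial", Ntrials = 1, data = cc,
              control.compute = list(dic = TRUE),
              control.fixed = list(mean.intercept = 1.68,
                                   prec.intercept = 1.88^(-2),
                                   mean = list(aggresslow = 1.83),
                                   prec = list(aggresslow = 1.6^(-2))))

# Compare DICs with the two-covariate model retained above (I01.3)
round(c("tree + aggress" = I01.3$dic$dic,
        "tree"           = I01.4$dic$dic,
        "aggress"        = I01.5$dic$dic), 1)
```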
#### Posterior predictive checks {#bern-ppc}
Posterior predictive checks can be used to assess whether a model generates realistic predictions by drawing simulated estimates from the joint posterior predictive distribution and comparing them with the observed data using a posterior predictive p-value. In the case of a Bernoulli distribution this is problematic, since the outcome can only take values of 0 or 1.
Here we compare the predicted outcome for each reed warbler nest in terms of whether they were parasitised by a cuckoo egg or not, with observed parasitism. See the R script associated with this chapter for estimating and plotting the posterior predictive p-values for the model.
(ref:ch7-ppcplot) **Plot of the posterior predictive p-values (ppp) and observed occurrence of cuckoo eggs in reed warbler nests for the Bayesian Bernoulli GLM with informative priors for each nest in the study. Black points are observed outcomes and gray points are simulated model predictions.**
```{r ch7-ppcplot, fig.cap='(ref:ch7-ppcplot)', fig.align='center', fig.dim=c(6, 4), cache = TRUE, message = FALSE, echo=FALSE, warning=FALSE}
I01.pred <- inla(f04, family = "binomial", Ntrials = 1, data = cc,
control.predictor = list(link = 1,
compute = TRUE),
control.compute = list(dic = TRUE,
cpo = TRUE),
control.fixed = list(mean.intercept = 1.68,
prec.intercept = 1.88^(-2),
mean = list(water = -0.01,
tree = -0.03,
aggresslow = 1.83),
prec = list(water = 0.02^(-2),
tree = 0.03^(-2),
aggresslow = 1.6^(-2))))
ppp <- vector(mode = "numeric", length = nrow(cc))
for(i in (1:nrow(cc))) {
ppp[i] <- inla.pmarginal(q = cc$egg[i],
marginal = I01.pred$marginals.fitted.values[[i]])
}
cc$ppp <- ppp-0.04
ggplot() +
geom_vline(xintercept = 1:18, linetype = "dotted") +
geom_point(data = cc, aes(y = egg, x = nest),
shape = 19, size = 4, colour = "black",) +
geom_point(data = cc, aes(y = ppp, x = nest),
shape = 19, size = 4, colour = "gray60",) +
ylab("ppp/egg") + xlab("Nest number") +
scale_x_continuous(breaks = c(1:18)) +
theme(text = element_text(size=15)) +
theme(panel.background = element_blank()) +
theme(panel.border = element_rect(fill = NA,
colour = "black", size = 1)) +
theme(strip.background = element_rect
(fill = "white", color = "white", size = 1))
```
The plot of posterior predictive p-values and observed cuckoo parasitism of nests in Fig. \@ref(fig:ch7-ppcplot) shows that simulated and observed nest outcomes correspond in every case.
#### Bayesian residuals analysis {#bern-resids}
The homogeneity of residual variance can be assessed visually by plotting model residual variance against fitted values as well as each variable in the model (see the R script associated with this chapter).
(ref:ch7-bern-resids) **Bayesian residuals plotted against: A. fitted values; B. distance to nearest tree; and C. parental aggression, to assess homogeneity of residual variance.**
```{r ch7-bern-resids, fig.cap='(ref:ch7-bern-resids)', fig.align='center', fig.dim=c(6, 4), cache = TRUE, message = FALSE, echo=FALSE, warning=FALSE}
Fit <- I01.pred$summary.fitted.values[, "mean"]
# Calculate residuals
Res <- cc$egg - Fit
ResPlot <- cbind.data.frame(Fit,Res,cc$tree,cc$aggress)
# Plot residuals against fitted
Res1 <- ggplot(ResPlot, aes(x=Fit, y=Res)) +
geom_point(shape = 19, size = 3) +
geom_hline(yintercept = 0, linetype = "dashed") +
ylab("Bayesian residuals") + xlab("Fitted values") +
theme(text = element_text(size=13)) +
theme(panel.background = element_blank()) +
theme(panel.border = element_rect(fill = NA,
colour = "black", size = 1)) +
theme(strip.background = element_rect
(fill = "white", color = "white", size = 1))
# And plot residuals against variables in the model
Res2 <- ggplot(ResPlot, aes(x=cc$tree, y=Res)) +
geom_point(shape = 19, size = 3) +
geom_hline(yintercept = 0, linetype = "dashed") +
ylab("") + xlab("Nearest tree (m)") +
theme(text = element_text(size=13)) +
theme(panel.background = element_blank()) +
theme(panel.border = element_rect(fill = NA,
colour = "black", size = 1)) +
theme(strip.background = element_rect
(fill = "white", color = "white", size = 1))
Res3 <- ggplot(ResPlot, aes(x=cc$aggress, y=Res)) +
geom_boxplot(fill = "grey88", colour = "black") +
geom_hline(yintercept = 0, linetype = "dashed") +
ylab("") + xlab("Aggression") +
theme(text = element_text(size=13)) +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
theme(panel.background = element_blank()) +
theme(panel.border = element_rect(fill = NA,
colour = "black", size = 1)) +
theme(strip.background = element_rect
(fill = "white", color = "white", size = 1)) +
theme(legend.position = "none")
# Combine plots
# Fig. 7.7
ggarrange(Res1, Res2, Res3,
labels = c("A", "B", "C"),
ncol = 3, nrow = 1)
```
Residual plots for Bernoulli models are difficult to interpret, particularly if the number of observations is low (as in this case). Ideally, the distribution of residuals around zero should be random along the horizontal axis and, in the case of a categorical variable (such as Fig. \@ref(fig:ch7-bern-resids)C), the median of a boxplot of residuals should be approximately zero. For Figs. \@ref(fig:ch7-bern-resids)A-C the pattern of residuals is not perfect, but appears acceptable.
#### Prior sensitivity analysis {#bern-sens}
A sensitivity analysis involves systematically changing the prior distributions and examining the effect on the posterior distribution.
We investigated prior sensitivity by increasing and decreasing the priors on the fixed effects by 20% and examining the outcome for the posterior means (see the R script associated with this chapter for the full analysis).
Results for the model with the informative priors unchanged:
```{r ch7-sense, cache = TRUE, message = FALSE, echo=FALSE, warning=FALSE}
# Model with priors unchanged
I01.un <- inla(f04, family = "binomial", Ntrials = 1, data = cc,
control.predictor = list(link = 1,
compute = TRUE),
control.compute = list(dic = TRUE,
cpo = TRUE),
control.fixed = list(mean.intercept = 1.68,
prec.intercept = 1.88^(-2),
mean = list(tree = -0.03,
aggresslow = 1.83),
prec = list(tree = 0.03^(-2),
aggresslow = 1.6^(-2))))
Betas.un <- I01.un$summary.fixed[,c("mean", "sd",
"0.025quant",
"0.975quant")]
round(Betas.un, 2)
```
Results for the model with informative priors increased by 20%:
```{r ch7-sense-increased, cache = TRUE, message = FALSE, echo=FALSE, warning=FALSE}
#== 1. Increase priors by 20% ==
# Intercept from 1.68 to 2.02 (sd 1.88 to 2.26)
# tree from -0.03 to -0.04 (sd 0.03 to 0.036)
# aggresslow from 1.83 to 2.2 (sd 1.60 to 1.92)
M1.plus20 <- inla(f04, family = "binomial", Ntrials = 1, data = cc,
control.predictor = list(link = 1,