```{r, include=FALSE}
source("common.R")
```
# (PART) Miscellaneous {-}
# Deprecated
## Conditions
__[Q1]{.Q}__: What does `options(error = recover)` do? Why might you use it?
__[A]{.solved}__: With `options(error = recover)`, `utils::recover()` is called (without arguments) whenever an error occurs. It prints the list of calls that preceded the error and lets you browse (via `browser()`) in the environment of any of these calls, which makes it a practical tool for interactive debugging.
__[Q2]{.Q}__: What does `options(error = quote(dump.frames(to.file = TRUE)))` do? Why might you use it?
__[A]{.solved}__: This option writes a dump of the evaluation environments at the point of an error into a file ending in `.rda`. When this option is set, R will continue to run after the first error. To stop R at the first error, use `quote({dump.frames(to.file = TRUE); q()})`. These options are especially useful for debugging non-interactive R scripts afterwards ("post-mortem debugging").
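A minimal sketch of this post-mortem workflow (`last.dump.rda` is the default file name produced by `dump.frames(to.file = TRUE)`):

```{r, eval = FALSE}
# In the non-interactive script: dump all frames to a file and quit on error.
options(error = quote({dump.frames(to.file = TRUE); q(status = 1)}))

# Later, in an interactive session: load the dump and browse the frames.
load("last.dump.rda")
debugger(last.dump)
```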
## Expressions (new)
1. __[Q]{.Q}__: `base::alist()` is useful for creating pairlists to be used for function arguments:
```{r}
foo <- function() {}
formals(foo) <- alist(x = , y = 1)
foo
```
What makes `alist()` special compared to `list()`?
__[A]{.solved}__: From `?alist`:
> alist handles its arguments as if they described function arguments. So the values are not evaluated, and tagged arguments with no value are allowed whereas list simply ignores them. alist is most often used in conjunction with formals.
## Functionals
### My first functional: `lapply()`
1. __[Q]{.Q}__: Why are the following two invocations of `lapply()` equivalent?
```{r, eval = FALSE}
trims <- c(0, 0.1, 0.2, 0.5)
x <- rcauchy(100)
lapply(trims, function(trim) mean(x, trim = trim))
lapply(trims, mean, x = x)
```
__[A]{.solved}__: In the first statement, each element of `trims` is explicitly supplied to `mean()`'s `trim` argument inside an anonymous function. In the second statement this happens via positional matching: since `mean()`'s first argument `x` is supplied by name through `lapply()`'s `...`, each element of `trims` is matched to the first remaining argument of `mean()`, which is `trim`.
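We can also verify the equivalence empirically; both calls produce identical output for the same `x`:

```{r}
set.seed(42)
x <- rcauchy(100)
trims <- c(0, 0.1, 0.2, 0.5)
out1 <- lapply(trims, function(trim) mean(x, trim = trim))
out2 <- lapply(trims, mean, x = x)
identical(out1, out2)  # TRUE
```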
2. __[Q]{.Q}__: The function below scales a vector so it falls in the range [0, 1]. How
would you apply it to every column of a data frame? How would you apply it
to every numeric column in a data frame?
```{r}
scale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}
```
__[A]{.solved}__: Since this function needs numeric input, we can check for it with an `if` clause. To return the non-numeric input columns unchanged as well, we supply them to the `else` branch of the `if()` "function":
```{r, eval = FALSE}
data.frame(lapply(mtcars, function(x) if (is.numeric(x)) scale01(x) else x))
```
3. __[Q]{.Q}__: Use both for loops and `lapply()` to fit linear models to the
`mtcars` using the formulas stored in this list:
```{r}
formulas <- list(
  mpg ~ disp,
  mpg ~ I(1 / disp),
  mpg ~ disp + wt,
  mpg ~ I(1 / disp) + wt
)
```
__[A]{.solved}__: Like in the first exercise, we can create two `lapply()` versions:
```{r, eval = TRUE}
# lapply (2 versions)
la1 <- lapply(formulas, lm, data = mtcars)
la2 <- lapply(formulas, function(x) lm(formula = x, data = mtcars))
# for loop
lf1 <- vector("list", length(formulas))
for (i in seq_along(formulas)) {
  lf1[[i]] <- lm(formulas[[i]], data = mtcars)
}
```
Note that all versions return the same content, but they are not identical, since the value of the "call" element differs between the versions.
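A quick self-contained check of this point: the coefficients agree, but the stored calls differ.

```{r}
m_lapply <- lapply(list(mpg ~ disp), lm, data = mtcars)[[1]]
m_direct <- lm(mpg ~ disp, data = mtcars)
all.equal(coef(m_lapply), coef(m_direct))  # TRUE
identical(m_lapply$call, m_direct$call)    # FALSE
```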
4. __[Q]{.Q}__: Fit the model `mpg ~ disp` to each of the bootstrap replicates of `mtcars`
in the list below by using a for loop and `lapply()`. Can you do it
without an anonymous function?
```{r, eval = TRUE}
bootstraps <- lapply(1:10, function(i) {
  rows <- sample(1:nrow(mtcars), replace = TRUE)
  mtcars[rows, ]
})
```
__[A]{.solved}__:
```{r, eval = TRUE}
# lapply without anonymous function
la <- lapply(bootstraps, lm, formula = mpg ~ disp)
# for loop
lf <- vector("list", length(bootstraps))
for (i in seq_along(bootstraps)) {
  lf[[i]] <- lm(mpg ~ disp, data = bootstraps[[i]])
}
```
5. __[Q]{.Q}__: For each model in the previous two exercises, extract $R^2$ using the
function below.
```{r, eval = TRUE}
rsq <- function(mod) summary(mod)$r.squared
```
__[A]{.solved}__: For the models in exercise 3:
```{r, eval = TRUE}
sapply(la1, rsq)
sapply(la2, rsq)
sapply(lf1, rsq)
```
And the models in exercise 4:
```{r, eval = TRUE}
sapply(la, rsq)
sapply(lf, rsq)
```
### For loop functionals: friends of `lapply()`
1. __[Q]{.Q}__: Use `vapply()` to:
a) Compute the standard deviation of every column in a numeric data frame.
a) Compute the standard deviation of every numeric column in a mixed data
frame. (Hint: you'll need to use `vapply()` twice.)
__[A]{.solved}__: As a numeric `data.frame` we choose `cars`:
```{r, eval = FALSE}
vapply(cars, sd, numeric(1))
```
And as a mixed `data.frame` we choose `mtcars`:
```{r, eval = FALSE}
vapply(mtcars[vapply(mtcars, is.numeric, logical(1))],
       sd,
       numeric(1))
```
2. __[Q]{.Q}__: Why is using `sapply()` to get the `class()` of each element in
a data frame dangerous?
__[A]{.solved}__: Columns of data frames might have more than one class, so the class of `sapply()`'s output may differ (silently) depending on the input. If ...
* all columns have one class: `sapply()` returns a character vector
* one column has more classes than the others: `sapply()` returns a list
* all columns have the same number of classes, which is more than one: `sapply()` returns a matrix
For example:
```{r}
a <- letters[1:3]
class(a) <- c("class1", "class2")
df <- data.frame(a = character(3))
df$a <- a
df$b <- a
class(sapply(df, class))
```
Note that this case often appears while working with the POSIXt types, POSIXct and POSIXlt.
3. __[Q]{.Q}__: The following code simulates the performance of a t-test for non-normal
data. Use `sapply()` and an anonymous function to extract the p-value from
every trial.
```{r}
trials <- replicate(
  100,
  t.test(rpois(10, 10), rpois(7, 10)),
  simplify = FALSE
)
```
Extra challenge: get rid of the anonymous function by using `[[` directly.
__[A]{.solved}__:
```{r, eval = FALSE}
# anonymous function:
sapply(trials, function(x) x[["p.value"]])
# without anonymous function:
sapply(trials, "[[", "p.value")
```
4. __[Q]{.Q}__: What does `replicate()` do? What sort of for loop does it eliminate? Why
do its arguments differ from `lapply()` and friends?
__[A]{.solved}__: As stated in `?replicate`:
> replicate is a wrapper for the common use of sapply for repeated evaluation of an expression (which will usually involve random number generation).
We can see this clearly in the source code:
```{r, echo = FALSE}
replicate
```
Like `sapply()`, `replicate()` eliminates a for loop. As explained for `Map()` in the textbook, every use of `replicate()` could also have been written with `lapply()`, but `replicate()` is more concise and indicates more clearly what you're trying to do. Its arguments differ because it takes an expression to be re-evaluated and a count `n`, rather than a function and data to iterate over.
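Since both formulations evaluate the random expression once per iteration, they draw from the same RNG stream and give identical results under the same seed:

```{r}
set.seed(1)
a <- replicate(3, rnorm(2), simplify = FALSE)
set.seed(1)
b <- lapply(1:3, function(i) rnorm(2))
identical(a, b)  # TRUE
```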
5. __[Q]{.Q}__: Implement a version of `lapply()` that supplies `FUN` with both the name
and the value of each component.
__[A]{.solved}__:
```{r, eval = TRUE}
lapply_nms <- function(X, FUN, ...) {
  Map(FUN, X, names(X), ...)
}
lapply_nms(mtcars, function(x, y) c(class(x), y))
```
6. __[Q]{.Q}__: Implement a combination of `Map()` and `vapply()` to create an `lapply()`
variant that iterates in parallel over all of its inputs and stores its
outputs in a vector (or a matrix). What arguments should the function
take?
__[A]{.solved}__: As we understand this exercise, it is about working with a list of lists, as in the following example:
```{r}
testlist <- list(mtcars, mtcars, cars)
lapply(testlist, function(x) vapply(x, mean, numeric(1)))
```
So we can get the same result with a more specialized function:
```{r}
lmapply <- function(X, FUN, FUN.VALUE, simplify = FALSE) {
  out <- Map(function(x) vapply(x, FUN, FUN.VALUE), X)
  if (simplify == TRUE) {return(simplify2array(out))}
  out
}
lmapply(testlist, mean, numeric(1))
```
7. __[Q]{.Q}__: Implement `mcsapply()`, a multi-core version of `sapply()`. Can you
implement `mcvapply()`, a parallel version of `vapply()`? Why or why not?
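One possible sketch of `mcsapply()` builds on `parallel::mclapply()` (the function name and the conservative `mc.cores = 1L` default are our own choices; on Windows, `mclapply()` only supports `mc.cores = 1`). A true `mcvapply()` is harder: `vapply()` checks the type and length of each result while the results are generated inside C code, and this check cannot be distributed across worker processes; one could only validate after collecting all results, which loses part of `vapply()`'s safety benefit.

```{r}
library(parallel)

# A sketch: parallel lapply(), then simplify the result like sapply() does.
mcsapply <- function(X, FUN, ..., mc.cores = 1L, simplify = TRUE) {
  out <- mclapply(X, FUN, ..., mc.cores = mc.cores)
  if (isTRUE(simplify)) simplify2array(out) else out
}

mcsapply(1:3, function(x) x ^ 2)  # 1 4 9
```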
### Manipulating matrices and data frames
1. __[Q]{.Q}__: How does `apply()` arrange the output? Read the documentation and perform
some experiments.
__[A]{.solved}__:
`apply()` arranges its output columns (or list elements) according to the order of the margin. The rows are ordered by the other dimensions, starting with the "last" dimension of the input object. What this means should become clear by looking at the three- and four-dimensional cases of the following example:
```{r, eval = FALSE}
# For two-dimensional cases everything is sorted by the other dimension.
arr2 <- array(1:9, dim = c(3, 3),
              dimnames = list(paste0("row", 1:3), paste0("col", 1:3)))
arr2
apply(arr2, 1, head, 1)  # margin is row
apply(arr2, 1, head, 9)  # sorts by col
apply(arr2, 2, head, 1)  # margin is col
apply(arr2, 2, head, 9)  # sorts by row

# 3-dimensional
arr3 <- array(1:27, dim = c(3, 3, 3),
              dimnames = list(paste0("row", 1:3), paste0("col", 1:3),
                              paste0("time", 1:3)))
arr3
apply(arr3, 1, head, 1)   # margin is row
apply(arr3, 1, head, 27)  # sorts by time and col
apply(arr3, 2, head, 1)   # margin is col
apply(arr3, 2, head, 27)  # sorts by time and row
apply(arr3, 3, head, 1)   # margin is time
apply(arr3, 3, head, 27)  # sorts by col and row

# 4-dimensional
arr4 <- array(1:81, dim = c(3, 3, 3, 3),
              dimnames = list(paste0("row", 1:3), paste0("col", 1:3),
                              paste0("time", 1:3), paste0("var", 1:3)))
arr4
apply(arr4, 1, head, 1)   # margin is row
apply(arr4, 1, head, 81)  # sorts by var, time, col
apply(arr4, 2, head, 1)   # margin is col
apply(arr4, 2, head, 81)  # sorts by var, time, row
apply(arr4, 3, head, 1)   # margin is time
apply(arr4, 3, head, 81)  # sorts by var, col, row
apply(arr4, 4, head, 1)   # margin is var
apply(arr4, 4, head, 81)  # sorts by time, col, row
```
2. __[Q]{.Q}__: There's no equivalent to `split()` + `vapply()`. Should there be? When
would it be useful? Implement one yourself.
__[A]{.solved}__: We can modify the `tapply2()` approach from the book, where `split()` and `sapply()` were combined:
```{r, eval = FALSE}
v_tapply <- function(x, group, f, FUN.VALUE, ..., USE.NAMES = TRUE) {
  pieces <- split(x, group)
  vapply(pieces, f, FUN.VALUE, ..., USE.NAMES = USE.NAMES)
}
```
`tapply()` has a `simplify` argument. When you set it to `FALSE`, `tapply()` will always return a list. It is easy to create cases where the length and the types/classes of the list elements vary depending on the input. The `vapply()` version could be useful if you want to control the structure of the output, so that an error is raised according to the logic of a specific use case, or if you want type-stable output on which to build other functions.
3. __[Q]{.Q}__: Implement a pure R version of `split()`. (Hint: use `unique()` and
subsetting.) Can you do it without a for loop?
__[A]{.solved}__:
```{r, eval = FALSE}
split2 <- function(x, f, drop = FALSE, ...) {
  # There are three relevant cases for f: f is a character vector, f is a
  # factor and all levels occur, or f is a factor and some levels don't occur.
  # First we check if f is a factor.
  fact <- is.factor(f)
  # If drop is set to TRUE, we drop the non-occurring levels.
  # (If f is a character vector, this has no effect.)
  if (drop) {f <- f[, drop = TRUE]}
  # Now we want all unique elements/levels of f.
  levs <- if (fact) {levels(f)} else {as.character(unique(f))}
  # We use these levels to subset x and supply names for the resulting output.
  setNames(lapply(levs, function(lv) x[f == lv, , drop = FALSE]), levs)
}
```
4. __[Q]{.Q}__: What other types of input and output are missing? Brainstorm before you look up some answers in the [plyr paper](http://www.jstatsoft.org/v40/i01/).
__[A]{.solved}__: From the suggested plyr paper we can extract many possible combinations and list them in a table. Sean C. Anderson has already done this, based on a presentation by Hadley Wickham, and provides the result [here](http://seananderson.ca/2013/12/01/plyr.html).
| object type | array | data frame | list | nothing |
|--------------------|-------------|--------------|-------------|-----------|
| array | `apply` | `.` | `.` | `.` |
| data frame | `.` | `aggregate` | `by` | `.` |
| list | `sapply` | `.` | `lapply` | `.` |
| n replicates | `replicate` | `.` | `replicate` | `.` |
| function arguments | `mapply` | `.` | `mapply` | `.` |
Note the column _nothing_, which is specifically for use cases where side effects such as plotting or writing data are intended.
### Manipulating lists
1. __[Q]{.Q}__: Why isn't `is.na()` a predicate function? What base R function is closest
to being a predicate version of `is.na()`?
__[A]{.solved}__: Because a predicate function always returns a single `TRUE` or `FALSE`, while `is.na(NULL)` returns `logical(0)`, which excludes `is.na()` from being a predicate function. The closest base R function that we are aware of is `anyNA()`, if applied elementwise.
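The difference is easy to see on `NULL` input:

```{r}
is.na(NULL)      # logical(0), not a single TRUE or FALSE
anyNA(NULL)      # FALSE
anyNA(c(1, NA))  # TRUE
```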
2. __[Q]{.Q}__: Use `Filter()` and `vapply()` to create a function that applies a summary
statistic to every numeric column in a data frame.
__[A]{.solved}__:
```{r, eval = FALSE}
vapply_num <- function(X, FUN, FUN.VALUE) {
  vapply(Filter(is.numeric, X), FUN, FUN.VALUE)
}
```
3. __[Q]{.Q}__: What's the relationship between `which()` and `Position()`? What's
the relationship between `where()` and `Filter()`?
__[A]{.solved}__: `which()` returns all indices of `TRUE` entries in a logical vector. `Position()` returns only the first (by default) or the last integer index among all elements for which a predicate function returns `TRUE`. So the default relation is `Position(f, x) <=> min(which(f(x)))`.
`where()`, defined in the book as:
```{r, eval = FALSE}
where <- function(f, x) {
  vapply(x, f, logical(1))
}
```
returns a logical vector obtained by applying a predicate function to the elements of a list or a data frame. `Filter(f, x)` returns all elements of a list or a data frame for which the supplied predicate function returns `TRUE`. So the relation is
`Filter(f, x) <=> x[where(f, x)]`.
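Both relations can be checked on a small example (we repeat the `where()` definition so the chunk is self-contained):

```{r}
where <- function(f, x) vapply(x, f, logical(1))
x <- list(1, "a", "b", 2)

Position(is.character, x)           # 2
min(which(where(is.character, x)))  # 2
identical(Filter(is.character, x), x[where(is.character, x)])  # TRUE
```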
4. __[Q]{.Q}__: Implement `Any()`, a function that takes a list and a predicate function,
and returns `TRUE` if the predicate function returns `TRUE` for any of
the inputs. Implement `All()` similarly.
__[A]{.solved}__: `Any()`:
```{r, eval = FALSE}
Any <- function(l, pred) {
  stopifnot(is.list(l))
  for (i in seq_along(l)) {
    if (pred(l[[i]])) return(TRUE)
  }
  return(FALSE)
}
```
`All()`:
```{r, eval = FALSE}
All <- function(l, pred) {
  stopifnot(is.list(l))
  for (i in seq_along(l)) {
    if (!pred(l[[i]])) return(FALSE)
  }
  return(TRUE)
}
```
5. __[Q]{.Q}__: Implement the `span()` function from Haskell: given a list `x` and a
predicate function `f`, `span` returns the location of the longest
sequential run of elements where the predicate is true. (Hint: you
might find `rle()` helpful.)
__[A]{.solved}__: Our `span_r()` function returns the first index of the longest sequential run of elements where the predicate is true. If there is more than one run of maximal length, the first index of each of them is returned.
```{r, eval = FALSE}
span_r <- function(l, pred) {
  # We test if l is a list.
  stopifnot(is.list(l))
  # We preallocate a logical vector and save the result
  # of the predicate function applied to each element of the list.
  test <- vector("logical", length(l))
  for (i in seq_along(l)) {
    test[i] <- pred(l[[i]])
  }
  # We return NA if the output of pred is always FALSE.
  if (!any(test)) return(NA_integer_)
  # Otherwise we look at the run length encoding of the TRUE and FALSE values.
  rle_test <- rle(test)
  # Since more than one maximal series of TRUEs might appear, we have to
  # implement some logic, which is easier if we save the rle output in a
  # data.frame.
  rle_test <- data.frame(lengths = rle_test[["lengths"]],
                         values = rle_test[["values"]],
                         cumsum = cumsum(rle_test[["lengths"]]))
  # The first index in the original list for every run:
  rle_test[["first_index"]] <- rle_test[["cumsum"]] - rle_test[["lengths"]] + 1
  # Next we calculate a column giving the maximum run length among all runs
  # with the value TRUE.
  rle_test[["max"]] <- max(rle_test[rle_test[, "values"] == TRUE, ][, "lengths"])
  # Finally we subset for the maximal length among all TRUE runs and return the
  # corresponding "first index":
  rle_test[rle_test$lengths == rle_test$max & rle_test$values == TRUE, ]$first_index
}
```
### List of functions
1. __[Q]{.Q}__: Implement a summary function that works like `base::summary()`, but uses a
list of functions. Modify the function so it returns a closure, making it
possible to use it as a function factory.
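A minimal sketch of the function-factory variant could look as follows (the names `make_summary` and `my_summary` are our own choices):

```{r}
# Function factory: given a named list of functions, return a closure
# that applies each of them to its input.
make_summary <- function(funs) {
  function(x, na.rm = TRUE) {
    vapply(funs, function(f) f(x, na.rm = na.rm), numeric(1))
  }
}

my_summary <- make_summary(list(min = min, mean = mean, max = max))
my_summary(c(1, 5, NA, 9))  # min: 1, mean: 5, max: 9
```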
1. __[Q]{.Q}__: Which of the following commands is equivalent to `with(x, f(z))`?
(a) `x$f(x$z)`.
(b) `f(x$z)`.
(c) `x$f(z)`.
(d) `f(z)`.
(e) It depends.
### Mathematical functionals
1. __[Q]{.Q}__: Implement `arg_max()`. It should take a function and a vector of inputs,
and return the elements of the input where the function returns the highest
value. For example, `arg_max(-10:5, function(x) x ^ 2)` should return -10.
`arg_max(-5:5, function(x) x ^ 2)` should return `c(-5, 5)`.
Also implement the matching `arg_min()` function.
__[A]{.solved}__: `arg_max()`:
```{r, eval = FALSE}
arg_max <- function(x, f) {
  x[f(x) == max(f(x))]
}
```
`arg_min()`:
```{r, eval = FALSE}
arg_min <- function(x, f) {
  x[f(x) == min(f(x))]
}
```
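A quick check against the expected outputs stated in the question (the definitions are repeated so the chunk runs standalone):

```{r}
arg_max <- function(x, f) x[f(x) == max(f(x))]
arg_min <- function(x, f) x[f(x) == min(f(x))]

arg_max(-10:5, function(x) x ^ 2)  # -10
arg_max(-5:5, function(x) x ^ 2)   # -5  5
arg_min(-5:5, function(x) x ^ 2)   # 0
```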
2. __[Q]{.Q}__: Challenge: read about the
[fixed point algorithm](https://mitpress.mit.edu/sites/default/files/sicp/full-text/book/book-Z-H-12.html#%25_idx_1096).
Complete the exercises using R.
### A family of functions
1. __[Q]{.Q}__: Implement `smaller` and `larger` functions that, given two inputs, return
either the smaller or the larger value. Implement `na.rm = TRUE`: what
should the identity be? (Hint:
`smaller(x, smaller(NA, NA, na.rm = TRUE), na.rm = TRUE)` must be `x`, so
`smaller(NA, NA, na.rm = TRUE)` must be bigger than any other value of x.)
Use `smaller` and `larger` to implement equivalents of `min()`, `max()`,
`pmin()`, `pmax()`, and new functions `row_min()` and `row_max()`.
__[A]{.solved}__: We can do almost everything as shown in the case study in the textbook. First we define the functions `smaller_()` and `larger_()`. We use the underscore suffix to build up non-suffixed versions on top, which will include the `na.rm` parameter. In contrast to the `add()` example from the book, we change two things at this step: we don't include error checking, since this is done later at the top level, and we return `NA_integer_` if any of the arguments is `NA`. (This matters when `na.rm` is set to `FALSE`; it wasn't needed in the `add()` example, since `+` already returns `NA` in this case.)
```{r}
smaller_ <- function(x, y) {
  if (anyNA(c(x, y))) {return(NA_integer_)}
  out <- x
  if (y < x) {out <- y}
  out
}
larger_ <- function(x, y) {
  if (anyNA(c(x, y))) {return(NA_integer_)}
  out <- x
  if (y > x) {out <- y}
  out
}
```
We can take `rm_na()` from the book:
```{r}
rm_na <- function(x, y, identity) {
  if (is.na(x) && is.na(y)) {
    identity
  } else if (is.na(x)) {
    y
  } else {
    x
  }
}
```
To find the identity value, we can apply the same argument as in the textbook, hence our functions are also associative and the following equation should hold:
```
3 = smaller(smaller(3, NA), NA) = smaller(3, smaller(NA, NA)) = 3
```
So the identity has to be greater than 3. When we generalise from 3 to any real number, this means that the identity has to be greater than every number, which leads us to infinity. Hence the identity has to be `Inf` for `smaller()` (and `-Inf` for `larger()`), which we implement next:
```{r}
smaller <- function(x, y, na.rm = FALSE) {
  stopifnot(length(x) == 1, length(y) == 1,
            is.numeric(x) | is.logical(x), is.numeric(y) | is.logical(y))
  if (na.rm && (is.na(x) || is.na(y))) rm_na(x, y, Inf) else smaller_(x, y)
}
larger <- function(x, y, na.rm = FALSE) {
  stopifnot(length(x) == 1, length(y) == 1,
            is.numeric(x) | is.logical(x), is.numeric(y) | is.logical(y))
  if (na.rm && (is.na(x) || is.na(y))) rm_na(x, y, -Inf) else larger_(x, y)
}
```
Just as `min()` and `max()` can act on vectors, we can implement this easily for our new functions. As shown in the book, we also have to set the `init` parameter to the identity value.
```{r}
r_smaller <- function(xs, na.rm = TRUE) {
  Reduce(function(x, y) smaller(x, y, na.rm = na.rm), xs, init = Inf)
}
# some tests
r_smaller(c(1:3, 4:(-1)))
r_smaller(NA, na.rm = TRUE)
r_smaller(numeric())

r_larger <- function(xs, na.rm = TRUE) {
  Reduce(function(x, y) larger(x, y, na.rm = na.rm), xs, init = -Inf)
}
# some tests
r_larger(c(1:3, 4:1))
r_larger(NA, na.rm = TRUE)
r_larger(numeric())
```
We can also create vectorised versions as shown in the book. To avoid being too verbose, we only show the `smaller()` case:
```{r}
v_smaller1 <- function(x, y, na.rm = FALSE) {
  stopifnot(length(x) == length(y),
            is.numeric(x) | is.logical(x), is.numeric(y) | is.logical(y))
  if (length(x) == 0) return(numeric())
  simplify2array(
    Map(function(x, y) smaller(x, y, na.rm = na.rm), x, y)
  )
}
v_smaller2 <- function(x, y, na.rm = FALSE) {
  stopifnot(length(x) == length(y),
            is.numeric(x) | is.logical(x), is.numeric(y) | is.logical(y))
  vapply(seq_along(x), function(i) smaller(x[i], y[i], na.rm = na.rm),
         numeric(1))
}
# Both versions give the same results
v_smaller1(1:10, c(2, 1, 4, 3, 6, 5, 8, 7, 10, 9))
v_smaller2(1:10, c(2, 1, 4, 3, 6, 5, 8, 7, 10, 9))
v_smaller1(numeric(), numeric())
v_smaller2(numeric(), numeric())
v_smaller1(c(1, NA), c(1, NA), na.rm = FALSE)
v_smaller2(c(1, NA), c(1, NA), na.rm = FALSE)
v_smaller1(NA, NA)
v_smaller2(NA, NA)
```
Of course, we can also copy and paste the rest from the textbook to solve the last part of the exercise:
```{r}
row_min <- function(x, na.rm = FALSE) {
  apply(x, 1, r_smaller, na.rm = na.rm)
}
col_min <- function(x, na.rm = FALSE) {
  apply(x, 2, r_smaller, na.rm = na.rm)
}
arr_min <- function(x, dim, na.rm = FALSE) {
  apply(x, dim, r_smaller, na.rm = na.rm)
}
```
2. __[Q]{.Q}__: Create a table that has _and_, _or_, _add_, _multiply_, _smaller_, and
_larger_ in the columns and _binary operator_, _reducing variant_,
_vectorised variant_, and _array variants_ in the rows.
a) Fill in the cells with the names of base R functions that perform each of
the roles.
a) Compare the names and arguments of the existing R functions. How
consistent are they? How could you improve them?
a) Complete the matrix by implementing any missing functions.
__[A]{.solved}__: In the following table we list the requested base R functions that we are aware of:
| | and | or | add | multiply | smaller | larger |
|------------|----------|----------|----------|----------|----------|----------|
| binary | `&&` | `||` | | | | |
| reducing | `all` | `any` | `sum` | `prod` | `min` | `max` |
| vectorised | `&` | `|` | `+` | `*` | `pmin` | `pmax` |
| array | | | | | | |
Notice that we were relatively strict about the _binary_ row. Since the _vectorised_ and _reducing_ versions are more general than the _binary_ versions, we could have used them twice. However, this doesn't seem to be the intention of this exercise.
The last part of this exercise can be solved by copying and pasting from the book and from the last exercise for the _binary_ row, and by combining `apply()` with the _reducing_ versions for the _array_ row. We think the array functions just need a dimension and an `na.rm` argument. We don't know how we would name them, but something like `sum_array(1, na.rm = TRUE)` could be ok.
The second part of the exercise is hard to solve completely. In our opinion there are two important points: the behaviour for special inputs like `NA`, `NaN`, `NULL` and zero-length atomics should be consistent, and all versions should have an `na.rm` argument for which the functions also behave consistently. In the following table we show the output of `` `f`(x, 1) ``, where `f` is the function in the first column and `x` is the special input in the header (the named functions also have an `na.rm` argument, which is `FALSE` by default). The order of the arguments is important because of lazy evaluation.
| | `NA` | `NaN` | `NULL` | `logical(0)` | `integer(0)` |
|---------|----------|----------|--------------|--------------|--------------|
| `&&` | `NA` | `NA` | `error` | `NA` | `NA` |
| `all` | `NA` | `NA` | `TRUE` | `TRUE` | `TRUE` |
| `&` | `NA` | `NA` | `error` | `logical(0)` | `logical(0)` |
| `||` | `TRUE` | `TRUE` | `error` | `TRUE` | `TRUE` |
| `any` | `TRUE` | `TRUE` | `TRUE` | `TRUE` | `TRUE` |
| `|` | `TRUE` | `TRUE` | `error` | `logical(0)` | `logical(0)` |
| `sum` | `NA` | `NaN` | `1` | `1` | `1` |
| `+` | `NA` | `NaN` | `numeric(0)` | `numeric(0)` | `numeric(0)` |
| `prod` | `NA` | `NaN` | `1` | `1` | `1` |
| `*` | `NA` | `NaN` | `numeric(0)` | `numeric(0)` | `numeric(0)` |
| `min` | `NA` | `NaN` | `1` | `1` | `1` |
| `pmin` | `NA` | `NaN` | `numeric(0)` | `numeric(0)` | `numeric(0)` |
| `max` | `NA` | `NaN` | `1` | `1` | `1` |
| `pmax` | `NA` | `NaN` | `numeric(0)` | `numeric(0)` | `numeric(0)` |
We can see that the vectorised and reducing numerical functions are all consistent. The logical functions are not: the first three return `NA` for `NA` and `NaN`, while the fourth to sixth return `TRUE`. It would be more consistent if the first three returned `FALSE`, or if all of them returned `NA` and gained an extra `na.rm` argument. It seems relatively hard to find a simple rule for all cases, and especially the different behaviour for `NULL` is confusing. Another good way to organise the functions would be to differentiate between "numerical" and "logical" operators first, and then between binary, reducing and vectorised variants, as below (we left out the last column, which is redundant because of coercion, as intended):
| `` `f(x,1)` `` | `NA` | `NaN` | `NULL` | `logical(0)` |
|----------------|----------|----------|--------------|--------------|
| `&&` | `NA` | `NA` | error | `NA` |
| `||` | `TRUE` | `TRUE` | error | `TRUE` |
| `all` | `NA` | `NA` | `TRUE` | `TRUE` |
| `any` | `TRUE` | `TRUE` | `TRUE` | `TRUE` |
| `&` | `NA` | `NA` | error | `logical(0)` |
| `|` | `TRUE` | `TRUE` | error | `logical(0)` |
| `sum` | `NA` | `NaN` | 1 | 1 |
| `prod` | `NA` | `NaN` | 1 | 1 |
| `min` | `NA` | `NaN` | 1 | 1 |
| `max` | `NA` | `NaN` | 1 | 1 |
| `+` | `NA` | `NaN` | `numeric(0)` | `numeric(0)` |
| `*` | `NA` | `NaN` | `numeric(0)` | `numeric(0)` |
| `pmin` | `NA` | `NaN` | `numeric(0)` | `numeric(0)` |
| `pmax` | `NA` | `NaN` | `numeric(0)` | `numeric(0)` |
The other point is the naming conventions. We think they are clear, but it could be useful to provide the missing binary operators and name them, for example, `++`, `**`, `<>`, `><` to be consistent.
3. __[Q]{.Q}__: How does `paste()` fit into this structure? What is the scalar binary
function that underlies `paste()`? What are the `sep` and `collapse`
arguments to `paste()` equivalent to? Are there any `paste` variants
that don't have existing R implementations?
__[A]{.solved}__: `paste()` behaves like a mix of both. If you supply only length-one arguments, it behaves like a reducing function:
```{r}
paste("a", "b", sep = "")
paste("a", "b","", sep = "")
```
If you supply at least one argument of length greater than one, it behaves like a vectorised function:
```{r}
paste(1:3)
paste(1:3, 1:2)
paste(1:3, 1:2, 1)
```
We think it should be possible to implement a new `paste()` starting from
```{r}
p_binary <- function(x, y = "") {
  stopifnot(length(x) == 1, length(y) == 1)
  paste0(x, y)
}
```
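From this binary building block, a reducing variant follows the same `Reduce()` pattern as in the book (the name `p_reduce` and the `""` identity are our own choices; `p_binary()` is repeated so the chunk is self-contained):

```{r}
p_binary <- function(x, y = "") {
  stopifnot(length(x) == 1, length(y) == 1)
  paste0(x, y)
}

# Reduce the binary function over all inputs; "" is the identity for pasting.
p_reduce <- function(...) Reduce(p_binary, list(...), init = "")
p_reduce("a", "b", "c")  # "abc"
```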
The `sep` argument is equivalent to appending `sep` to every `...` input supplied to `paste()` except the last one, and then binding these results together. In relations:
```
paste(n1, n2, ...,nm , sep = sep) <=>
paste0(paste0(n1, sep), paste(n2, n3, ..., nm, sep = sep)) <=>
paste0(paste0(n1, sep), paste0(n2, sep), ..., paste0(nn, sep), paste0(nm))
```
We can check this for scalar and non-scalar input:
```{r}
# scalar:
paste("a", "b", "c", sep = "_")
paste0(paste0("a", "_"), paste("b", "c", sep = "_"))
paste0(paste0("a", "_"), paste0("b", "_"), paste0("c"))
# non scalar
paste(1:2, "b", "c", sep = "_")
paste0(paste0(1:2, "_"), paste("b", "c", sep = "_"))
paste0(paste0(1:2, "_"), paste0("b", "_"), paste0("c"))
```
`collapse` just binds the outputs for non-scalar input together with the `collapse` input.
In relations:
```
for input A1, ..., An, where Ai = (a1i, ..., ami),
paste(A1, A2, ..., An, collapse = collapse)
<=>
paste0(
  paste0(paste(a11, a12, ..., a1n), collapse),
  paste0(paste(a21, a22, ..., a2n), collapse),
  ...,
  paste0(paste(a(m-1)1, a(m-1)2, ..., a(m-1)n), collapse),
  paste(am1, am2, ..., amn)
)
```
One can see this easily from examples:
```{r}
paste(1:5, 1:5, 6, sep = "", collapse = "_x_")
paste(1, 2, 3, 4, collapse = "_x_")
paste(1:2, 1:2, 2:3, 3:4, collapse = "_x_")
```
We think the only `paste()` variant that is not implemented in base R is an array version.
At least we are not aware of something like `row_paste()` or `paste_apply()`.
## Function Factories
### Closures
__[Q1]{.Q}__: Why are functions created by other functions called closures?
__[A]{.solved}__: As stated in the book:
> because they enclose the environment of the parent function and can access all its variables.
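A minimal illustration (our own toy example): the function returned by `make_counter()` keeps access to the variable `i` defined in its enclosing (parent) environment, even after `make_counter()` has returned:

```{r}
make_counter <- function() {
  i <- 0
  function() {
    i <<- i + 1  # modifies i in the enclosing environment
    i
  }
}

count <- make_counter()
count()  # 1
count()  # 2
```

Each call to `make_counter()` creates a fresh environment, so independent counters do not interfere with each other.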
__[Q2]{.Q}__: What does the following statistical function do? What would be a better
name for it? (The existing name is a bit of a hint.)
```{r}
bc <- function(lambda) {
if (lambda == 0) {
function(x) log(x)
} else {
function(x) (x ^ lambda - 1) / lambda
}
}
```
__[A]{.solved}__: It is the logarithm when `lambda` equals zero and `(x ^ lambda - 1) / lambda` otherwise. A better name might be `box_cox_transformation` (one-parametric); you can read about it [here](https://en.wikipedia.org/wiki/Power_transform).
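A quick sanity check (our own) that the factory returns the expected transformations:

```{r}
bc <- function(lambda) {
  if (lambda == 0) {
    function(x) log(x)
  } else {
    function(x) (x ^ lambda - 1) / lambda
  }
}

x <- c(0.5, 1, 2)
all.equal(bc(0)(x), log(x))            # lambda == 0: plain logarithm
all.equal(bc(2)(x), (x ^ 2 - 1) / 2)   # lambda == 2: power transform
```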
__[Q3]{.Q}__: What does `approxfun()` do? What does it return?
__[A]{.solved}__: `approxfun()` takes a set of 2-dimensional data points plus some extra specifications as arguments and returns a piecewise linear or constant interpolation function (defined on the range of the given x-values, by default).
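For example, a small demonstration of both interpolation methods:

```{r}
# Linear interpolation (the default method).
f <- approxfun(x = c(1, 2, 3), y = c(0, 10, 5))
f(1.5)  # halfway between (1, 0) and (2, 10): 5
f(4)    # outside the range of the given x-values: NA by default

# Constant interpolation: takes the y-value of the node to the left.
g <- approxfun(c(1, 2, 3), c(0, 10, 5), method = "constant")
g(1.5)  # 0
```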
__[Q4]{.Q}__: What does `ecdf()` do? What does it return?
__[A]{.solved}__: "ecdf" means empirical cumulative distribution function. For a numeric vector, `ecdf()` returns the corresponding distribution function (of class `ecdf`, which inherits from class `stepfun`). You can describe its behaviour in 2 steps: in the first part of its body, the `(x, y)` pairs for the nodes of the distribution function are calculated; in the second part these pairs are given to `approxfun()`.
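For example, the returned step function reports the proportion of observations less than or equal to its argument:

```{r}
F <- ecdf(c(1, 2, 3, 4))
F(2)      # 0.5: half of the observations are <= 2
F(2.5)    # still 0.5: the function is a right-continuous step function
F(4)      # 1: all observations are <= 4
class(F)  # "ecdf" "stepfun" "function"
```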
__[Q5]{.Q}__: Create a function that creates functions that compute the ith [central moment](http://en.wikipedia.org/wiki/Central_moment) of a numeric vector. You can test it by running the following code:
```{r, eval = FALSE}
m1 <- moment(1)
m2 <- moment(2)
x <- runif(100)
stopifnot(all.equal(m1(x), 0))
stopifnot(all.equal(m2(x), var(x) * 99 / 100))
```
__[A]{.solved}__: For a discrete formulation, look [here](http://www.r-tutor.com/elementary-statistics/numerical-measures/moment):
```{r, eval = FALSE}
moment <- function(i) {
function(x) sum((x - mean(x)) ^ i) / length(x)
}
```
__[Q6]{.Q}__: Create a function `pick()` that takes an index, `i`, as an argument and returns a function with an argument `x` that subsets `x` with `i`.
```{r, eval = FALSE}
lapply(mtcars, pick(5))
# should do the same as this
lapply(mtcars, function(x) x[[5]])
```
__[A]{.solved}__:
```{r, eval = FALSE}
pick <- function(i) {
function(x) x[[i]]
}
stopifnot(identical(lapply(mtcars, pick(5)),
lapply(mtcars, function(x) x[[5]]))
)
```
### Case study: numerical integration
__[Q1]{.Q}__: Instead of creating individual functions (e.g. `midpoint()`,
`trapezoid()`, `simpson()`, etc.), we could store them in a list. If we
did that, how would that change the code? Can you create the list of
functions from a list of coefficients for the Newton-Cotes formulae?
__[A]{.started}__:
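A possible sketch (our own implementation, not necessarily the one intended by the chapter): a factory `newton_cotes()` turns a coefficient vector into an integration rule, so the list of functions follows directly from a list of coefficients via `lapply()`.

```{r}
# A closed Newton-Cotes rule evaluates f at equally spaced nodes on
# [a, b] and weights the values by the rule's coefficient vector.
newton_cotes <- function(coef) {
  force(coef)
  n <- length(coef) - 1  # number of subintervals
  function(f, a, b) {
    points <- seq(a, b, length.out = n + 1)
    (b - a) / sum(coef) * sum(f(points) * coef)
  }
}

# Standard closed Newton-Cotes weights.
coefs <- list(
  trapezoid = c(1, 1),
  simpson   = c(1, 4, 1),
  boole     = c(7, 32, 12, 32, 7)
)
rules <- lapply(coefs, newton_cotes)

rules$simpson(sin, 0, pi)  # close to the exact value 2
```

Storing the rules in a list means the integration code only has to change where a rule is looked up, e.g. `rules[[rule_name]](f, a, b)` instead of calling `simpson(f, a, b)` directly.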
__[Q2]{.Q}__: The trade-off between integration rules is that more complex rules are
slower to compute, but need fewer pieces. For `sin()` in the range
[0, $\pi$], determine the number of pieces needed so that each rule will
be equally accurate. Illustrate your results with a graph. How do they
change for different functions? `sin(1 / x^2)` is particularly challenging.
__[A]{.started}__:
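A possible starting point (our own sketch; `composite()` and `pieces_needed()` are hypothetical helpers, and `midpoint()`/`simpson()` are defined here for self-containment): wrap a rule into a composite version over `n` pieces and double `n` until the result is within a tolerance of the known value.

```{r}
# Apply an integration rule piecewise over n subintervals of [a, b].
composite <- function(f, a, b, n, rule) {
  points <- seq(a, b, length.out = n + 1)
  total <- 0
  for (i in seq_len(n)) {
    total <- total + rule(f, points[i], points[i + 1])
  }
  total
}

midpoint <- function(f, a, b) (b - a) * f((a + b) / 2)
simpson  <- function(f, a, b) {
  (b - a) / 6 * (f(a) + 4 * f((a + b) / 2) + f(b))
}

# Smallest power-of-two number of pieces reaching the given accuracy.
pieces_needed <- function(f, a, b, rule, exact, tol = 1e-6) {
  n <- 1
  while (abs(composite(f, a, b, n, rule) - exact) > tol) n <- n * 2
  n
}

# The exact value of the integral of sin over [0, pi] is 2;
# Simpson typically needs far fewer pieces than midpoint.
pieces_needed(sin, 0, pi, midpoint, exact = 2)
pieces_needed(sin, 0, pi, simpson, exact = 2)
```

Plotting the resulting piece counts per rule (e.g. on a log scale) would then illustrate the trade-off; for a highly oscillatory integrand such as `sin(1 / x^2)` all rules need far more pieces near zero.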
## S3
__[Q1]{.Q}__: The most important S3 objects in base R are factors, data frames, difftimes, and date/times (Dates, POSIXct, POSIXlt). You've already seen the attributes and base type that factors are built on. What base types and attributes are the others built on?
__[A]{.started}__: TODO: Add answer for difftime.
**data frame:** Data frames are built on top of (named) lists. Together with the `row.names` attribute and after setting the class to "data.frame", we get a classical data frame:
```{r}
df_build <- structure(list(1:2, 3:4),
names = c("a", "b"),
row.names = 1:2,
class = "data.frame")
df_classic <- data.frame(a = 1:2, b = 3:4)
identical(df_build, df_classic)
```
**date/times (Dates, POSIXct, POSIXlt):** A Date is just a double with the class attribute set to "Date":
```{r}
date_build <- structure(0, class = "Date")
date_classic <- as.Date("1970-01-01")
identical(date_build, date_classic)
```
POSIXct is a class for date/times that inherits from POSIXt and is built on doubles as well. The only attribute is `tzone` (for the timezone):
```{r}
POSIXct_build <- structure(1, class = c("POSIXct", "POSIXt"), tzone = "CET")
POSIXct_classic <- .POSIXct(1, tz = "CET") # note that tz's default is NULL
identical(POSIXct_build, POSIXct_classic)
```
POSIXlt is another date/time class that inherits from POSIXt. It is built on top of a named list and a tzone attribute. Differences between POSIXct and POSIXlt are described in `?DateTimeClasses`.
```{r}
POSIXlt_build <- structure(list(sec = 30,
min = 30L,
hour = 14L,
mday = 1L,
mon = 0L,
year = 70L,
wday = 4L,
yday = 0L,
isdst = 0L,
zone = "CET",
gmtoff = 3600L),
tzone = c("", "CET", "CEST"),
class = c("POSIXlt", "POSIXt"))
POSIXlt_classic <- as.POSIXlt(.POSIXct(13.5 * 3600 + 30))
identical(POSIXlt_build, POSIXlt_classic)
```
1. __[Q]{.Q}__: Draw a Venn diagram illustrating the relationships between functions, generics, and methods.
__[A]{.started}__: Functions don't have to be generics or methods, but both of the latter are functions. It is also possible for a function to be both a method and a generic at the same time, which seems relatively awkward and is definitely not recommended; see also `?pryr::ftype`:
> This function figures out whether the input function is a regular/primitive/internal function, a internal/S3/S4 generic, or a S3/S4/RC method. This is function is slightly simplified as it's possible for a method from one class to be a generic for another class, but that seems like such a bad idea that hopefully no one has done it.
2. __[Q]{.Q}__: Write a constructor for `difftime` objects. What base type are they built on? What attributes do they use? You'll need to consult the documentation, read some code, and perform some experiments.
__[A]{.solved}__: Our constructor should be named `new_class_name()`, have one argument for its base type and one for each attribute, and check the types of these arguments as well.
```{r}
new_difftime <- function(x, units = "auto") {
stopifnot(is.double(x), is.character(units))
structure(x, units = units, class = "difftime")
}
```
However, since the following result prints awkwardly,
```{r}
new_difftime(3)
```
we take a little more "inspiration" from the original `difftime()` function and make the corresponding changes. Basically we need to implement the logic for the `units` attribute in case it is set to `"auto"`, and convert the value of the underlying double from seconds to the corresponding unit, as commented in the following:
```{r}
new_difftime <- function(x, units = "auto") {
stopifnot(is.double(x), is.character(units))
# case units == "auto":
if (units == "auto")
# when all time differences are NA, units should be "secs"
units <- if (all(is.na(x))) {
"secs"
} else {
# otherwise set the units according to the minimal time difference
x_min <- min(abs(x), na.rm = TRUE)
if (!is.finite(x_min) || x_min < 60) {
"secs"
} else if (x_min < 3600) {
"mins"
} else if (x_min < 86400) {
"hours"
} else {
"days"
}
}
# we rescale the underlying double according to the units
x <- switch(units,
            secs = x,
            mins = x / 60,
            hours = x / 3600,
            days = x / 86400,
            weeks = x / (7 * 86400))
structure(x, units = units, class = "difftime")
}