---
title: "Lecture 9 Writing functions"
author: "David A McAllister"
date: "Monday 31st October, 2016"
output:
  slidy_presentation: default
---
## Objective of section
* To see that writing a function is just like working interactively in R
## "Don't repeat yourself"
### Andy Hunt and Dave Thomas
## Use functions to avoid repeating yourself
* Reduces chances of error
* Makes code easier to change
* Makes code more readable (once you get your eye in)
* Makes code more transferable
## Problem
* Data: a 10 x 5 matrix of mortality counts, 10 hospitals by 5 timepoints
* Question: do any of the hospital/timepoint combinations have a high mortality?
* The expected count in each cell is 100
* We want to run a statistical test on each cell

## Our data
```{r}
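# Fix the seed so the simulated data are reproducible
set.seed(1234)
# Simulate Poisson counts (mean 120) for 10 hospitals at 5 timepoints
hospitals <- matrix(rpois(50, 120), nrow = 10,
                    dimnames = list(paste("hosp", 1:10), paste("t", 1:5)))
# Set three individual cells to missing: (2,1), (3,2) and (4,3)
hospitals[cbind(c(2, 3, 4), c(1, 2, 3))] <- NA
hospitals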
```
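The `cbind` call in the chunk above builds a two-column matrix of (row, column) pairs, so it sets exactly three cells to `NA` rather than a 3 x 3 block. The sketch below shows the difference on a toy matrix; `toy`, `cell_index`, `toy_pairs` and `toy_block` are made-up names used only for illustration.

```{r}
# Toy 4 x 4 matrix, used only to illustrate the two indexing styles
toy <- matrix(1:16, nrow = 4)

# A two-column matrix index picks individual (row, column) pairs
cell_index <- cbind(c(2, 3, 4), c(1, 2, 3))
toy_pairs <- toy
toy_pairs[cell_index] <- NA
sum(is.na(toy_pairs))   # 3 cells: (2,1), (3,2) and (4,3)

# Two separate vectors pick every combination of those rows and columns
toy_block <- toy
toy_block[c(2, 3, 4), c(1, 2, 3)] <- NA
sum(is.na(toy_block))   # 9 cells: a 3 x 3 block
```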
## Simplify the problem
```{r}
# Start with a single cell; drop = FALSE keeps it as a 1 x 1 matrix
hospital <- hospitals[1, 1, drop = FALSE]
```
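With `drop = FALSE` the result is still a 1 x 1 matrix with its row and column names; without it, R returns a bare number. A minimal sketch on a small stand-in matrix (`small`, `kept` and `dropped` are made-up names, and the counts are illustrative):

```{r}
# Small stand-in for the hospitals matrix (illustrative values)
small <- matrix(c(120, 98, 131, 105), nrow = 2,
                dimnames = list(c("hosp 1", "hosp 2"), c("t 1", "t 2")))

kept <- small[1, 1, drop = FALSE]   # still a matrix
dropped <- small[1, 1]              # a plain number

dim(kept)         # 1 1
dim(dropped)      # NULL
dimnames(kept)    # row and column names are kept
as.vector(kept)   # as.vector() strips the matrix structure
```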
## Work interactively
```{r}
# hospital is a 1 x 1 matrix, so convert it to a plain vector before testing
# poisson.test(x = hospital, r = 100)
poisson.test(x = as.vector(hospital), r = 100)
```
## ?poisson.test
### Value

A list with class "htest" containing the following components:

component | explanation
---------- | ------------
statistic | the number of events (in the first sample if there are two.)
parameter | the corresponding expected count
p.value | the p-value of the test.
conf.int | a confidence interval for the rate or rate ratio.
estimate | the estimated rate or rate ratio.
null.value | the rate or rate ratio under the null, r.
alternative | a character string describing the alternative hypothesis.
method | the character string "Exact Poisson test" or "Comparison of Poisson rates" as appropriate.
data.name | a character string giving the names of the data.
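
These components can be inspected directly at the console. A short sketch, using an illustrative count of 130 rather than a value from the hospital data (`result` is a made-up name):

```{r}
# Illustrative count (not from the lecture data), tested against a rate of 100
result <- poisson.test(130, r = 100)

class(result)          # "htest"
names(result)          # the component names listed in the help page
result$p.value         # one component, pulled out with $
result[["conf.int"]]   # the same idea with [[ ]]
```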
## So, using subsetting to extract the p-value
```{r}
poisson.test(x = as.vector(hospital), r = 100)$p.value
```
## Wrap it up in a function
``` {r, warning=TRUE}
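MyPois <- function(x, ...) {
  # Drop any matrix structure so poisson.test() gets a plain number
  x <- as.vector(x)
  # Return only the p-value of the test against an expected rate of 100
  poisson.test(x, r = 100)$p.value
}
# test it
MyPois(hospitals[1, 1])
MyPois(hospitals[1, 2])
# This cell is NA, so the call below would stop with an error
# MyPois(hospitals[2, 1])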
```
## Re-write the function
``` {r}
MyPois <- function(x, ...) {
  x <- as.vector(x)
  # Missing counts cannot be tested, so return NA early
  if (is.na(x)) return(NA)
  poisson.test(x, r = 100)$p.value
}
# test it again
MyPois(hospitals[1, 1])
MyPois(hospitals[1, 2])
MyPois(hospitals[2, 1])  # now returns NA instead of failing
```
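One possible refinement, offered as a sketch rather than as part of the lecture: make the expected rate an argument instead of hard-coding 100, and return `NA_real_` so the result is always numeric. The name `MyPois2` and the argument `expected` are made up for this illustration, and the count of 130 is illustrative.

```{r}
# Hypothetical variant of MyPois: the expected rate is an argument
MyPois2 <- function(x, expected = 100) {
  x <- as.vector(x)
  # Missing counts cannot be tested, so return a numeric NA
  if (is.na(x)) return(NA_real_)
  poisson.test(x, r = expected)$p.value
}

MyPois2(130)                   # tested against the default rate of 100
MyPois2(130, expected = 120)   # tested against a different rate
MyPois2(NA)                    # NA instead of an error
```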
## Apply it to the dataset
```{r}
# MARGIN = 1:2 applies MyPois to every cell (each row and column combination)
hosp_p_values <- apply(hospitals, 1:2, MyPois)
hosp_p_values
```
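`apply` with `MARGIN = 1:2` calls the function once per cell and returns a matrix of the same shape, whereas `MARGIN = 1` calls it once per row. A sketch on a toy matrix, with a made-up function `double_it` standing in for `MyPois` (`counts`, `by_cell` and `by_row` are also made-up names):

```{r}
# Toy matrix standing in for the hospitals data
counts <- matrix(c(1, 2, NA, 4, 5, 6), nrow = 2,
                 dimnames = list(c("r1", "r2"), c("c1", "c2", "c3")))

# Made-up cell-level function that passes missing values through
double_it <- function(x) {
  if (is.na(x)) return(NA)
  2 * x
}

by_cell <- apply(counts, 1:2, double_it)   # one call per cell
by_row <- apply(counts, 1, sum)            # one call per row, for comparison

by_cell
by_row
```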
## Flag which tests are significant (TRUE/FALSE)
```{r}
hosp_diff <- hosp_p_values < 0.05
hosp_diff
```
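To list which hospital and timepoint combinations were flagged, one option is `which(..., arr.ind = TRUE)`. The sketch below uses a made-up matrix of p-values (`p_demo`), since the real values depend on the simulated data; `flagged` and `flagged_cells` are also made-up names.

```{r}
# Made-up p-values, including one missing cell
p_demo <- matrix(c(0.20, 0.01, NA, 0.60, 0.03, 0.50), nrow = 2,
                 dimnames = list(c("hosp 1", "hosp 2"), c("t 1", "t 2", "t 3")))

flagged <- p_demo < 0.05                          # NA stays NA in the comparison
flagged_cells <- which(flagged, arr.ind = TRUE)   # row and column of each TRUE cell
flagged_cells
sum(flagged, na.rm = TRUE)                        # number of flagged cells
```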
## This has pulled together
- Knowledge of
    + object types
    + indices and subsetting
    + how functions work
## Function gotchas
* Inside a function, objects don't print unless you call `print`
* The value of the last expression is returned (by default)
* A function may be called with a vector or with a single value
* Changing object types causes problems
* More advanced: the environment in which a function is created can matter
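
A few of these gotchas can be seen in a short sketch. The functions `quiet`, `loud` and `last_value` are made up for illustration.

```{r}
# 1. Inside a function, a bare object name does not print
quiet <- function() {
  x <- 1:3
  x                # evaluated but not printed
  invisible(NULL)
}
loud <- function() {
  x <- 1:3
  print(x)         # printed explicitly
  invisible(NULL)
}
quiet()
loud()

# 2. The value of the last evaluated expression is returned
last_value <- function(a) {
  a + 1
  a * 10           # this is what the caller gets
}
last_value(2)

# 3. A check written for one value behaves differently on a vector:
#    is.na() returns one result per element
is.na(c(1, NA, 3))

# 4. Object types can change silently: one string turns numbers into text
class(c(1, 2, 3))
class(c(1, 2, "3"))
```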