forked from NBISweden/workshop-r
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathlab_loops.Rmd
304 lines (242 loc) · 8.95 KB
/
lab_loops.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
---
title: "Loops in R"
subtitle: "R Programming Foundation for Life Scientists"
output:
bookdown::html_document2:
highlight: textmate
toc: true
toc_float:
collapsed: true
smooth_scroll: true
print: false
toc_depth: 4
number_sections: true
df_print: default
code_folding: none
self_contained: false
keep_md: false
encoding: 'UTF-8'
css: "assets/lab.css"
include:
after_body: assets/footer-lab.html
---
```{r,child="assets/header-lab.Rmd"}
```
# Introduction
In programming languages loop structures, either with or without conditions, are used to repeat commands over multiple entities. For
and while loops as well as if-else statements are also often used in R, but not as often as in many other programming languages. The reason for this is that many needs of the loops are addressed using vectorization or via apply functions.
This means that we can multiply all values in a vector in R by two by calling
```{r}
vec.a <- c(1, 2, 3, 4)
vec.a * 2
```
In many other and languages as well as in R, you can also create this with a loop instead
```{r}
for (i in vec.a) {
vec.a[i] <- vec.a[i] * 2
}
vec.a
```
This is far less efficient and not by any means easier to type and we hence tend to avoid loops when possible.
<details>
<summary><strong>Extra: Demonstration of time lapse comparison between vectorization and loop</strong></summary>
Let us compare the time of execution of the vectorized version (vector with 10,000 elements):
```{r for.loop.avoid.timing, echo=T}
vec <- c(1:1e6)
ptm <- proc.time()
vec <- vec + 1
proc.time() - ptm # vectorized
```
--
to the loop version:
```{r for.loop.avoid.timing2, echo=T}
vec <- c(1:1e6)
ptm <- proc.time()
for (i in vec) {
vec[i] <- vec[i] + 1
}
proc.time() - ptm # for-loop
```
</details>
</br>
After this exercise you should know:
- What are the most common loop structures in R
- Some common alternatives to using loops in R
- How one can convert a short script to a function.
- Use that new function in R.
# Exercises
1. Create a 100000 by 10 matrix with the numbers 1:1000000. Make a for-loop that calculates the sum for each row of the matrix. Verify that your results are consistent with what you obtain with the `apply()` function to calculate row sums as well as with the built-in `rowSums()` function. These functions were discussed in the lecture **Elements of the programming language - part 2**.
```{r,accordion=TRUE,results='markup'}
X <- matrix(1:1000000, nrow = 100000, ncol = 10)
for.sum <- vector()
# Note that this loop is much faster if you outside the loop create an empty vector of the right size.
# rwmeans <- vector('integer', 100000)
for (i in 1:nrow(X)) {
for.sum[i] <- sum(X[i,])
}
head(for.sum)
app.sum <- apply(X, MARGIN = 1, sum)
head(app.sum)
rowSums.sum <- rowSums(X)
head(rowSums.sum)
identical(for.sum, app.sum)
identical(for.sum, rowSums.sum)
identical(for.sum, as.integer(rowSums.sum))
```
2. As you saw in the lecture, another common loop structure that is used is the while loop, which functions much like a for loop, but will only run as long as a test condition is TRUE. Modify your for loop from exercise 1 and make it into a while loop.
<details>
<summary><strong>Tip?</strong></summary>
You can use the number of rows in the matrix as a condition for the while loop.
</details>
```{r,accordion=TRUE}
x <- 1
while.sum <- vector("integer", 100000)
while (x < 100000) {
while.sum[x] <- sum(X[x,])
x <- x + 1
}
head(while.sum)
```
3. Create a data frame with two numeric and one character vector. Write a loop that loops over the columns and reports the sum of the column values if it is numeric and the total number of characters if it is a character vector.
<details>
<summary><strong>Tip?</strong></summary>
To count number of characters, you can use `nchar` function.
</details>
- Do it first using only if clauses
```{r,accordion=TRUE}
vector1 <- 1:10
vector2 <- c("Odd", "Loop", letters[1:8])
vector3 <- rnorm(10, sd = 10)
dfr1 <- data.frame(vector1, vector2, vector3, stringsAsFactors = FALSE)
sum.vec <- vector()
for(i in 1:ncol(dfr1)) {
if (is.numeric(dfr1[,i])) {
sum.vec[i] <- sum(dfr1[,i])
}
if (is.character(dfr1[,i])) {
sum.vec[i] <- sum(nchar(dfr1[,i]))
}
}
sum.vec
```
- And then modify the loop to use if-else clauses
```{r, accordion=TRUE}
sum.vec <- vector()
for(i in 1:ncol(dfr1)) {
if (is.numeric(dfr1[,i])) {
sum.vec[i] <- sum(dfr1[,i])
} else {
sum.vec[i] <- sum(nchar(dfr1[,i]))
}
}
sum.vec
```
4. In question 3 you generated a loop to go over a data frame. Try to convert this code to a function in R. The function should take a single data frame name as argument.
```{r,accordion=TRUE}
dfr.info <- function(dfr) {
sum.vec <- vector()
for (i in 1:ncol(dfr)) {
if (is.numeric(dfr[,i])) {
sum.vec[i] <- sum(dfr[,i])
} else {
sum.vec[i] <- sum(nchar(dfr[,i]))
}
}
sum.vec
}
#Execute the function
dfr.info(dfr1)
```
5. **Extra exercise-loops**. A variation of exercise 3: Create a data frame with three columns: one numeric, one logical, and one character vector. Write a loop that loops over the columns and reports the sum of the column values if it is numeric, the total number of `TRUE`s when is logical and the total number of characters if it is a character vector.
<details>
<summary><strong>Tips?</strong></summary>
To count number of characters, you can use `nchar` function. <br/>
To count number of `TRUE` values, you can use `sum` function.<br/>
Inside the loop, you may want to use the if-else-if structure.<br/>
</details>
```{r,accordion=TRUE}
vector1 <- 1:10
vector2 <- c("Odd", "Loop", letters[1:8])
vector3 <- c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE)
dfr2 <- data.frame(vector1, vector2, vector3, stringsAsFactors = FALSE)
sum.vec <- vector()
for(i in 1:ncol(dfr2)) {
if (is.numeric(dfr2[,i])) {
sum.vec[i] <- sum(dfr2[,i])
} else if (is.logical(dfr2[,i])) {
sum.vec[i] <- sum(dfr2[,i])
} else {
sum.vec[i]<-sum(nchar(dfr2[,i]))
}
}
sum.vec
```
6. **Extra exercise-functions**. Create a function that will convert time in hours to time in minutes. The function should take a single argument, a numeric value of hours, and return a numeric value of minutes. Add an if clause inside of the function that will stop running the function and give an error if the hours provided are negative.
<details>
<summary><strong>Tip?</strong></summary>
You can read the documentation of the `stop()` function by typing `?stop` in the R console.
</details>
```{r,accordion=TRUE, error=T}
hours_to_mins <- function(hours) {
if (hours < 0) {
stop("Hours cannot be negative")
}
minutes <- hours * 60
return(minutes)
}
hours_to_mins(3.2)
#Now test it with a negative hour value
hours_to_mins(-3.26)
```
6. In all loops that we tried out we have created the variable where the output is saved outside the loop. Why is this?
</br>
</br>
# Extra material
Do you want to expand on loops, if-else clauses, and functions? Here a bit more extra material!
<details>
<summary><strong>Extra material: The switch function</strong></summary>
If-else clauses operate on logical values. What if we want to take decisions based on non-logical values? Well, if-else will still work by evaluating a number of comparisons, but we can also use **switch**:
```{r switch, echo=T}
switch.demo <- function(x) {
switch(class(x),
logical = cat('logical\n'),
numeric = cat('Numeric\n'),
factor = cat('Factor\n'),
cat('Undefined\n')
)
}
switch.demo(x=TRUE)
switch.demo(x=15)
switch.demo(x=factor('a'))
switch.demo(data.frame())
```
</details>
<details>
<summary><strong>Extra material: The ellipsis argument trick</strong></summary>
What if the authors of, e.g. plot.something wrapper forgot about the `...`?
```{r fns.3dots.trick, echo=T, fig.height = 5, fig.width = 5}
my.plot <- function(x, y) { # Passing downstrem
plot(x, y, las=1, cex.axis=.8, ...)
}
formals(my.plot) <- c(formals(my.plot), alist(... = ))
my.plot(1, 1, col='red', pch=19)
```
</details>
<details>
<summary><strong>Extra material: Infix notations</strong></summary>
Operators like `+`, `-` or `*` are using the so-called **infix** functions, where the function name is between arguments. We can define our own:
```{r infix, echo=T}
`%p%` <- function(x, y) {
paste(x,y)
}
'a' %p% 'b'
```
</details>
<!--
7. <i class="fas fa-exclamation-circle"></i> **Advanced:** At the lecture an approach to calculate factorials were implemented using recursion (function calling itself). Here we instead will have a go at generating Fibonacci numbers. A fibonacci number is part of a series of number with the following properties:
The first two numbers in the Fibonacci sequence are either 1 and 1, or 0 and 1, depending on the chosen starting point of the sequence, and each subsequent number is the sum of the previous two. Hence:
`0, 1, 1, 2, 3, 5, 8, 13, 21, ...`
or
`1, 1, 2, 3, 5, 8, 13, 21, ...`
Try to generate such a series using a recursive approach.
-->