-
Notifications
You must be signed in to change notification settings - Fork 0
/
tibble_subsetting.Rmd
94 lines (84 loc) · 2.08 KB
/
tibble_subsetting.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
---
title: "tibble subsetting"
author: "KevinO'Brien"
date: "22 September 2018"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(tibble)
```
### Subsetting
Tibbles are quite strict about subsetting. [ always returns another tibble. Contrast this with a data frame: sometimes [ returns a data frame and sometimes it just returns a vector:
```{r}
df1 <- data.frame(x = 1:3, y = 3:1)
class(df1[, 1:2])
#> [1] "data.frame"
class(df1[, 1])
#> [1] "integer"
```
```{r}
df2 <- tibble(x = 1:3, y = 3:1)
class(df2[, 1:2])
#> [1] "tbl_df" "tbl" "data.frame"
class(df2[, 1])
#> [1] "tbl_df" "tbl" "data.frame"
```
To extract a single column use [[ or $:
```{r}
class(df2[[1]])
#> [1] "integer"
class(df2$x)
#> [1] "integer"
```
Tibbles are also stricter with $. Tibbles never do partial matching, and will throw a warning and return NULL if the column does not exist:
```{r}
df <- data.frame(abc = 1)
df$a
#> [1] 1
df2 <- tibble(abc = 1)
df2$a
#> Warning: Unknown or uninitialised column: 'a'.
#> NULL
```
As of version 1.4.1, tibbles no longer ignore the drop argument:
```{r}
data.frame(a = 1:3)[, "a", drop = TRUE]
#> [1] 1 2 3
tibble(a = 1:3)[, "a", drop = TRUE]
#> [1] 1 2 3
```
### Recycling
When constructing a tibble, only values of length 1 are recycled. The first column with length different to one determines the number of rows in the tibble, conflicts lead to an error. This also extends to tibbles with zero rows, which is sometimes important for programming:
```{r}
tibble(a = 1, b = 1:3)
#> # A tibble: 3 x 2
#> a b
#> <dbl> <int>
#> 1 1.00 1
#> 2 1.00 2
#> 3 1.00 3
```
```{r}
tibble(a = 1:3, b = 1)
#> # A tibble: 3 x 2
#> a b
#> <int> <dbl>
#> 1 1 1.00
#> 2 2 1.00
#> 3 3 1.00
```
<pre><code>
tibble(a = 1:3, c = 1:2)
#> Error: Column `c` must be length 1 or 3, not 2
</code></pre>
```{r}
tibble(a = 1, b = integer())
#> # A tibble: 0 x 2
#> # ... with 2 variables: a <dbl>, b <int>
```
```{r}
tibble(a = integer(), b = 1)
#> # A tibble: 0 x 2
#> # ... with 2 variables: a <int>, b <dbl>
```