# Taking control of qualitative colors in ggplot {#qualitative-colors}
```{r include = FALSE}
source("common.R")
```
<!--Original content: https://stat545.com/block019_enforce-color-scheme.html-->
## Load packages and prepare the Gapminder data
Load the ggplot2, dplyr, and gapminder packages, then bring in the usual Gapminder data but drop Oceania, which has only two countries.
We also reorder the levels of the `country` factor by population, largest first, and then sort the rows by year and country. Why? In the bubble plots below, we don't want large countries to hide small countries, so the large circles must be drawn first. This is a case where, sadly, the row order of the data truly affects the visual output.
```{r message = FALSE, warning = FALSE}
library(ggplot2)
library(dplyr)
library(gapminder)
jdat <- gapminder %>%
  filter(continent != "Oceania") %>%
  droplevels() %>%
  mutate(country = reorder(country, -1 * pop)) %>%
  arrange(year, country)
```
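To see what the reordering does, here is a minimal sketch that rebuilds the same data under a hypothetical name, `jdat_check`, and inspects the result. One thing to keep in mind: `reorder()` summarizes its second argument with `mean` by default, so the levels end up ordered by each country's mean population across all years, not by population in any single year.

```{r}
library(dplyr)
library(gapminder)

# jdat_check is a hypothetical name; the pipeline matches the chunk above
jdat_check <- gapminder %>%
  filter(continent != "Oceania") %>%
  droplevels() %>%
  mutate(country = reorder(country, -1 * pop)) %>%
  arrange(year, country)

# The most populous countries (by mean population) come first in the levels
head(levels(jdat_check$country))

# Because the rows are sorted by year and then country, the big countries
# also come first within a year, so their circles are drawn first
jdat_check %>%
  filter(year == 2007) %>%
  select(country, pop) %>%
  head()
```

If small countries still end up hidden in a plot, checking the row order this way is a reasonable first step.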
## Take control of the size and color of points
Let's use ggplot to move towards the classic Gapminder bubble chart. Crawl then walk then run.
First, make a simple scatterplot for a single year.
```{r scatterplot}
j_year <- 2007
q <- jdat %>%
  filter(year == j_year) %>%
  ggplot(aes(x = gdpPercap, y = lifeExp)) +
  scale_x_log10(limits = c(230, 63000))
q + geom_point()
```
Take control of the plotting symbol, its size, and its color. Use obnoxious settings so that success versus failure is completely obvious. Now is not the time for the delicate operation of inserting your fancy color scheme. Be bold!
```{r scatterplot-obnoxious-points}
## do I have control of size and fill color? YES!
q + geom_point(pch = 21, size = 8, fill = I("darkorchid1"))
```
## Circle area = population
We want the size of the circle to reflect population. I have two complaints about my first attempt: the circles are still too small for my taste and I don't want the size legend. So in my second attempt, I suppress the legend with `show.legend = FALSE` and I widen the range of sizes by explicitly setting the range for the scale that maps `pop` into circle size.
```{r scatterplot-population-area, fig.show='hold', out.width='50%'}
q + geom_point(aes(size = pop), pch = 21)
(r <- q +
  geom_point(aes(size = pop), pch = 21, show.legend = FALSE) +
  scale_size_continuous(range = c(1, 40)))
```
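A side note on the heading's claim. As documented in ggplot2, `scale_size_continuous()` scales the area of the point rather than its radius, but with `range = c(1, 40)` the smallest population is mapped to size 1, not 0, so area is not strictly proportional to population. If strict proportionality matters, `scale_size_area()` is one alternative, since it maps a value of 0 to a size of 0. The sketch below is self-contained and uses a hypothetical object name, `p_area`; it is a variation, not the approach used in the rest of this chapter.

```{r}
library(ggplot2)
library(dplyr)
library(gapminder)

# p_area is a hypothetical name; max_size = 40 mirrors the upper end
# of the range used above
p_area <- gapminder %>%
  filter(year == 2007, continent != "Oceania") %>%
  ggplot(aes(x = gdpPercap, y = lifeExp)) +
  geom_point(aes(size = pop), pch = 21, show.legend = FALSE) +
  scale_x_log10(limits = c(230, 63000)) +
  scale_size_area(max_size = 40)
p_area
```

The trade-off is that the smallest countries become very small circles, which is presumably why a range with a nonzero lower bound is convenient here.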
## Circle fill color determined by a factor
Now I use `aes()` to map a factor to fill color. For the moment, I settle for the `continent` factor and for the automatic color scheme. I also facet by continent. Why? Because it will be helpful below for checking my progress on using my custom color scheme. In the custom scheme, all the countries in, say, Europe are some shade of green, so if a continent's facet shows circles of many colors, I'll know something's wrong.
```{r scatterplot-continent-fill, fig.show='hold', out.width='50%'}
(r <- r + facet_wrap(~ continent) + ylim(c(39, 87)))
r + aes(fill = continent)
```
## Get the color scheme for the countries
The gapminder package comes with color palettes for the continents and the individual countries. For example, here's the country color scheme:
```{r echo = FALSE, fig.cap = "From [https://github.com/jennybc/gapminder](https://github.com/jennybc/gapminder)"}
knitr::include_graphics("img/gapminder-color-scheme-ggplot2.png")
```
```{r}
str(country_colors)
head(country_colors)
```
`country_colors` is a named character vector, with one element per country, holding the RGB hex strings encoding the color scheme.
__Note:__ The order of `country_colors` is not alphabetical. The countries are actually sorted by size (in which particular year, I don't recall) within continent, reflecting the logic by which the scheme was created. No problem. Ideally, nothing in your analysis should depend on row order, although that's not always possible in reality.
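Because the vector is named, position does not matter when looking things up. A small sketch of two checks that seem useful before plotting (the two countries are arbitrary picks for illustration):

```{r}
library(gapminder)

# Index by name rather than by position
country_colors[c("Canada", "Rwanda")]

# Does every country in the data have an entry in the color scheme?
all(levels(gapminder$country) %in% names(country_colors))
```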
## Prepare the color scheme for use with ggplot
In a [grammar of graphics](https://vita.had.co.nz/papers/layered-grammar.html), a __scale__ controls the mapping from a variable in the data to an aesthetic [@wickham2010]. So far we've let the coloring / filling scale be determined automatically by `ggplot2`. But to use our custom color scheme, we need to take control of the mapping of the `country` factor into fill color in `geom_point()`.
We will use `scale_fill_manual()`, a member of a family of functions for customizing discrete scales. The main argument is `values =`, which is a vector of aesthetic values -- fill colors, in our case. If this vector has names, they will be consulted during the mapping. This is incredibly useful! It is why `country_colors` is **exactly that**: a vector of colors named by country. This saves us from any worry about the order of the levels of the `country` factor, the row order of the data, or exactly which countries are being plotted.
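The name-based matching can be seen on a toy example. Everything in this sketch (`toy`, `toy_colors`, `p_toy`, `toy_fills`) is hypothetical and made up for illustration. The palette is deliberately given in a different order from the factor levels, and `layer_data()` is used to pull out the fill that ggplot2 actually assigned to each row.

```{r}
library(ggplot2)

# Hypothetical toy data: three groups, one point each
toy <- data.frame(x = 1:3, y = 1:3, grp = factor(c("a", "b", "c")))

# Named palette, deliberately NOT in the order of the factor levels
toy_colors <- c(c = "blue", a = "red", b = "green")

p_toy <- ggplot(toy, aes(x = x, y = y, fill = grp)) +
  geom_point(pch = 21, size = 8) +
  scale_fill_manual(values = toy_colors)
p_toy

# layer_data() returns the computed aesthetics for a layer; the fill
# column shows which color each row received
toy_fills <- layer_data(p_toy)$fill
data.frame(grp = toy$grp, fill = toy_fills)
```

Each group should receive the color stored under its own name, regardless of where that entry sits in `toy_colors`.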
## Make the ggplot bubble chart
This is deceptively simple at this point. Like many things, it looks really easy once we've figured everything out! The last two bits we add are to use `aes()` to specify that `country` should be mapped to fill color and to use `scale_fill_manual()` to specify our custom color scheme.
```{r scatterplot-country-fill}
r + aes(fill = country) + scale_fill_manual(values = country_colors)
```
## All together now
The complete code to make the plot.
```{r scatterplot-country-fill-final}
j_year <- 2007
jdat %>%
  filter(year == j_year) %>%
  ggplot(aes(x = gdpPercap, y = lifeExp, fill = country)) +
  scale_fill_manual(values = country_colors) +
  facet_wrap(~ continent) +
  geom_point(aes(size = pop), pch = 21, show.legend = FALSE) +
  scale_x_log10(limits = c(230, 63000)) +
  scale_size_continuous(range = c(1, 40)) +
  ylim(c(39, 87))
```
```{r links, child="links.md"}
```