Skip to content

Commit

Permalink
update
Browse files Browse the repository at this point in the history
  • Loading branch information
jirlong committed Mar 31, 2024
1 parent d2717b4 commit cfa5332
Show file tree
Hide file tree
Showing 8 changed files with 574 additions and 11 deletions.
Binary file modified .DS_Store
Binary file not shown.
55 changes: 51 additions & 4 deletions R23p_join_twdemo_ref.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -5,8 +5,6 @@ date: "`r Sys.Date()`"
output: html_document
---



```{r include=FALSE}
knitr::opts_chunk$set(echo=T, cache = T, warning = F, message = FALSE, class.output = "output")
library(knitr)
Expand All @@ -24,6 +22,10 @@ library(tidyverse)

```{r read csv}
raw <- read_csv("data/opendata107Y030.csv") %>%
slice(-1) %>%
mutate(vname = str_c(site_id, village)) %>%
select(vname, everything())
# YOUR CODE SHOULD BE HERE
Expand Down Expand Up @@ -76,9 +78,12 @@ tidy_data <- raw %>%
```{r}
tidy_data <- raw %>%
pivot_longer(names_to = "key", values_to = "value", cols = 6:ncol(.)) %>%
# separate(key, into = c("married", "age", "gender"), sep = "_") %>% View
mutate(key = str_replace(key, "_age", "")) %>%
mutate(key = str_replace(key, "100up", "100_110")) %>%
mutate(key = str_replace(key, "15down", "0_15")) %>%
separate(key, into = c("married", "ageLower", "ageUpper", "gender"), sep = "_") %>%
mutate(value = as.numeric(value))
# YOUR CODE SHOULD BE HERE
Expand Down Expand Up @@ -119,8 +124,50 @@ raw %>%
```{r}
village_stat <- tidy_data %>%
mutate(ageLower = as.numeric(ageLower),
ageUpper = as.numeric(ageUpper)) %>%
filter(ageLower >= 20) %>%
group_by(vname) %>%
summarise(
legalPopulation = sum(value),
elderSum = sum(value[ageLower >=65]),
womenSum = sum(value[gender=="f"]),
marriedSum = sum(value[married %in% c("married", "widowed")])
) %>%
ungroup() %>%
mutate(
elderPerc = elderSum / legalPopulation,
marriedPerc = marriedSum / legalPopulation,
womenPerc = womenSum / legalPopulation
)
# YOUR CODE SHOULD BE HERE
town_stat <- village_stat %>%
left_join(raw %>% select(vname, site_id), by="vname") %>%
mutate(site_id = str_replace(site_id, "三民一|三民二", "三民區")) %>%
mutate(site_id = str_replace(site_id, "鳳山一|鳳山二", "鳳山區")) %>%
group_by(site_id) %>%
summarise(
legalPopulation = sum(legalPopulation),
elderSum = sum(elderSum),
marriedSum = sum(marriedSum),
womenSum = sum(womenSum)
) %>%
ungroup()
town_stat <- town_stat %>%
mutate(
elderPerc = elderSum / legalPopulation,
marriedPerc = marriedSum / legalPopulation,
womenPerc = womenSum / legalPopulation
)
# %>%
# ggplot() + aes(womenPerc, elderPerc) +
# geom_point(alpha=0.5) +
# geom_text(aes(label = site_id), family = "Heiti TC Light")
```

測試
Expand Down Expand Up @@ -173,7 +220,7 @@ town_stat %>%
首先,先讀取資料並重新命名每個變項。由於我們要連結公投資料和前面的內政部人口統計資料,所以要注意兩筆資料間是否有共通的key(資料庫稱為鍵值)。`town_stat`的是以`site_id`鄉鎮市區名為主鍵,所以公投資料這邊也產生一個同名的鄉鎮市區變項`site_id`

```{r}
ref10 <- read_csv("data/ref10.csv") %>%
ref10 <- read_csv("data/ref10.csv")
# YOUR CODE SHOULD BE HERE
Expand Down
7 changes: 4 additions & 3 deletions R25_tidy_temoral_features.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ R 裡面,常用的時間物件有 **`POSIXct`** 與 **`POSIXlt`** 兩種,我
knitr::opts_chunk$set(echo=T, cache = T, warning = F, message = FALSE, class.output = "output")
library(knitr)
library(tidyverse)
options(scipen = 999)
```

## char-to-timestamp
Expand Down Expand Up @@ -133,10 +134,10 @@ clean %>%
clean %>%
mutate(m = month(ptime)) %>%
count(m) %>%
ggplot() + aes(m, n) +
geom_col()
ggplot() + aes(x=m, n) +
geom_col() +
scale_x_continuous(breaks = 1:12)
```

## Freq-by-date (good)
Expand Down
11 changes: 7 additions & 4 deletions V01p_Learning_ggplot.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -406,6 +406,7 @@ NW %>%
### 使用gghighlight套件

```{r message=FALSE, warning=FALSE}
# install.packages("gghighlight")
library(gghighlight)
NW %>%
ggplot() + aes(year, Net_Worth, color = Category) +
Expand All @@ -427,7 +428,9 @@ NW %>%
gghighlight(Category %in% c("65-74", "35-44")) +
scale_color_manual(
limits=c("65-74", "35-44"), # original chart group
values=c("gold", "skyblue")) + # map to color
values=c("gold", "skyblue"),
na.value = "black"
) + # map to color
theme_minimal()
```

Expand All @@ -442,7 +445,7 @@ county %>%
ggplot() + aes(county, people_total, fill=group) %>%
geom_col() +
scale_fill_manual(values=c("highlight"="Khaki", "other"="lightgrey")) +
guides(fill="none") +
guides(fill="none") +
coord_flip() +
theme_minimal() + th
```
Expand All @@ -455,7 +458,7 @@ county %>%
ggplot() + aes(county, people_total) %>%
geom_col(fill="deeppink") +
gghighlight(county %in% c("新竹縣", "新竹市")) +
guides(fill="none") +
coord_flip() +
# guides(fill="none") +
coord_flip() +
theme_minimal() + th
```
80 changes: 80 additions & 0 deletions V21_Geospatial.Rmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,80 @@
```{r include=FALSE}
knitr::opts_chunk$set(cache = T, warning = F, message = F,
class.output = "output", out.width='100%',
fig.asp = 0.5, fig.align = 'center')
library(tidyverse)
```

# GEOSPATIAL {#geospatial}

地圖是一種用來展示地理空間信息的視覺化工具,可以幫助我們更好地了解和分析地理現象。常見的地圖種類通常可以分為兩類:區域圖和點位圖。

1. 區域圖(Choropleth Map)是通過將地理區域劃分為幾個區域,然後用不同的顏色、陰影或圖案等方式來表示這些區域的某種屬性或數量。這種地圖通常用於展示國家、省份、城市等區域的人口、經濟、地形、氣候等相關數據。區域圖能夠直觀地展示地理現象在不同區域之間的差異和變化,並有助於我們進行比較和分析。
2. 點位圖(Dot Density Map)則是通過在地圖上用點或符號來表示某種地理空間現象的分布或密度。例如,可以用紅點表示城市、綠點表示森林、藍點表示湖泊等等。這種地圖通常用於展示地理現象在空間上的分布和密度,並能夠直觀地展示相對密度和稀疏程度。

**區域圖的數據形式:**有兩種基本數據模型:向量(Vector)和網格(Raster)。

- 向量數據模型使用點、線、多邊形等基本要素來描述地理空間現象。例如,可以用一個線段來表示一條河流,用一個多邊形來表示一個國家或城市的邊界等。向量數據模型具有比較強的邏輯性和表達能力,特別適合描述較簡單的地理現象。
- 網格數據模型則是將地理空間區域劃分為一個個大小相等的格子,每個格子都有一個固定的數值,用來表示這個區域的某種屬性,例如溫度、濕度、高程等等。網格數據模型適合描述分布比較連續和具有變化的地理現象。

通常繪製地理資訊地圖的時候,會需要因應你要繪製的地域去下載地圖空間數據檔案(例如.shape或geojson檔等)。如台灣的就可以去[社會經濟資料服務平台 (moi.gov.tw)](https://segis.moi.gov.tw/STAT/Web/Platform/QueryInterface/STAT_QueryInterface.aspx?Type=1)下載。但也有一些套件內部就包含這些地理空間數據,例如下一節的例子rworldmap套件本身就有世界地圖。或者可以嘗試ggmap或rgooglemap等第三方服務(參考簡介:[Map Visualization in R · Data Science and R](https://mpmendespt.github.io/Map-visualization.html)

## World Map

```{r}
library(readxl)
library(rworldmap) # for drawing rworldmap
```

```{r}
rawdata <- read_excel("data/WORLD-MACHE_Gender_6.8.15.xls", "Sheet1", col_names=T)
mapdata <- rawdata[,c(3, 6:24)]
```

### Bind data to map data

這段程式碼是在將自己的數據**`mapdata`****`rworldmap`**世界地圖數據進行結合。

首先,使用 **`joinCountryData2Map()`** 函數,將自己的數據和世界地圖數據按照國家的 ISO3 代碼進行連接,生成一張新的地圖。其中, **`mapdata`** 是指世界地圖數據, **`joinCode`** 參數指定連接時使用的 ISO3 代碼(亦即你預先知道你自己的資料中有ISO3國家代碼)。 **`nameJoinColumn`** 參數則用於指定自己數據中與國家對應的欄位名稱為**`iso3`**

還有其他的**`joinCode`**如「"ISO2","ISO3","FIPS","NAME", "UN" = numeric codes」等可參見該套件的說明[rworldmap package - RDocumentation](https://www.rdocumentation.org/packages/rworldmap/versions/1.3-6)

```{r}
# join your data with the world map data
myMap <- joinCountryData2Map(mapdata, joinCode = "ISO3", nameJoinColumn = "iso3")
myMap$matleave_13
```

### Drawing Map

**`mapCountryData()`** 函數用於將數據繪製在地圖上。其中, **`myMap`** 是已經連接過的世界地圖數據和自己的數據,包含了各國的地理空間信息和相關的數據資訊。 **`nameColumnToPlot`** 指定要顯示在地圖上的數據欄位為**`matleave_13`**,也就是 2013 年的產假長度。 **`catMethod`** 參數是決定視覺化時的數據分類是類別或連續,**`categorical`**表示將數據分成幾個等級來展示在地圖上。

```{r}
mapCountryData(myMap
, nameColumnToPlot="matleave_13"
, catMethod = "categorical"
)
```

### Drawing map by specific colors

```{r}
# self-defined colors
colors <- c("#FF8000", "#A9D0F5", "#58ACFA", "#0080FF", "#084B8A")
mapCountryData(myMap
, nameColumnToPlot="matleave_13"
, catMethod = "categorical"
, colourPalette = colors
, addLegend="FALSE"
)
```

::: practice
### Practice. Drawing map for every years

1. 繪製自1995至2013年每年的地圖並觀察其上的變化。
2. 繪製的時候請嘗試使用`par()`來把每年的地圖繪製在同一張圖上,怎麼做?
3. 你能觀察出變化來嗎?可否透過顏色的調整來凸顯變化?你的策略是什麼?
:::
Loading

0 comments on commit cfa5332

Please sign in to comment.