update
jirlong committed Apr 8, 2024
1 parent b325318 commit 88edff9
Showing 49 changed files with 3,548 additions and 3,116 deletions.
24 changes: 15 additions & 9 deletions 404.html
@@ -23,7 +23,7 @@
<meta name="author" content="HSIEH, JI-LUNG" />


<meta name="date" content="2024-03-31" />
<meta name="date" content="2024-04-08" />

<meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="apple-mobile-web-app-capable" content="yes" />
@@ -310,17 +310,23 @@
</ul></li>
<li class="chapter" data-level="7" data-path="joindata.html"><a href="joindata.html"><i class="fa fa-check"></i><b>7</b> Data manipulation: Join data</a>
<ul>
<li class="chapter" data-level="7.1" data-path="joindata.html"><a href="joindata.html#moi"><i class="fa fa-check"></i><b>7.1</b> Reading Ministry of the Interior population statistics</a>
<li class="chapter" data-level="7.1" data-path="joindata.html"><a href="joindata.html#simple"><i class="fa fa-check"></i><b>7.1</b> A Simple Example: Joining Two Data Frames</a>
<ul>
<li class="chapter" data-level="7.1.1" data-path="joindata.html"><a href="joindata.html#moi_plan"><i class="fa fa-check"></i><b>7.1.1</b> Analysis plan</a></li>
<li class="chapter" data-level="7.1.2" data-path="joindata.html"><a href="joindata.html#moi_clean"><i class="fa fa-check"></i><b>7.1.2</b> Cleaning the data</a></li>
<li class="chapter" data-level="7.1.3" data-path="joindata.html"><a href="joindata.html#moi_rowwise"><i class="fa fa-check"></i><b>7.1.3</b> Advanced: using <code>rowwise()</code></a></li>
<li class="chapter" data-level="7.1.4" data-path="joindata.html"><a href="joindata.html#moi_vil"><i class="fa fa-check"></i><b>7.1.4</b> Building township/district and village indicators</a></li>
<li class="chapter" data-level="7.1.5" data-path="joindata.html"><a href="joindata.html#moi_visual_popul"><i class="fa fa-check"></i><b>7.1.5</b> Visualization test (elderly population x ever-married population)</a></li>
<li class="chapter" data-level="7.1.1" data-path="joindata.html"><a href="joindata.html#left_join-right_join"><i class="fa fa-check"></i><b>7.1.1</b> <code>left_join()</code> &amp; <code>right_join()</code></a></li>
<li class="chapter" data-level="7.1.2" data-path="joindata.html"><a href="joindata.html#inner_join-and-full_join"><i class="fa fa-check"></i><b>7.1.2</b> <code>inner_join()</code> and <code>full_join()</code></a></li>
<li class="chapter" data-level="7.1.3" data-path="joindata.html"><a href="joindata.html#join-by-different-keys"><i class="fa fa-check"></i><b>7.1.3</b> <code>join()</code> by different keys</a></li>
</ul></li>
<li class="chapter" data-level="7.2" data-path="joindata.html"><a href="joindata.html#referendum"><i class="fa fa-check"></i><b>7.2</b> Reading the referendum data</a>
<li class="chapter" data-level="7.2" data-path="joindata.html"><a href="joindata.html#moi"><i class="fa fa-check"></i><b>7.2</b> Reading Ministry of the Interior population statistics</a>
<ul>
<li class="chapter" data-level="7.2.1" data-path="joindata.html"><a href="joindata.html#moi_join_ref"><i class="fa fa-check"></i><b>7.2.1</b> Joining the referendum data and visualizing</a></li>
<li class="chapter" data-level="7.2.1" data-path="joindata.html"><a href="joindata.html#moi_plan"><i class="fa fa-check"></i><b>7.2.1</b> Analysis plan</a></li>
<li class="chapter" data-level="7.2.2" data-path="joindata.html"><a href="joindata.html#moi_clean"><i class="fa fa-check"></i><b>7.2.2</b> Cleaning the data</a></li>
<li class="chapter" data-level="7.2.3" data-path="joindata.html"><a href="joindata.html#moi_rowwise"><i class="fa fa-check"></i><b>7.2.3</b> Advanced: using <code>rowwise()</code></a></li>
<li class="chapter" data-level="7.2.4" data-path="joindata.html"><a href="joindata.html#moi_vil"><i class="fa fa-check"></i><b>7.2.4</b> Building township/district and village indicators</a></li>
<li class="chapter" data-level="7.2.5" data-path="joindata.html"><a href="joindata.html#moi_visual_popul"><i class="fa fa-check"></i><b>7.2.5</b> Visualization test (elderly population x ever-married population)</a></li>
</ul></li>
<li class="chapter" data-level="7.3" data-path="joindata.html"><a href="joindata.html#referendum"><i class="fa fa-check"></i><b>7.3</b> Reading the referendum data</a>
<ul>
<li class="chapter" data-level="7.3.1" data-path="joindata.html"><a href="joindata.html#moi_join_ref"><i class="fa fa-check"></i><b>7.3.1</b> Joining the referendum data and visualizing</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="8" data-path="categorical.html"><a href="categorical.html"><i class="fa fa-check"></i><b>8</b> Categorical Data Analysis</a>
127 changes: 125 additions & 2 deletions R23_join_twdemo_ref.md
@@ -2,6 +2,129 @@

# Data manipulation: Join data {#joindata}

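In data processing and analysis we often need to combine data from different sources, for example merging referendum results with the Ministry of the Interior's population statistics, or merging National Health Insurance data with long-term care data, household data, and income data from the tax authorities. The syntax for this kind of merging originates in database management systems, where tables are commonly combined with different types of join. When two or more tables need to be combined in a database, we typically use a left join, right join, inner join, or full join. Each one matches rows on the values of the specified columns but combines the two tables differently, so the choice depends on what the analysis needs.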

## A Simple Example: Joining Two Data Frames {#simple}

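The following simple example illustrates joins. **`posts`** holds information about posts and has two columns, **`pid`** (post ID) and **`pcontent`** (post content); **`comments`** holds information about comments and has two columns, **`pid`** (the ID of the post the comment belongs to) and **`ccontent`** (comment content).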


```r
library(dplyr)  # provides tibble() and the *_join() functions

posts <- tibble(pid = c("p01", "p02", "p03"), pcontent = c("post01", "post02", "post03"))
comments <- tibble(pid = c("p02", "p03", "p04"), ccontent = c("comment01", "comment02", "comment03"))
posts
```

```{.output}
## # A tibble: 3 × 2
## pid pcontent
## <chr> <chr>
## 1 p01 post01
## 2 p02 post02
## 3 p03 post03
```

```r
comments
```

```{.output}
## # A tibble: 3 × 2
## pid ccontent
## <chr> <chr>
## 1 p02 comment01
## 2 p03 comment02
## 3 p04 comment03
```

### `left_join()` & `right_join()`

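- `left_join()` keeps every row of the left table and attaches the matching rows from the right table. Where the right table has no match, the added columns are filled with `NA`. In the example below, `posts` is the left table, `comments` is the right table, and `pid` is the join key, so the result contains every row of `posts` with the matching `comments` rows merged in.
- `right_join()` is the mirror image of `left_join()`: it keeps every row of the right table and attaches the matching rows from the left table. Where the left table has no match, the added columns are filled with `NA`.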


```r
left_join(posts, comments)
```

```{.output}
## # A tibble: 3 × 3
## pid pcontent ccontent
## <chr> <chr> <chr>
## 1 p01 post01 <NA>
## 2 p02 post02 comment01
## 3 p03 post03 comment02
```

```r
right_join(posts, comments)
```

```{.output}
## # A tibble: 3 × 3
## pid pcontent ccontent
## <chr> <chr> <chr>
## 1 p02 post02 comment01
## 2 p03 post03 comment02
## 3 p04 <NA> comment03
```

```r
right_join(comments, posts)
```

```{.output}
## # A tibble: 3 × 3
## pid ccontent pcontent
## <chr> <chr> <chr>
## 1 p02 comment01 post02
## 2 p03 comment02 post03
## 3 p01 <NA> post01
```
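
As a minimal sketch of how the two functions mirror each other, the chunk below rebuilds the two toy tables and checks that `right_join(posts, comments)` keeps the same rows as `left_join(comments, posts)`, with only the column order differing. The names `a`, `b`, and `same_rows` are introduced here purely for illustration, and `by = "pid"` is spelled out only to make the key explicit.

```r
library(dplyr)

posts <- tibble(pid = c("p01", "p02", "p03"), pcontent = c("post01", "post02", "post03"))
comments <- tibble(pid = c("p02", "p03", "p04"), ccontent = c("comment01", "comment02", "comment03"))

# Both calls keep every comment and attach the matching post where one exists.
a <- right_join(posts, comments, by = "pid")
b <- left_join(comments, posts, by = "pid")

# Put the columns in the same order before comparing.
b <- select(b, pid, pcontent, ccontent)
same_rows <- isTRUE(all.equal(as.data.frame(a), as.data.frame(b)))
same_rows
```

In practice this means one of the two is usually enough: choosing which table goes first decides which rows are kept.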

### `inner_join()` and `full_join()`

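- `inner_join()` returns only the rows that satisfy the join condition in both tables. In other words, it keeps the rows that have a counterpart on both sides; a row that lacks a match in either table does not appear in the result.
- `full_join()` returns all rows from both tables and fills the missing values with `NA`. Rows that match in both tables are merged together; rows that exist in only one table are kept, with `NA` in the columns that come from the other table.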


```r
inner_join(posts, comments)
```

```{.output}
## # A tibble: 2 × 3
## pid pcontent ccontent
## <chr> <chr> <chr>
## 1 p02 post02 comment01
## 2 p03 post03 comment02
```

```r
full_join(posts, comments)
```

```{.output}
## # A tibble: 4 × 3
## pid pcontent ccontent
## <chr> <chr> <chr>
## 1 p01 post01 <NA>
## 2 p02 post02 comment01
## 3 p03 post03 comment02
## 4 p04 <NA> comment03
```
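
One way to compare the four joins at a glance is to count the rows each one returns. The following is a small sketch on the same toy tables; the vector name `n_rows` is an illustrative choice, not part of the original material.

```r
library(dplyr)

posts <- tibble(pid = c("p01", "p02", "p03"), pcontent = c("post01", "post02", "post03"))
comments <- tibble(pid = c("p02", "p03", "p04"), ccontent = c("comment01", "comment02", "comment03"))

# Number of rows returned by each join type on the key `pid`.
n_rows <- c(
  left  = nrow(left_join(posts, comments, by = "pid")),
  right = nrow(right_join(posts, comments, by = "pid")),
  inner = nrow(inner_join(posts, comments, by = "pid")),
  full  = nrow(full_join(posts, comments, by = "pid"))
)
n_rows
```

The counts agree with the outputs shown above: three rows for the left and right joins, two for the inner join (only `p02` and `p03` appear in both tables), and four for the full join.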

### `join()` by different keys

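Here **`posts`** is the left table and **`comments`** is the right table, and the join uses the **`postid`** and **`pid`** columns. Inspecting the data shows that **`postid`** in the left table corresponds to **`pid`** in the right table, so we join on that relationship with `by=c("postid"="pid")`, which pairs the left table's `postid` with the right table's `pid`.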


```r
posts <- tibble(postid=c("p01", "p02", "p03"), pcontent=c("post01", "post02", "post03"))
comments <- tibble(pid=c("p02", "p03", "p04"), ccontent=c("comment01", "comment02", "comment03"))
result <- left_join(posts, comments, by = c("postid" = "pid"))
```
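
The chunk above stores the result without printing it. As a hedged sketch, the code below repeats the join and inspects the column names: the key column keeps the left table's name, `postid`. It also shows the `join_by()` spelling of the same join, which assumes dplyr 1.1.0 or later; `result2` is a name introduced here for illustration.

```r
library(dplyr)

posts <- tibble(postid = c("p01", "p02", "p03"), pcontent = c("post01", "post02", "post03"))
comments <- tibble(pid = c("p02", "p03", "p04"), ccontent = c("comment01", "comment02", "comment03"))

# Named-vector form from the text: left-table key = right-table key.
result <- left_join(posts, comments, by = c("postid" = "pid"))

# Equivalent join_by() form (assumption: dplyr >= 1.1.0 is installed).
result2 <- left_join(posts, comments, by = join_by(postid == pid))

# The key column keeps the left table's name.
names(result)
```

Either form works; the named vector is the one used in this chapter, while `join_by()` reads closer to the condition it expresses.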

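## Reading Ministry of the Interior population statistics {#moi}

First use `slice(-1)` to drop the first row, which holds the Chinese column names. Next, the county/city and township/district (`site_id`) and the village (`village`) are currently two separate variables. Because different townships can contain villages with the same name, paste `site_id` and `village` together to form the full village name `vname`.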
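
The two cleaning steps can be sketched on a tiny stand-in table. Everything in `raw` is hypothetical placeholder data, not the real Ministry of the Interior file, and `paste0()` is just one reasonable way to glue the two columns together; the chapter's own chunk is not shown in this diff and may differ.

```r
library(dplyr)

# Hypothetical stand-in for the raw file: the first data row repeats the column labels.
raw <- tibble(
  site_id = c("district_label", "CityA-Dist1", "CityB-Dist2"),
  village = c("village_label", "Heping", "Heping")
)

cleaned <- raw %>%
  slice(-1) %>%                             # drop the label row
  mutate(vname = paste0(site_id, village))  # full, unambiguous village name

cleaned$vname
```

The two placeholder villages share the name `Heping`, yet their `vname` values differ, which is the reason for combining the columns before any join on village.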
@@ -209,7 +332,7 @@ town_stat %>%
geom_text(aes(label=site_id, vjust=1.3, size=4), family = "Heiti TC Light") + th
```

<img src="R23_join_twdemo_ref_files/figure-html/unnamed-chunk-8-1.png" width="672" />
<img src="R23_join_twdemo_ref_files/figure-html/unnamed-chunk-12-1.png" width="672" />

```r
# geom_jitter(alpha = 0.3)
@@ -287,4 +410,4 @@ town_stat %>%
theme_light()
```

<img src="R23_join_twdemo_ref_files/figure-html/unnamed-chunk-11-1.png" width="672" />
<img src="R23_join_twdemo_ref_files/figure-html/unnamed-chunk-15-1.png" width="672" />
2 changes: 1 addition & 1 deletion R41_Crawler.md
@@ -110,4 +110,4 @@ Chrome DevTools is a web development tool built by Google that can help

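6. Click the JSON request to inspect the details of its Request and Response, including the URL, Headers, Request Payload, and Response.

7. In the Response tab you can see the content of the JSON data. If the JSON data is large, right-click it and choose "Save Response As\..." to save it to a local file.
7. In the Response tab you can see the content of the JSON data. If the JSON data is large, right-click it and choose "Save Response As..." to save it to a local file.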
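
Once a response has been saved locally, it can be loaded into R. The sketch below assumes the `jsonlite` package is installed and uses a temporary file with made-up content in place of a real saved response; the field names `id` and `title` are hypothetical.

```r
library(jsonlite)

# Stand-in for a file saved from the Response tab (hypothetical content).
tmp <- tempfile(fileext = ".json")
writeLines('[{"id": 1, "title": "a"}, {"id": 2, "title": "b"}]', tmp)

# fromJSON() accepts a file path and simplifies an array of records to a data frame.
dat <- fromJSON(tmp)
dat
```

For a real file, replace `tmp` with the path chosen in the save dialog.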
Binary file modified V01_Learning_ggplot_files/figure-html/unnamed-chunk-25-1.png
Binary file modified V01_Learning_ggplot_files/figure-html/unnamed-chunk-26-1.png
Binary file modified Z2_Exploring_data_Visually_files/figure-html/eda-boxplot-1.png