Tessa Week 3 #8

Open
wants to merge 2 commits into
base: main
80 changes: 78 additions & 2 deletions 01_api-funct-iter.Rmd → 01_api-funct-iter-TessaBirt.Rmd
@@ -70,15 +70,19 @@ Hooray, you have now successfully pulled in an online data set using an API!
**Using the code above as a starting point, pull in monthly NPS-wide visitation data for the years 1980, 1999, and 2018.**

```{r}
raw_data_1980 <- httr::GET(url = "https://irmaservices.nps.gov/v3/rest/stats/total/1980")

raw_data_1999 <- httr::GET(url = "https://irmaservices.nps.gov/v3/rest/stats/total/1999")

raw_data_2018 <- httr::GET(url = "https://irmaservices.nps.gov/v3/rest/stats/total/2018")
```
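The three calls above differ only in the year at the end of the URL. As a minimal sketch, assuming the same endpoint pattern holds for any year, the URLs can be built from a vector of years. The names `years`, `base_url`, and `total_urls` are illustrative and not part of the assignment, and no request is sent here.

```r
# Years requested in Exercise 1
years <- c(1980, 1999, 2018)

# Base of the park-wide endpoint used above
base_url <- "https://irmaservices.nps.gov/v3/rest/stats/total/"

# paste0() is vectorised, so this yields one URL per year
total_urls <- paste0(base_url, years)

print(total_urls)
```

Each element of `total_urls` could then be passed to `httr::GET()` in turn.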

### Exercise #2 {style="color: maroon"}

**Now, let's explore the second NPS visitation data set, [visitation](https://irmaservices.nps.gov/v3/rest/stats/help/operations/FetchVisitation). This call pulls in monthly data for a specific park, across a specific time frame. Use your new API skills to pull in visitation data for Rocky Mountain National Park from 2010 through 2021, based on the API's URL template. The unit code for Rocky Mountain National Park is ROMO. (Hint: an API URL can have multiple sections that need to be updated by the user; this one requires the starting month and year, the ending month and year, and the park unit of interest.)**

```{r}

raw_data_2010through2021 <- httr::GET(url = "https://irmaservices.nps.gov/v3/rest/stats/visitation?unitCodes=ROMO&startMonth=01&startYear=2010&endMonth=12&endYear=2021")
```
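The URL above carries five query parameters after the `?`. One way to keep them readable is to hold them in a named vector and assemble the query string from it. This is a sketch only: `params`, `query`, and `visitation_url` are illustrative names, and no request is sent.

```r
# Query parameters from the Exercise 2 URL, kept as a named character vector
params <- c(unitCodes  = "ROMO",
            startMonth = "01",
            startYear  = "2010",
            endMonth   = "12",
            endYear    = "2021")

# Join each name to its value with "=", then join the pairs with "&"
query <- paste(names(params), params, sep = "=", collapse = "&")

visitation_url <- paste0(
  "https://irmaservices.nps.gov/v3/rest/stats/visitation?", query
)

print(visitation_url)
```

The result is the same string that is passed to `httr::GET()` in the exercise.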

# Functions
@@ -154,7 +158,26 @@ pull_1992 <- parkwide_visitation(year = 1992)
**Create a function called `unit_visitation()` that pulls park-specific visitation data for any park, across any time frame. For a list of all park codes, download [this spreadsheet](https://www.nps.gov/aboutus/foia/upload/NPS-Unit-List.xlsx). (Hint 1: functions can have multiple arguments. For this step, you will want arguments representing the start and end month and year, and park unit). Hint 2: Exercise 2 should be used as a starting point for making this function.)**

```{r}
unit_visitation <- function(unit, start_month, start_year, end_month, end_year){
  # Request monthly visitation for one park unit over the given time frame
  response <- httr::GET(url = paste0("https://irmaservices.nps.gov/v3/rest/stats/visitation?unitCodes=", unit,
                                     "&startMonth=", start_month,
                                     "&startYear=", start_year,
                                     "&endMonth=", end_month,
                                     "&endYear=", end_year))

  # Convert the raw response body to text
  data_messy <- httr::content(response, as = "text", encoding = "UTF-8")

  # Parse the JSON text into an R object
  data_clean <- jsonlite::fromJSON(data_messy)

  return(data_clean)
}

test <- unit_visitation(unit = "ROMO",
start_month = "01",
start_year = "2010",
end_month = "12",
end_year = "2021")
```
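The test call passes months as two-character strings such as `"01"`. Whether the API also accepts an unpadded month like `1` is not checked here, so one cautious option is to normalise the month before pasting it into the URL. The helper `pad_month()` below is hypothetical and not part of the assignment.

```r
# Hypothetical helper: turn 1, "1", or "01" into the two-digit string "01"
pad_month <- function(month) {
  sprintf("%02d", as.integer(month))
}

pad_month(1)
pad_month("07")
pad_month(12)
```

Calling `pad_month()` on `start_month` and `end_month` inside `unit_visitation()` would let callers pass either numbers or strings.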

### Exercise #4 {style="color: maroon"}
@@ -163,6 +186,23 @@ pull_1992 <- parkwide_visitation(year = 1992)

```{r}

ROMO <- unit_visitation(unit = "ROMO",
start_month = "01",
start_year = "1992",
end_month = "11",
end_year = "2021")

EVER <- unit_visitation(unit = "EVER",
start_month = "01",
start_year = "1992",
end_month = "11",
end_year = "2021")

THRO <- unit_visitation(unit = "THRO",
start_month = "01",
start_year = "1992",
end_month = "11",
end_year = "2021")
```

## Function Defaults
@@ -198,7 +238,32 @@ parkwide_visitation(year = "1992")
**For our `unit_visitation()` function, make the default arguments for the start and end months January and December, respectively. This way, we are automatically pulling in data for an entire year. Then, rerun the updated `unit_visitation()` function for ROMO, EVER, and THRO for the 1980-2021 time period to make sure it works properly.**

```{r}
unit_visitation <- function(unit, start_month = "01", start_year, end_month = "12", end_year){
  # Request monthly visitation for one park unit; months default to a full year
  response <- httr::GET(url = paste0("https://irmaservices.nps.gov/v3/rest/stats/visitation?unitCodes=", unit,
                                     "&startMonth=", start_month,
                                     "&startYear=", start_year,
                                     "&endMonth=", end_month,
                                     "&endYear=", end_year))

  # Convert the raw response body to text
  data_messy <- httr::content(response, as = "text", encoding = "UTF-8")

  # Parse the JSON text into an R object
  data_clean <- jsonlite::fromJSON(data_messy)

  return(data_clean)
}

# The exercise asks for the 1980-2021 time period
ROMO <- unit_visitation(unit = "ROMO",
                        start_year = "1980",
                        end_year = "2021")

EVER <- unit_visitation(unit = "EVER",
                        start_year = "1980",
                        end_year = "2021")

THRO <- unit_visitation(unit = "THRO",
                        start_year = "1980",
                        end_year = "2021")
```
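Default arguments can be checked without calling the API. The sketch below uses a stand-in function, `describe_range()`, which is hypothetical and only formats its arguments, to show that omitted months fall back to their defaults while supplied ones override them.

```r
# Hypothetical stand-in with the same default months as unit_visitation()
describe_range <- function(start_year, end_year,
                           start_month = "01", end_month = "12") {
  paste0(start_month, "/", start_year, " to ", end_month, "/", end_year)
}

# Months omitted: the defaults "01" and "12" are used
full_years <- describe_range(start_year = 1980, end_year = 2021)

# A supplied month overrides only that default
half_year <- describe_range(start_year = 1980, end_year = 2021, end_month = "06")

print(full_years)
print(half_year)
```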

# Iterations
@@ -251,7 +316,16 @@ multi_years <- dplyr::bind_rows(output_floop)
**Use a for loop to run `unit_visitation()` with arguments `start_year = 1980` and `end_year = 2021` across ROMO, EVER, and THRO. Then, create a single data frame containing each park unit's output. (Hint: Your first step will be to create a vector listing each park unit.)**

```{r}
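parks <- c("ROMO", "EVER", "THRO")

# Preallocate one list slot per park
output_floop <- vector("list", length = length(parks))

for(i in seq_along(parks)){
  output_floop[[i]] <-
    unit_visitation(unit = parks[i], start_year = 1980, end_year = 2021)
}

# Combine the per-park results into a single data frame, as the exercise asks
multi_parks <- dplyr::bind_rows(output_floop)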
```
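The loop pattern (preallocate a list, fill one slot per park, then stack the results) can be tried offline. This sketch swaps in a hypothetical `fake_visitation()` that returns a tiny made-up data frame instead of calling the API, and uses base R `do.call(rbind, ...)` in place of `dplyr::bind_rows()`.

```r
# Hypothetical stand-in for unit_visitation(): no API call, made-up rows
fake_visitation <- function(unit) {
  data.frame(UnitCode = unit, Year = c(1980, 1981),
             stringsAsFactors = FALSE)
}

parks <- c("ROMO", "EVER", "THRO")

# One empty list slot per park
output_floop <- vector("list", length = length(parks))

# seq_along() is safe even if `parks` is empty, unlike 1:length(parks)
for (i in seq_along(parks)) {
  output_floop[[i]] <- fake_visitation(parks[i])
}

# Stack the per-park data frames into one
all_parks <- do.call(rbind, output_floop)

print(all_parks)
```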

## Mapping
@@ -282,5 +356,7 @@ multi_years <- bind_rows(output_map)
**Use `map()` to run `unit_visitation()` with arguments `start_year = 1980` and `end_year = 2021` across ROMO, EVER, and THRO. Then, create a single data frame containing each park unit's output.**

```{r}

output_map <- parks %>%
map(~ unit_visitation(unit = ., start_year = 1980, end_year = 2021)) %>%
bind_rows()
```
55 changes: 52 additions & 3 deletions 02_data-wrangling.Rmd → 02_data-wrangling-TessaBirt.Rmd
@@ -57,7 +57,10 @@ parkwide <- years %>%
**Using the `unit_visitation()` function from the last lesson and mapping, pull visitor data from 1980-2021 for the following park units: ROMO, ACAD, LAKE, YELL, GRCA, ZION, OLYM, and GRSM. Name the final output `units`.**

```{r}

parks2 <- c("ROMO", "ACAD", "LAKE", "YELL", "GRCA", "ZION", "OLYM", "GRSM")
units <- parks2 %>%
map(~ unit_visitation(unit = ., start_year = 1980, end_year = 2021)) %>%
bind_rows()
```

## Exploring our data
@@ -128,7 +131,17 @@ plotly::ggplotly(
**Create an interactive graph with two separate panes: one showing park-wide visitation, the other showing all the individual park units together. Both panes should have different y-axes.**

```{r}
annual_visitation <- annual_visitation %>%
mutate(groups = ifelse(UnitCode == "Parkwide", "Park wide", "Parks"))

plotly::ggplotly(
ggplot(data=annual_visitation) +
geom_point(aes(x = Year, y = RecVisitation, color = UnitCode)) +
geom_path(aes(x = Year, y = RecVisitation, color = UnitCode)) +
scale_y_continuous(labels = scales::label_scientific()) +
facet_wrap(~groups, scales = "free_y") +
theme_bw(base_size=10)
)
```
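The two panes come from the `groups` column, which `facet_wrap()` splits on. A small offline sketch of that labelling step, using made-up numbers in a hypothetical `toy` data frame:

```r
# Made-up values, for illustration only
toy <- data.frame(
  UnitCode = c("Parkwide", "ROMO", "ACAD"),
  RecVisitation = c(3e8, 4e6, 3e6),
  stringsAsFactors = FALSE
)

# Rows for the park-wide total get one label; every individual unit gets the other
toy$groups <- ifelse(toy$UnitCode == "Parkwide", "Park wide", "Parks")

print(toy)
```

With `scales = "free_y"`, each of the two labels gets its own y-axis range in the faceted plot.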

It is pretty clear that some park units get orders of magnitude more visitors than others. But just how much of the total park visitation do each of these parks account for from year to year? Here we walk through two methods to tackle this question, ***pivoting*** and ***joining***, to get park unit visitation side-by-side with park-wide data.
@@ -161,7 +174,24 @@ long_data <- wide_data %>%
**Using `wide_data` as the starting point, create an interactive time series plot showing the annual percentage of the total visitation made up by all park units. In other words, a visual that allows us to see how much each park unit contributes to the total park visitation across the NPS system.**

```{r}

# Express each unit's annual visitation as a percentage of the park-wide total
visitation_percentage <- wide_data %>%
  mutate(across(-c(Year, Parkwide), ~ .x / Parkwide * 100))

# Drop the park-wide column (only needed as the denominator), then reshape for plotting
visitation_percentage_long <- visitation_percentage %>%
  select(-Parkwide) %>%
  pivot_longer(cols = -Year,
               names_to = "Park",
               values_to = "RecVisitation")

plotly::ggplotly(
ggplot(data=visitation_percentage_long) +
geom_point(aes(x = Year, y = RecVisitation, color = Park)) +
geom_path(aes(x = Year, y = RecVisitation, color = Park)) +
scale_y_continuous(labels = scales::label_scientific()) +
theme_bw(base_size=10)
)
```
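The percentage step divides every unit column by the park-wide column in the same row. A base R sketch of that arithmetic on a hypothetical `toy_wide` data frame with made-up numbers:

```r
# Made-up wide data: one row per year, one column per unit plus the total
toy_wide <- data.frame(
  Year     = c(2000, 2001),
  ROMO     = c(30, 40),
  ACAD     = c(20, 10),
  Parkwide = c(1000, 2000)
)

# Every column except Year and Parkwide is a park unit
unit_cols <- setdiff(names(toy_wide), c("Year", "Parkwide"))

# Divide each unit column by the park-wide total, row by row
toy_pct <- toy_wide
toy_pct[unit_cols] <- lapply(toy_wide[unit_cols],
                             function(x) x / toy_wide$Parkwide * 100)

print(toy_pct)
```

Keeping `Parkwide` out of the set of transformed columns means the denominator is never overwritten partway through the calculation.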

## Joining
@@ -179,13 +209,32 @@ joined_data <- inner_join(x = units, y = parkwide, by = c("Year","Month"))
**Using `joined_data` as the starting point, create an interactive time series plot showing the annual percentage of the total visitation made up by all park units. This plot should look nearly identical to the previous plot.**

```{r}
# Sum monthly values to annual totals for each unit (.x) and for the park-wide series (.y)
joined_annual_visitation1 <- joined_data %>%
  group_by(Year, UnitCode.x, UnitCode.y) %>%
  summarise(RecreationVisitors.x = sum(RecreationVisitors.x),
            RecreationVisitors.y = sum(RecreationVisitors.y),
            .groups = "drop")

# Convert each unit's annual total to a percentage of the park-wide annual total
joined_annual_visitation <- joined_annual_visitation1 %>%
  mutate(RecreationVisitors.x = RecreationVisitors.x / RecreationVisitors.y * 100) %>%
  select(Year, UnitCode.x, RecreationVisitors.x)

plotly::ggplotly(
ggplot(data=joined_annual_visitation) +
geom_point(aes(x = Year, y = RecreationVisitors.x, color = UnitCode.x)) +
geom_path(aes(x = Year, y = RecreationVisitors.x, color = UnitCode.x)) +
scale_y_continuous(labels = scales::label_scientific()) +
theme_bw(base_size=10)
)
```
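The `.x` and `.y` suffixes appear because both tables share column names that are not join keys. A base R sketch with `merge()`, standing in for `inner_join()`, on hypothetical toy tables with made-up numbers shows where the suffixes come from and how the percentage follows:

```r
# Made-up unit-level and park-wide tables with the same column names
toy_units <- data.frame(
  Year = c(2000, 2000, 2001, 2001),
  UnitCode = c("ROMO", "ACAD", "ROMO", "ACAD"),
  RecreationVisitors = c(30, 20, 40, 10)
)
toy_parkwide <- data.frame(
  Year = c(2000, 2001),
  UnitCode = "Parkwide",
  RecreationVisitors = c(1000, 2000)
)

# Shared non-key columns get ".x" (left table) and ".y" (right table)
toy_joined <- merge(toy_units, toy_parkwide, by = "Year")

# Unit visitation as a percentage of the park-wide value for the same year
toy_joined$pct <- toy_joined$RecreationVisitors.x /
  toy_joined$RecreationVisitors.y * 100

print(toy_joined)
```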

### Exercise #5 {style="color: maroon"}

**Which park on average has the most visitation? Which park has the least visitation? Base your response on the data starting in 1990, ending in 2021. Defend your answer with numbers!**

```{r}

units1 <- units %>%
filter(Year >= 1990) %>%
group_by(UnitCode) %>%
summarise(avg_visitation = mean(RecreationVisitors, na.rm = TRUE))
```

Among the eight park units, GRSM had the highest average monthly visitation from 1990 to 2021 (about 825,651 recreation visitors per month) and ACAD had the lowest (about 219,689 per month).
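The filter, group, and average steps can be checked on a small example. The sketch below uses base R `aggregate()` in place of `group_by()` and `summarise()` on a hypothetical `toy_units` data frame; the unit codes "A" and "B" and all values are made up.

```r
# Made-up records: unit "A" has one pre-1990 row, unit "B" has one missing value
toy_units <- data.frame(
  UnitCode = c("A", "A", "B", "B", "B"),
  Year = c(1989, 1995, 1995, 2000, 2010),
  RecreationVisitors = c(100, 200, 10, 20, NA)
)

# Keep 1990 onward
recent <- toy_units[toy_units$Year >= 1990, ]

# Mean per unit; the formula interface drops rows with NA by default
avg <- aggregate(RecreationVisitors ~ UnitCode, data = recent, FUN = mean)

# Sort so the highest average comes first and the lowest comes last
ranked <- avg[order(avg$RecreationVisitors, decreasing = TRUE), ]

print(ranked)
```

Sorting the summary makes the highest and lowest units readable directly from the first and last rows.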
16 changes: 16 additions & 0 deletions APIs-Functions-Iteration_TessaBirt.Rproj
@@ -0,0 +1,16 @@
Version: 1.0

RestoreWorkspace: Default
SaveWorkspace: Default
AlwaysSaveHistory: Default

EnableCodeIndexing: Yes
UseSpacesForTab: Yes
NumSpacesForTab: 2
Encoding: UTF-8

RnwWeave: knitr
LaTeX: pdfLaTeX

AutoAppendNewline: Yes
StripTrailingWhitespace: Yes