diff --git a/01_api-funct-iter.Rmd b/01_api-funct-iter-TessaBirt.Rmd
similarity index 83%
rename from 01_api-funct-iter.Rmd
rename to 01_api-funct-iter-TessaBirt.Rmd
index 32422fb..23cba72 100644
--- a/01_api-funct-iter.Rmd
+++ b/01_api-funct-iter-TessaBirt.Rmd
@@ -70,7 +70,11 @@ Hooray, you have now successfully pulled in an online data set using an API!
 **Using the code above as a starting point, pull in monthly NPS-wide visitation data for the years 1980, 1999, and 2018.**
 ```{r}
+raw_data_1980 <- httr::GET(url = "https://irmaservices.nps.gov/v3/rest/stats/total/1980")
+raw_data_1999 <- httr::GET(url = "https://irmaservices.nps.gov/v3/rest/stats/total/1999")
+
+raw_data_2018 <- httr::GET(url = "https://irmaservices.nps.gov/v3/rest/stats/total/2018")
 ```
 ### Exercise #2 {style="color: maroon"}
@@ -78,7 +82,7 @@ Hooray, you have now successfully pulled in an online data set using an API!
 **Now, let's explore the second NPS visitation data set, [visitation](https://irmaservices.nps.gov/v3/rest/stats/help/operations/FetchVisitation). This call pulls in monthly data for a specific park, across a specific time frame. Use your new API skills to pull in visitation data for Rocky Mountain National Park from 2010 through 2021, based on the API's URL template. The unit code for Rocky Mountain National Park is ROMO. (Hint: an API URL can have multiple sections that need to be updated by the user; this one requires the starting month and year, the ending month and year, and the park unit of interest.)**
 ```{r}
-
+raw_data_2010through2021 <- httr::GET(url = "https://irmaservices.nps.gov/v3/rest/stats/visitation?unitCodes=ROMO&startMonth=01&startYear=2010&endMonth=12&endYear=2021")
 ```
 # Functions
@@ -154,7 +158,26 @@ pull_1992 <- parkwide_visitation(year = 1992)
 **Create a function called `unit_visitation()` that pulls park-specific visitation data for any park, across any time frame.
For a list of all park codes, download [this spreadsheet](https://www.nps.gov/aboutus/foia/upload/NPS-Unit-List.xlsx). (Hint 1: Functions can have multiple arguments. For this step, you will want arguments representing the start and end month and year, and the park unit. Hint 2: Exercise 2 should be used as a starting point for making this function.)**
 ```{r}
+unit_visitation <- function(unit, start_month, start_year, end_month, end_year){
+  url <- httr::GET(url = paste0("https://irmaservices.nps.gov/v3/rest/stats/visitation?unitCodes=", unit,
+                                "&startMonth=", start_month,
+                                "&startYear=", start_year,
+                                "&endMonth=", end_month,
+                                "&endYear=", end_year))
+
+  data_messy <- httr::content(url, as = "text", encoding = "UTF-8")
+
+  data_clean <- jsonlite::fromJSON(data_messy)
+
+  return(data_clean)
+
+}
+test <- unit_visitation(unit = "ROMO",
+                        start_month = "01",
+                        start_year = "2010",
+                        end_month = "12",
+                        end_year = "2021")
 ```
 ### Exercise #4 {style="color: maroon"}
@@ -163,6 +186,23 @@ pull_1992 <- parkwide_visitation(year = 1992)
 ```{r}
+ROMO <- unit_visitation(unit = "ROMO",
+                        start_month = "01",
+                        start_year = "1992",
+                        end_month = "11",
+                        end_year = "2021")
+
+EVER <- unit_visitation(unit = "EVER",
+                        start_month = "01",
+                        start_year = "1992",
+                        end_month = "11",
+                        end_year = "2021")
+
+THRO <- unit_visitation(unit = "THRO",
+                        start_month = "01",
+                        start_year = "1992",
+                        end_month = "11",
+                        end_year = "2021")
 ```
 ## Function Defaults
@@ -198,7 +238,32 @@ parkwide_visitation(year = "1992")
 **For our `unit_visitation()` function, make the default arguments for the start and end months January and December, respectively. This way, we are automatically pulling in data for an entire year.
Then, rerun the updated `unit_visitation()` function for ROMO, EVER, and THRO for the 1980-2021 time period to make sure it works properly.**
 ```{r}
+unit_visitation <- function(unit, start_month = "01", start_year, end_month = "12", end_year){
+  url <- httr::GET(url = paste0("https://irmaservices.nps.gov/v3/rest/stats/visitation?unitCodes=", unit,
+                                "&startMonth=", start_month,
+                                "&startYear=", start_year,
+                                "&endMonth=", end_month,
+                                "&endYear=", end_year))
+
+  data_messy <- httr::content(url, as = "text", encoding = "UTF-8")
+
+  data_clean <- jsonlite::fromJSON(data_messy)
+
+  return(data_clean)
+
+}
+
+ROMO <- unit_visitation(unit = "ROMO",
+                        start_year = "1980",
+                        end_year = "2021")
+EVER <- unit_visitation(unit = "EVER",
+                        start_year = "1980",
+                        end_year = "2021")
+
+THRO <- unit_visitation(unit = "THRO",
+                        start_year = "1980",
+                        end_year = "2021")
 ```
 # Iterations
@@ -251,7 +316,16 @@ multi_years <- dplyr::bind_rows(output_floop)
 **Use a for loop to run `unit_visitation()` with arguments `start_year = 1980` and `end_year = 2021` across ROMO, EVER, and THRO. Then, create a single data frame containing each park unit's output. (Hint: Your first step will be to create a vector listing each park unit.)**
 ```{r}
+parks <- c("ROMO", "EVER", "THRO")
+
+output_floop <- vector("list", length = length(parks))
+for(i in seq_along(parks)){
+
+  output_floop[[i]] <-
+    unit_visitation(unit = parks[i], start_year = 1980, end_year = 2021)
+
+}
 ```
 ## Mapping
@@ -282,5 +356,7 @@ multi_years <- bind_rows(output_map)
 **Use `map()` to run `unit_visitation()` with arguments `start_year = 1980` and `end_year = 2021` across ROMO, EVER, and THRO.
Then, create a single data frame containing each park unit's output.**
 ```{r}
-
+output_map <- parks %>%
+  map(~ unit_visitation(unit = ., start_year = 1980, end_year = 2021)) %>%
+  bind_rows()
 ```
diff --git a/02_data-wrangling.Rmd b/02_data-wrangling-TessaBirt.Rmd
similarity index 76%
rename from 02_data-wrangling.Rmd
rename to 02_data-wrangling-TessaBirt.Rmd
index 4452ef6..6c4af44 100644
--- a/02_data-wrangling.Rmd
+++ b/02_data-wrangling-TessaBirt.Rmd
@@ -57,7 +57,10 @@ parkwide <- years %>%
 **Using the `unit_visitation()` function from the last lesson and mapping, pull visitor data from 1980-2021 for the following park units: ROMO, ACAD, LAKE, YELL, GRCA, ZION, OLYM, and GRSM. Name the final output `units`.**
 ```{r}
-
+parks2 <- c("ROMO", "ACAD", "LAKE", "YELL", "GRCA", "ZION", "OLYM", "GRSM")
+units <- parks2 %>%
+  map(~ unit_visitation(unit = ., start_year = 1980, end_year = 2021)) %>%
+  bind_rows()
 ```
 ## Exploring our data
@@ -128,7 +131,17 @@ plotly::ggplotly(
 **Create an interactive graph with two separate panes: one showing park-wide visitation, the other showing all the individual park units together. Both panes should have different y-axes.**
 ```{r}
+annual_visitation <- annual_visitation %>%
+  mutate(groups = ifelse(UnitCode == "Parkwide", "Park wide", "Parks"))
+plotly::ggplotly(
+  ggplot(data=annual_visitation) +
+    geom_point(aes(x = Year, y = RecVisitation, color = UnitCode)) +
+    geom_path(aes(x = Year, y = RecVisitation, color = UnitCode)) +
+    scale_y_continuous(labels = scales::label_scientific()) +
+    facet_wrap(~groups, scales = "free_y") +
+    theme_bw(base_size=10)
+)
 ```
 It is pretty clear that some park units get orders of magnitude more visitors than others. But just how much of the total park visitation does each of these parks account for from year to year? Here we walk through two methods to tackle this question, ***pivoting*** and ***joining***, to get park unit visitation side-by-side with park-wide data.
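As a rough illustration of the two methods just named, here is a minimal sketch on a small made-up table. The unit codes `AAA` and `BBB`, the numbers, and the object names `toy`, `totals`, `share_pivot`, and `share_join` are all hypothetical and are not part of the lesson's data; only the column names `Year`, `UnitCode`, and `RecVisitation` mirror the ones used in the exercises.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data (not real NPS figures): annual visits for two
# made-up park units plus a "Parkwide" total, in long format.
toy <- tibble(
  Year = c(2020, 2020, 2020, 2021, 2021, 2021),
  UnitCode = c("AAA", "BBB", "Parkwide", "AAA", "BBB", "Parkwide"),
  RecVisitation = c(10, 30, 100, 20, 60, 200)
)

# Method 1, pivoting: spread each unit into its own column so the
# Parkwide total sits beside every park, then divide and go back to long.
share_pivot <- toy %>%
  pivot_wider(names_from = UnitCode, values_from = RecVisitation) %>%
  mutate(across(c(AAA, BBB), ~ .x / Parkwide * 100)) %>%
  select(-Parkwide) %>%
  pivot_longer(cols = -Year, names_to = "UnitCode", values_to = "Percent")

# Method 2, joining: keep the totals in their own table and attach the
# matching total to every park row by Year.
totals <- toy %>%
  filter(UnitCode == "Parkwide") %>%
  select(Year, Total = RecVisitation)

share_join <- toy %>%
  filter(UnitCode != "Parkwide") %>%
  inner_join(totals, by = "Year") %>%
  mutate(Percent = RecVisitation / Total * 100) %>%
  select(Year, UnitCode, Percent)
```

Both routes should yield the same `Percent` column here; which one reads better tends to depend on whether the totals already live in the same table as the park rows (pivot) or in a separate one (join).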
@@ -161,7 +174,24 @@ long_data <- wide_data %>%
 **Using `wide_data` as the starting point, create an interactive time series plot showing the annual percentage of the total visitation made up by all park units. In other words, a visual that allows us to see how much each park unit contributes to the total park visitation across the NPS system.**
 ```{r}
-
+visitation_percentage <- wide_data %>%
+  mutate_at(.vars = c("ACAD", "ROMO", "GRCA", "LAKE", "YELL", "OLYM", "ZION", "GRSM", "Parkwide"), .funs = ~(. / Parkwide *100))
+
+visitation_percentage_long <- visitation_percentage %>%
+  pivot_longer(cols = -Year,
+               names_to = "Park",
+               values_to = "RecVisitation")
+
+visitation_percentage_long <- visitation_percentage_long %>%
+  filter(Park != "Parkwide")
+
+plotly::ggplotly(
+  ggplot(data=visitation_percentage_long) +
+    geom_point(aes(x = Year, y = RecVisitation, color = Park)) +
+    geom_path(aes(x = Year, y = RecVisitation, color = Park)) +
+    scale_y_continuous(labels = scales::label_scientific()) +
+    theme_bw(base_size=10)
+)
 ```
 ## Joining
@@ -179,7 +209,21 @@ joined_data <- inner_join(x = units, y = parkwide, by = c("Year","Month"))
 **Using `joined_data` as the starting point, create an interactive time series plot showing the annual percentage of the total visitation made up by all park units. This plot should look nearly identical to the previous plot.**
 ```{r}
+joined_annual_visitation1 <- joined_data %>%
+  group_by(Year, UnitCode.x, UnitCode.y) %>%
+  summarise(RecreationVisitors.x = sum(RecreationVisitors.x), RecreationVisitors.y = sum(RecreationVisitors.y))
+
+joined_annual_visitation <- joined_annual_visitation1 %>%
+  mutate_at(.vars = c("RecreationVisitors.x", "RecreationVisitors.y"), .funs = ~(. / RecreationVisitors.y *100)) %>%
+  select(Year, UnitCode.x, RecreationVisitors.x)
+plotly::ggplotly(
+  ggplot(data=joined_annual_visitation) +
+    geom_point(aes(x = Year, y = RecreationVisitors.x, color = UnitCode.x)) +
+    geom_path(aes(x = Year, y = RecreationVisitors.x, color = UnitCode.x)) +
+    scale_y_continuous(labels = scales::label_scientific()) +
+    theme_bw(base_size=10)
+)
 ```
 ### Exercise #5 {style="color: maroon"}
@@ -187,5 +231,10 @@ joined_data <- inner_join(x = units, y = parkwide, by = c("Year","Month"))
 **Which park on average has the most visitation? Which park has the least visitation? Base your response on the data starting in 1990, ending in 2021. Defend your answer with numbers!**
 ```{r}
-
+units1 <- units %>%
+  filter(Year >= 1990) %>%
+  group_by(UnitCode) %>%
+  summarise(avg_visitation = mean(RecreationVisitors, na.rm = TRUE))
 ```
+
+On average, GRSM had the most visitation from 1990 to 2021 (a mean of 825651.1 recreation visitors per monthly record), and ACAD had the least (219688.8).
diff --git a/APIs-Functions-Iteration_TessaBirt.Rproj b/APIs-Functions-Iteration_TessaBirt.Rproj
new file mode 100644
index 0000000..d64e28b
--- /dev/null
+++ b/APIs-Functions-Iteration_TessaBirt.Rproj
@@ -0,0 +1,16 @@
+Version: 1.0
+
+RestoreWorkspace: Default
+SaveWorkspace: Default
+AlwaysSaveHistory: Default
+
+EnableCodeIndexing: Yes
+UseSpacesForTab: Yes
+NumSpacesForTab: 2
+Encoding: UTF-8
+
+RnwWeave: knitr
+LaTeX: pdfLaTeX
+
+AutoAppendNewline: Yes
+StripTrailingWhitespace: Yes