covidCensus

The goal of covidCensus is to make it easy to join county-level COVID-19 case and death data with other information from the census (e.g., income, demographic statistics, population density). I built it for educational purposes: I had been exploring COVID-19 data for a while and wanted to look for patterns between health outcomes and other social markers.

Note: I am not a professional researcher, just a hobbyist. Any findings from this study should be taken with a grain of salt.

Installation

covidCensus is not on CRAN, but you can install it from GitHub with devtools:

devtools::install_github("shuckle16/covidCensus")

Package Structure

Currently there are four fetch_* functions for retrieving the data you want. Each returns the raw response from an HTTP request.

fetch_nyt
fetch_census_dens
fetch_census_demo
fetch_census_income

There are also four corresponding tidy_* functions that clean up the fetched data so the results can be joined together. Each returns a tidy data frame (tibble) in which each row represents a county.

tidy_nyt
tidy_census_dens
tidy_census_demo
tidy_census_income

Example

The following sample code uses the package to plot the relationship between median income and virus deaths per capita.

library(dplyr)
library(ggplot2)
library(covidCensus)

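# Fetch each raw response, then tidy it into one row per county
tidied_nyt <- fetch_nyt() %>% tidy_nyt()

tidied_census_dens <- fetch_census_dens() %>% tidy_census_dens()

tidied_census_income <- fetch_census_income() %>% tidy_census_income()

# Combine the three tidied tables into a single data frame
join_datasets <- function() {
  tidied_census_income %>%
    inner_join(tidied_census_dens, by = c("county", "state")) %>%
    # No `by` given, so dplyr joins on every column name the two tables share
    inner_join(tidied_nyt)
}

joined_data <- join_datasets()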

joined_data %>%   
  filter(county != "New York") %>% 
  ggplot(aes(x = median_income, y = (deaths / pop) * 100000)) + 
  geom_point(alpha = 0.7) + 
  scale_x_continuous(labels = scales::comma) +
  xlab("Median Annual Income ($)") + 
  ylab("Deaths per 100k") + 
  ggtitle(
    "COVID-19 Death rate in the USA versus median income", 
    subtitle = paste("Each dot is a county. Deaths data from", max(tidied_nyt$date))
    )
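
To make the join and the per-capita step easier to follow, here is a minimal sketch on made-up data. The toy_* tables and their values are hypothetical and are not output from the package; only the column names (county, state, median_income, deaths, pop) are taken from the example above.

```r
library(dplyr)

# Hypothetical income table: one row per county (made-up values)
toy_income <- tibble(
  county = c("Alpha", "Beta"),
  state = c("Ohio", "Ohio"),
  median_income = c(52000, 61000)
)

# Hypothetical deaths table; "Gamma" has no match in toy_income
toy_cases <- tibble(
  county = c("Alpha", "Beta", "Gamma"),
  state = c("Ohio", "Ohio", "Ohio"),
  deaths = c(12, 3, 7),
  pop = c(40000, 150000, 90000)
)

toy_joined <- toy_income %>%
  # inner_join keeps only counties present in both tables
  inner_join(toy_cases, by = c("county", "state")) %>%
  # Same scaling as the y axis in the plot above
  mutate(deaths_per_100k = deaths / pop * 100000)

print(toy_joined)
```

Because inner_join drops unmatched rows, "Gamma" does not appear in toy_joined. Counties that are missing from any one of the real datasets would be dropped from the plot in the same way.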

To do list

Consider using the {polite} package instead of xml2. There might be a nicer way to make the Census API calls, too. Suggestions welcome.
Find other interesting county-level variables
Add tests? For example, check that the tidy functions return consistent names
Add a vignette or two
Fix the warnings in tidy_census_income
