puma argument not working with 5-year 2018-2022 ACS PUMS #555
Comments
Handling with an error for now: https://github.com/walkerke/tidycensus/blob/master/R/pums.R#L99-L106. Need to think through a better solution, though.
This is a bummer. I submitted a ticket to Census. Perhaps we can encourage others to do the same? As I understand it, this effectively makes microdata at the PUMA level useless for the next several years.
Yeah - I'd like to come up with some sort of solution, though it seems like it'd be a novel one as I don't see PUMA reconciliation done by Census or by IPUMS. We possibly could come up with a crosswalk between the two. The tricky thing is that going backwards from 2020 to 2010 is easier for faster-growing areas (where PUMAs are typically split into multiple new PUMAs) but harder for slower-growing areas where PUMAs are consolidated or reorganized (common in rural areas).
How about using Geocorr?
Hey @walkerke. When I try to pull any other variable, either alone or in combination with up to 30 variables, and I do NOT include VACS, it downloads just fine. But when I add VACS, even with return_vacant=true, it still won't run. Just flagging this. I am a pretty new user so I thought it was something I was doing, but after a day of investigation, I think the problem is with the new geographic designations.
@elisemarie1120 I don't think this has anything to do with PUMA geographies; see #560. Try re-installing from GitHub with
@walkerke, hello! I work for a county-level local health department, and because get_pums doesn't have a "county" argument, I have been filtering PUMS data using the "puma" argument (regardless of whether I map or summarize to a PUMA level). And because a county is a smaller geography and we want to look at race subpopulations, we almost exclusively use the 5-year surveys. So this issue is affecting us greatly; I worry that we won't be able to see 5-year data specific to our county more recent than 2017-2021 until 2026, which isn't ideal. Is there a way you could add a "county" argument to get_pums? Is there a geography nesting problem? I would imagine that PUMAs are nested within county boundaries, but I don't know that. Relying on filtering by PUMA values that are not consistent across the 5-year period for half of every decade delays our access to data (particularly around social determinants of health) that are important to us. I appreciate any other advice you have to give as well!
I'm replying to @walkerke's January 26, 2024 message, and I'm the epidemiologist working for a county local health jurisdiction (but a populous one: larger than 16 states). If we were to run the single years on their appropriate PUMA values (so PUMA10 for 2018-2021 and PUMA20 for 2022, or similarly for 2019-2023 once it's released), do you have any advice on how to average across the five years? For estimates, we could take the mean (I think?). But what about the MOE? That might be the needed workaround, assuming that the Census doesn't release a PUMA value crosswalk: functions that allow us to consolidate single-year data into ad hoc 5-year data. Otherwise we end up unable to report on our smaller county subpopulations who really want information about how they're doing.
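On the averaging question, here is a rough sketch of the arithmetic rather than an endorsed method. It assumes the single-year estimates are independent and that their MOEs share one confidence level, and it uses the usual Census approximation for a derived sum (root sum of squares of the MOEs), divided by the number of years. The helper name average_estimates and the numbers are made up for illustration. An average of 1-year estimates is not the same thing as an official 5-year estimate.

```r
# Hypothetical helper: average k single-year estimates and approximate the
# MOE of that average. Assumes the yearly estimates are independent and that
# all MOEs use the same confidence level.
average_estimates <- function(estimate, moe) {
  stopifnot(length(estimate) == length(moe), length(estimate) > 0)
  k <- length(estimate)
  list(
    estimate = sum(estimate) / k,   # simple mean of the yearly estimates
    moe = sqrt(sum(moe^2)) / k      # MOE of the sum, scaled by 1 / k
  )
}

# Made-up yearly values, for illustration only
est <- c(1000, 1100, 1050, 1200, 1150)
moe <- c(120, 130, 125, 140, 135)
avg <- average_estimates(est, moe)
```

With PUMS microdata specifically, computing each year's standard error from the replicate weights is probably the sounder starting point; the formula above only covers the step of combining the yearly results.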
The 2022 PUMS uses the new 2020 PUMAs for the first time. This will be an issue for the next few years, as the samples prior to 2022 use the 2010 PUMAs, but the samples 2022 and later use the 2020 PUMAs.
Census doesn't reconcile this in the data; instead it marks unused PUMA definitions as -0009. See here:
For prior years, we just throw an error message when PUMAs are requested and don't attempt to deal with it. See here: https://github.com/walkerke/tidycensus/blob/master/R/pums.R#L99-L102
This feels unsatisfactory to me as this will impact everything through the 2021-2025 ACS 5-year (which will be released in 2027!).
I'd like to think through how to handle this appropriately.
For users: you can still pass a vector of PUMAs using variables_filter to either PUMA10 or PUMA20 in the new data. Though it may make sense for you to pull first by state (which I know is a hefty download for TX / CA) then filter carefully within R.
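As a minimal sketch of the "pull by state, then filter within R" route: the helper filter_pumas below is hypothetical (not part of tidycensus), the PUMA codes are invented, and the small data frame stands in for a real state-level download. It assumes PUMA10 and PUMA20 come back as character columns, with -0009 marking the vintage that does not apply to a record.

```r
# Hypothetical helper: keep PUMS rows whose PUMA code matches under either
# vintage. `pums` is assumed to have character columns PUMA10 and PUMA20.
filter_pumas <- function(pums, puma10 = character(), puma20 = character()) {
  keep <- pums$PUMA10 %in% puma10 | pums$PUMA20 %in% puma20
  pums[keep, , drop = FALSE]
}

# Stand-in for a state-level pull; real data would come from something like
# tidycensus::get_pums(variables = c("PUMA10", "PUMA20", "AGEP"),
#                      state = "VT", survey = "acs5", year = 2022)
pums <- data.frame(
  PUMA10 = c("00100", "00200", "-0009", "-0009"),
  PUMA20 = c("-0009", "-0009", "00101", "00300"),
  AGEP = c(34, 51, 27, 62),
  stringsAsFactors = FALSE
)

# Made-up codes: one 2010 PUMA and one 2020 PUMA covering the area of interest
mine <- filter_pumas(pums, puma10 = "00100", puma20 = "00101")
```

The "carefully" part is choosing the two code lists: a 2010 code and a 2020 code with the same digits need not describe the same area, so each vintage's codes should be looked up separately.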