get_mapunit_from_SDA() fails for some mukeys #305
Hi @kevinwolz, thanks for reporting this issue. Or rather, I see two different issues.

Approach 1 (that fails) appears to be due to scientific notation formatting being applied to a small subset of the mukeys (only one that I can see, actually). Specifically, the problem occurs when large numeric mukeys are coerced to character while building the SQL query and come out in scientific notation. To avoid this problem, use explicit conversion to integer:

```r
library(magrittr) # for %>%

where1 <- as.integer(mukeys1) %>%
  paste0("'", ., "'") %>%
  toString() %>%
  paste0("mukey IN (", ., ")")
```

We have tried to handle this type of input sanitization in places that take mukeys as input.

As far as why Approach 2 only returns a fraction of the mukeys that Approach 1 does--this is because they are querying a different extent. In the case of the mukey WCS, the returned grid covers the bounding box of the AOI, so mukeys from the entire rectangle are included. In contrast, the SDA spatial query returns only the map units that intersect the polygon itself. Plotting the county boundary over the WCS grid makes the difference obvious:

```r
aoi <- sf::st_read("/vsizip/Bannock_ID.kml.zip") # Bannock County, ID
x <- soilDB::mukey.wcs(aoi = aoi, db = "gssurgo", res = 30, quiet = TRUE)
plot(x)
plot(terra::project(terra::vect(aoi), x), add = TRUE)
```
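As a minimal illustration of the scientific notation pitfall (the mukey value below is made up, not the actual problem key):

```r
# paste() coerces numeric vectors via as.character(), which picks the
# shorter representation -- for some large IDs that is scientific notation
as.character(3000000)             # "3e+06"   -> breaks a SQL IN clause
as.character(as.integer(3000000)) # "3000000" -> safe
```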
@kevinwolz, if you are interested in spatial data for all of CONUS, I'd highly recommend using either or both of the CONUS map unit key (mukey) grids. The mukey grids are kind of like a gSSURGO/gNATSGO-lite, as most of the tabular data can be accessed via SDA. Note that raster soil survey (RSS) data are not yet in SDA. I'm glad that you are interested in using the mukey WCS; there is more information on it in the soilDB documentation.
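A sketch of that grid-plus-SDA pattern, reusing the integer-conversion fix from above (the exact extraction of unique mukeys from the grid is an assumption):

```r
# Sketch: pair the gNATSGO mukey WCS grid with tabular data from SDA
x <- soilDB::mukey.wcs(aoi = aoi, db = "gnatsgo", res = 30)

# unique integer mukeys present in the grid (assumes cell values are mukeys)
ids <- as.integer(na.omit(unique(terra::values(x, mat = FALSE))))

mu <- soilDB::get_mapunit_from_SDA(
  WHERE = sprintf("mukey IN (%s)", toString(ids))
)
```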
Thank you @brownag for explaining both cases! Very helpful. It would have taken me forever to figure out the scientific notation piece. I've added integer conversion to my script. And for the second part, I forgot that the WCS request covers the bounding box of the AOI rather than just the polygon.
@dylanbeaudette thank you for your suggestions. I think I'm doing what you're suggesting, but maybe I'm missing something. Here's my general process: my script loops over every county in CONUS, pulls the mukeys for each county, and then queries the corresponding mapunit data from SDA.
Are you suggesting that I start with the full CONUS mukey grid that you linked to? I wish I could do that, but at the 30m resolution that I need, I think it just takes too much memory to work with the full CONUS grid. My analysis step is complex and memory intensive, and ultimately requires saving the data in a particular format. Does this seem logical? Am I missing something?
@dylanbeaudette one thing that I think is not good with my method is that there will be large gaps in the CONUS data where SSURGO data do not exist and only STATSGO is available. The SoilWeb Soil Properties map does not have these gaps, but perhaps it is not using SSURGO data, since the resolution there is much lower than 30m. How would you recommend dealing with this in a way that balances the 30m requirement and the processing intensity of the analysis against needing gap-free CONUS coverage? Perhaps I need to check whether SSURGO data exist for each area and fall back to something else where they don't.
@kevinwolz I'll reply in detail later today with examples and further discussion. Good luck!
Hi @kevinwolz, here are some follow-up questions and suggestions.

First off, are you willing to use STATSGO data in place of missing SSURGO data? If so, then the gNATSGO mukey grid may solve some of your problems. The presence of RSS mukey values (and the lack of tabular RSS data on SDA) creates a couple of new problems.

You could "roll your own" SSURGO/STATSGO hybrid using the map unit key grids; the STATSGO2 mukey grid is available as well. The SoilWeb Soil Properties maps are gap-free because "holes" in SSURGO (coarsened to an 800m grid) are back-filled with STATSGO.

What is the final product that you have in mind? While this may not solve all of your problems, take a look at this experimental tiled approach to making CONUS maps. I still need to get in there and optimize a couple of steps, but it is pretty efficient.
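A minimal sketch of that roll-your-own hybrid, assuming both grids can be fetched through soilDB's mukey.wcs() (the "statsgo" db code and its 300m resolution are assumptions here):

```r
# Sketch: backfill SSURGO holes with STATSGO using the mukey grids
ssurgo  <- soilDB::mukey.wcs(aoi = aoi, db = "gssurgo", res = 30)
statsgo <- soilDB::mukey.wcs(aoi = aoi, db = "statsgo", res = 300)

# align STATSGO to the 30m SSURGO grid, then fill NA cells with STATSGO
statsgo30 <- terra::resample(statsgo, ssurgo, method = "near")
hybrid    <- terra::cover(ssurgo, statsgo30)
```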
I think the main questions in this issue (failure with abbreviated mukeys, and the difference in the number of unique mukeys returned by different methods) have been answered. We are delving into different, important topics. I see two main discussions which could continue here:

1. Working with larger/local tabular data.
2. Building gap-free CONUS coverage by backfilling SSURGO.

For item 1, larger/local tabular data are reasonably well covered by soilDB methods, e.g. by specifying a local data source (dsn) in place of SDA; see the sketch after this comment. For item 2, there are official methods for backfilling SSURGO with STATSGO2 and/or Raster Soil Survey to create continuous coverage (gNATSGO), but these backfilled areas aren't necessarily suitable replacements for detailed SSURGO. Currently, STATSGO data can be summarized like SSURGO, but are often less complete, lack interpretations, etc. Raster Soil Survey mukeys are usually quite detailed, but not yet available through any web service. Stitching together a queryable SQLite database that works for RSS mukeys can be done, but is not yet streamlined.

In some cases there would be value added in having soilDB functions process local spatial data, but it may be that certain types of processes, especially large spatial data workflows, will be implemented using machinery outside the package.
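For the local tabular route in item 1, a minimal sketch (the file name is hypothetical; any SSURGO/gNATSGO SQLite snapshot with the standard schema would work):

```r
# Query a local SQLite snapshot directly, bypassing SDA
library(DBI)

con <- DBI::dbConnect(RSQLite::SQLite(), "gnatsgo_conus.sqlite")
mu  <- DBI::dbGetQuery(con, "SELECT mukey, musym, muname FROM mapunit LIMIT 10")
DBI::dbDisconnect(con)
```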
I have a script that pulls mapunit data for every county in CONUS. There are just 3 counties in all of CONUS that fail. Here is one of them:
Bannock_ID.kml.zip
```r
aoi <- sf::st_read("Bannock_ID.kml") # Bannock County, ID
```
APPROACH 1 (fails)
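(The original snippet was lost from this page; a plausible reconstruction based on the discussion above, with the function choices as assumptions:)

```r
# Hypothetical reconstruction: unique mukeys from the gSSURGO mukey WCS grid
x <- soilDB::mukey.wcs(aoi = aoi, db = "gssurgo", res = 30)
mukeys1 <- na.omit(unique(terra::values(x, mat = FALSE)))

# numeric mukeys pick up scientific notation here (see discussion above)
res1 <- soilDB::get_mapunit_from_SDA(
  WHERE = paste0("mukey IN (", toString(mukeys1), ")")
)
```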
APPROACH 2 (works)
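(Likewise reconstructed; SDA_spatialQuery() is an assumption for what Approach 2 used:)

```r
# Hypothetical reconstruction: mukeys intersecting the county polygon via SDA
res2 <- soilDB::SDA_spatialQuery(aoi, what = "mukey")
mukeys2 <- unique(res2$mukey)
```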
Why does Approach 1 fail, when Approach 2 works?
Approach 1 seems to add many extra mukeys. None are duplicates. Something about those extra mukeys must be causing the problem.
When I run the same approaches above for any other county (except the 2 others that act the same: Brown County, NE and Caribou County, ID), `length(mukeys1)` is always greater than `length(mukeys2)`, BUT Approach 1 does not produce an error. For example, here is Champaign, IL: Champaign_IL.kml.zip

Why does Approach 1 produce so many extra mukeys compared to Approach 2 for all regions?