-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Data loss when using uncode() #253
Comments
Hey, thanks for bringing this to our attention. There are a couple issues with uncoding coded values and setting the factor levels, and we have been meaning to cycle back to column name uncoding and factor related problems in #241, #242 As you have observed SDA has all of its data pre-decoded. I believe the only reason that The plan was to abstract out the factor level finding code from uncode so it does not need to be handled this way, which has a high likelihood of wiping out values when data are factors or something is a bit different about the database/generic assumptions associated with a database. Thanks for bringing it up, I think I had been aware this could happen, and thought it had been fixed. Since we have no direct tests of the output of these functions at present, it escaped me, sorry for any inconvenience I will move resolving this and related factor issues to the top of the pile. I found an issue in uncode() pertaining to SDA that should be a simple fix, I think there are some paired issues the post-processing e.g. .cosoilmoist_prep() |
I made some progress on this yesterday, and think the primary issue has been resolved at least for You can install a development branch with: remotes::install_github("ncss-tech/soilDB@factorsETC") Would be glad to have you test it out, let me know if you see anything else that is off. |
Thanks for fixing this @brownag! The development version is working great now. |
I recently updated soilDB from 2.6.14 to 2.7.2. Since then I have had an issue with the cosoilmoist table, and I think I was able to track the issue down to the
uncode()
function.Bear with me: I am trying to access the cosoilmoist table so that I can get water table depth low, representative, and high values. In the past I was able to isolate this by filtering the rows where status == "Wet" and selecting dept_l, dept_r, and dept_h.
For the sake of this example:
where <- "mukey IN ('426539', '426540', '426541')"
When I use
soilDB::get_cosoilmoist_from_SDA(WHERE = where)
there is no data in the "status" column (and also the "month" column is empty). I've looked at this in a few different counties throughout WI and IL and it's always the case, so it's not an issue with inconsistent data collection.Before I knew this function was available in soilDB I made my own query and ran it through
SDA_query()
. Here is my query:q.cosoilmoist <- paste0("SELECT co.mukey, co.cokey, co.comppct_r, com.*, cosm.* FROM component AS co LEFT JOIN comonth AS com ON com.cokey = co.cokey LEFT JOIN cosoilmoist AS cosm ON cosm.comonthkey = com.comonthkey WHERE ", where, " ORDER BY co.mukey;")
After running
d.cosoilmoist <- soilDB::SDA_query(q.cosoilmoist)
, the month and soimoiststat values are in the table! But after runningd.cosoilmoist <- soilDB::uncode(d.cosoilmoist, droplevels = FALSE)
, the month and soimoiststat values are gone. So it looks like there is something going on inuncode()
.I looked at a couple other
get_(something)_from_SDA()
functions and they also seem to have some lost data.The text was updated successfully, but these errors were encountered: