update OSD parsing: effervescence, drop ordered factors #68

Merged
merged 17 commits into from
Oct 25, 2023



@brownag brownag commented Oct 25, 2023

Extension of #67 to cover non-standard effervescence classes, typos, etc.

This removes the creation of ordered factors from several parsed data elements.
The JSON output does not support factor encoding, and typos are extremely common in OSDs. In general, it can't be assumed that standard verbiage and choice lists are used. Returning the values as plain text will facilitate error checking and/or writing rules to autoconvert classes to the standard set.
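As an illustration of the kind of autoconversion rule mentioned above, here is a minimal sketch. It is written in Python rather than the project's own language, and it is not code from this PR. The function name `normalize_effervescence`, the fuzzy-match cutoff, and the use of `difflib` are all assumptions made for the example; the class list is the standard set of effervescence classes.

```python
import difflib

# Standard effervescence classes, ordered from least to most reactive.
STANDARD_CLASSES = [
    "noneffervescent",
    "very slightly effervescent",
    "slightly effervescent",
    "strongly effervescent",
    "violently effervescent",
]


def normalize_effervescence(raw, cutoff=0.8):
    """Map a plain-text value to a standard class, or None if nothing is close.

    Hypothetical helper: the cutoff is an arbitrary choice for illustration.
    """
    # Lower-case and collapse runs of whitespace before comparing.
    cleaned = " ".join(raw.lower().split())
    if cleaned in STANDARD_CLASSES:
        return cleaned
    # Fuzzy match to absorb simple typos; unmatched values are left for review.
    matches = difflib.get_close_matches(cleaned, STANDARD_CLASSES, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

Because the parsed values stay as plain text, anything this function returns `None` for can be listed for manual review instead of being silently coerced to a missing factor level.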

@brownag changed the title from "update effervescence, drop ordered factors" to "update OSD parsing: effervescence, drop ordered factors" Oct 25, 2023

brownag commented Oct 25, 2023

A further wrinkle with effervescence, similar to moist/dry/concentration colors, is that a single section of the narrative can report far more than one or even two classes, depending on which part of the matrix or which other features are being described.

brownag and others added 14 commits October 25, 2023 11:40
 - greediness: preferentially take first class (after comma or semicolon most likely to be matrix eff)
 - allow ranges of classes separated by "to" (like drainage class)
 - note there are a wide range of narrative comments still not yet handled e.g. "slightly effervescent but strongly effervescent in spots"
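The behavior described in these commit notes can be sketched roughly as follows. This is a Python illustration written for this summary, not the regular expression used in the PR; the pattern, the function name `parse_effervescence`, and the tuple return shape are assumptions.

```python
import re

# Qualifiers that precede "effervescent"; "very slightly" is listed first
# so it is not truncated to "slightly".
QUAL = r"(?:very\s+slightly|slightly|strongly|violently)"

# Either a qualified class (optionally a range joined by "to"), or a
# noneffervescent variant ("noneffervescent", "non-effervescent", "non effervescent").
PATTERN = re.compile(
    rf"\b(?:(?P<first>{QUAL})(?:\s+effervescent)?"
    rf"(?:\s+to\s+(?P<last>{QUAL}))?\s+effervescent"
    r"|(?P<none>non[-\s]?effervescent))",
    re.IGNORECASE,
)


def parse_effervescence(narrative):
    """Return (low, high) classes for the FIRST match only, or None."""
    m = PATTERN.search(narrative)  # search() stops at the first class found
    if m is None:
        return None
    if m.group("none"):
        return ("noneffervescent", "noneffervescent")
    first = " ".join(m.group("first").lower().split())
    last = m.group("last")
    last = " ".join(last.lower().split()) if last else first
    return (first + " effervescent", last + " effervescent")
```

Under these assumptions, "slightly to strongly effervescent" yields a two-class range, while "slightly effervescent but strongly effervescent in spots" yields only the first class, which is exactly the kind of narrative comment noted above as not yet handled.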

brownag commented Oct 25, 2023

OK. A lot more effervescence information has now been captured.

Merging. I will add some new functionality to the typical pedon reporting tools to tabulate results and identify particularly egregious issues. There are many cases with multiple classes and little qualification about which features were being tested, and even more cases with long-form descriptions of the pattern of effervescence. It might be possible to parse the entire phrase between semicolons, but the formatting is not 100% consistent, so this would likely lead to problems, including unintended inclusion of non-effervescence data.
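For reference, the "entire phrase between semicolons" idea combined with a simple tabulation could look something like the sketch below. It is a hypothetical Python illustration, not part of the reporting tools; the function names and the `"efferv"` stem test are assumptions, and it inherits the weakness described above: any inconsistently punctuated phrase that happens to mention effervescence is swept in whole.

```python
from collections import Counter


def effervescence_phrases(horizon_text):
    """Return the semicolon-delimited phrases that mention effervescence."""
    phrases = (p.strip() for p in horizon_text.split(";"))
    # Match on a short stem so that some misspellings are still caught.
    return [p for p in phrases if "efferv" in p.lower()]


def tabulate_phrases(horizon_texts):
    """Count each distinct (lower-cased) effervescence phrase across horizons."""
    counts = Counter()
    for text in horizon_texts:
        counts.update(p.lower() for p in effervescence_phrases(text))
    return counts
```

Sorting the resulting counts with `most_common()` puts the standard classes at the top and leaves the long tail of typos and long-form descriptions at the bottom, where the egregious cases are easiest to spot.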

@brownag brownag merged commit dc9c537 into main Oct 25, 2023