TG2-VALIDATION_COUNTRYSTATEPROVINCE_CONSISTENT #200

Closed
Tasilee opened this issue Aug 28, 2022 · 19 comments
Labels
Consistency; Parameterized (Test requires a parameter); SPACE; Supplementary (Tests supplementary to the core test suite. These are tests that the team regarded as not CORE.); Test (Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT); TG2; Validation; VOCABULARY

Comments

@Tasilee
Collaborator

Tasilee commented Aug 28, 2022

| TestField | Value |
| --- | --- |
| GUID | e654f562-44f8-43fd-983b-2aaba4c6dda9 |
| Label | VALIDATION_COUNTRYSTATEPROVINCE_CONSISTENT |
| Description | Are the combination of the values of dwc:country, dwc:stateProvince consistent with the values in the bdq:sourceAuthority? |
| TestType | Validation |
| Darwin Core Class | Location |
| Information Elements ActedUpon | dwc:country<br>dwc:stateProvince |
| Information Elements Consulted | |
| Expected Response | EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if the terms dwc:country or dwc:stateProvince are bdq:Empty; COMPLIANT if the value of dwc:stateProvince occurs as an administrative entity that is a child to the administrative entity matching the value of dwc:country in the bdq:sourceAuthority, and the match to dwc:country is an ISO 3166 country-like administrative entity in the bdq:sourceAuthority; otherwise NOT_COMPLIANT |
| Data Quality Dimension | Consistency |
| Term-Actions | COUNTRYSTATEPROVINCE_CONSISTENT |
| Parameter(s) | bdq:sourceAuthority |
| Source Authority | bdq:sourceAuthority default = "The Getty Thesaurus of Geographic Names (TGN)" {[https://www.getty.edu/research/tools/vocabularies/tgn/index.html]} |
| Specification Last Updated | 2024-09-18 |
| Examples | [dwc:country="Australia", dwc:stateProvince="WA": Response.status=RUN_HAS_RESULT, Response.result=COMPLIANT, Response.comment="dwc:country is consistent with dwc:stateProvince. dwc:stateProvince matches Western Australia in dwc:country Australia"]<br>[dwc:country="Australia", dwc:stateProvince="Florida": Response.status=RUN_HAS_RESULT, Response.result=NOT_COMPLIANT, Response.comment="dwc:country is not consistent with dwc:stateProvince"] |
| Source | VertNet, Kurator |
| References | |
| Example Implementations (Mechanisms) | Kurator |
| Link to Specification Source Code | https://github.com/kurator-org/kurator-validation/blob/master/packages/kurator_dwca/workflows/dwca_geography_assessor.yaml |
| Notes | See table #95 (comment). A fail condition may arise from the content being internally inconsistent (not all of the information can be true at the same time), or from the vocabulary being incapable of resolving the combination of term values. This test should match even when there is leading or trailing whitespace or there are leading or trailing non-printing characters. @tucotuco: "Of #200 and #201, #201 is the strongest test. If it passes for a record, #200 must necessarily also pass and doesn't tell you anything. If #201 fails, #200 could still pass and that would tell you that there are multiple matches on the country/stateProvince combo: It would tell you the nature of the problem. Along with #42 (Country not empty), #200 would tell you whether there was an ambiguous combination of country (not empty) and stateProvince, such as would happen with Argentina/Buenos Aires. While if country is empty, then the ambiguity is purely at the stateProvince level". |
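As a rough illustration of the Expected Response above, here is a minimal Python sketch. It is not the Kurator or geo_ref_qc implementation: `TOY_AUTHORITY`, `Response`, `is_empty` and `countrystateprovince_consistent` are hypothetical names, and the small in-memory dictionary only stands in for a real bdq:sourceAuthority such as Getty TGN. Because the dictionary keys are all countries, the "country-like entity" condition holds by construction here.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for bdq:sourceAuthority (NOT Getty TGN data):
# country name -> accepted names of its first-level administrative divisions.
TOY_AUTHORITY = {
    "Australia": {"Western Australia", "WA", "Queensland"},
    "United States": {"Florida", "Washington", "WA"},
}


@dataclass
class Response:
    status: str
    result: Optional[str] = None


def is_empty(value):
    """Treat None and whitespace-only strings as empty (a simplification)."""
    return value is None or value.strip() == ""


def countrystateprovince_consistent(country, state_province, authority=TOY_AUTHORITY):
    # Source authority unavailable.
    if authority is None:
        return Response("EXTERNAL_PREREQUISITES_NOT_MET")
    # Either term empty (the "or" wording of the current specification).
    if is_empty(country) or is_empty(state_province):
        return Response("INTERNAL_PREREQUISITES_NOT_MET")
    # stateProvince must be a child of the entity matching country.
    children = authority.get(country.strip())
    if children is not None and state_province.strip() in children:
        return Response("RUN_HAS_RESULT", "COMPLIANT")
    return Response("RUN_HAS_RESULT", "NOT_COMPLIANT")
```

With this toy data, the two examples in the table behave as specified: Australia/WA gives COMPLIANT and Australia/Florida gives NOT_COMPLIANT.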
@Tasilee Tasilee added TG2 Validation SPACE Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT VOCABULARY Consistency Parameterized Test requires a parameter labels Aug 28, 2022
@ArthurChapman
Collaborator

Suggest description be
Are the combination of the values of the terms dwc:country, dwc:stateProvince consistent with values in the bdq:sourceAuthority?

@ArthurChapman
Collaborator

Suggested Expected Response be altered to (changed in italics)

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if *either of* the terms dwc:country and dwc:stateProvince are EMPTY; COMPLIANT if the combination of values of dwc:country and dwc:stateProvince are consistent with the bdq:sourceAuthority; otherwise NOT_COMPLIANT

@Tasilee
Collaborator Author

Tasilee commented Aug 29, 2022

Thanks @ArthurChapman - agreed and changed.

@chicoreus
Collaborator

Suggest we conform more closely to the >1 logic in @tucotuco's table in #95

Perhaps change from:
EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if either of the terms dwc:country and dwc:stateProvince are EMPTY; COMPLIANT if the combination of values of dwc:country and dwc:stateProvince are consistent with the bdq:sourceAuthority; otherwise NOT_COMPLIANT

(1) change "either of" to "both", to allow checking when just one value is present (e.g. WA, row 10 in #95 (comment))

(2) either in the specification or in the notes indicate that "consistent with" means matched at least once.

To:

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if both of the terms dwc:country and dwc:stateProvince are EMPTY; COMPLIANT if the combination of values of dwc:country and dwc:stateProvince are found in combination at least once in the bdq:sourceAuthority; otherwise NOT_COMPLIANT

@chicoreus
Collaborator

@Tasilee's comment on #201 (#201 (comment)) applies here.

Amending the specification to parallel #201: when only a single term is present the test can run. This is a case that is likely to lead to #200 returning COMPLIANT and #201 returning NOT_COMPLIANT:

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if the terms dwc:country and dwc:stateProvince are EMPTY; COMPLIANT if the combination of values of dwc:country and dwc:stateProvince are consistent with the bdq:sourceAuthority; otherwise NOT_COMPLIANT

chicoreus added a commit that referenced this issue Aug 29, 2022
…w copy of the test specifications as of 2022-08-29 including the new tests #199, #200, and #201.
@tucotuco
Member

tucotuco commented Sep 4, 2022

I like the solution @chicoreus has proposed above. Maybe the expected response should parallel what I proposed for STATEPROVINCE_FOUND. Something like this:

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if the terms dwc:country and dwc:stateProvince are EMPTY; COMPLIANT if the value of dwc:stateProvince occurs as an administrative entity that is a child to the entity matching the value of dwc:country in the bdq:sourceAuthority; otherwise NOT_COMPLIANT

@chicoreus
Collaborator

Parallel with #199 is good. We can probably tighten it up (e.g. prevent COMPLIANT when a county is mapped onto state/province and a state/province onto country):

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if the terms dwc:country and dwc:stateProvince are EMPTY; COMPLIANT if the value of dwc:stateProvince occurs as an administrative entity that is a child to the entity matching the value of dwc:country in the bdq:sourceAuthority, and the match to dwc:country is an ISO country-like entity in the bdq:sourceAuthority; otherwise NOT_COMPLIANT
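
To make the tightening concrete, here is a small hypothetical sketch of why the parent/child check alone is not enough. The `TOY_PLACES` records and the function names are assumptions for illustration only and do not reflect how Getty TGN models or types its entities.

```python
# Hypothetical authority records: name -> (place type, parent name).
TOY_PLACES = {
    "United States": ("country", None),
    "Florida": ("state", "United States"),
    "Orange County": ("county", "Florida"),
}


def is_child(parent, child, places=TOY_PLACES):
    """True if `child` is recorded as a direct child of `parent`."""
    record = places.get(child)
    return record is not None and record[1] == parent


def is_consistent(country, state_province, places=TOY_PLACES):
    """Parent/child check plus the requirement that the parent is a country."""
    parent = places.get(country)
    return (
        parent is not None
        and parent[0] == "country"
        and is_child(country, state_province, places)
    )
```

In this toy data a record with dwc:country="Florida" and dwc:stateProvince="Orange County" satisfies `is_child` but not `is_consistent`, which is the shifted-level case the added clause is meant to reject.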

@Tasilee
Collaborator Author

Tasilee commented Sep 4, 2022

Updated Expected Response.

@Tasilee
Collaborator Author

Tasilee commented Sep 12, 2022

Added to Notes: "This test will fail if there are leading or trailing white space or non-printing characters."

@Tasilee
Collaborator Author

Tasilee commented Dec 11, 2022

Changed Expected response from "and" to "or"

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if the terms dwc:country and dwc:stateProvince are EMPTY; COMPLIANT if the value of dwc:stateProvince occurs as an administrative entity that is a child to the entity matching the value of dwc:country in the bdq:sourceAuthority, and the match to dwc:country is an ISO country-like entity in the bdq:sourceAuthority; otherwise NOT_COMPLIANT

to

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if the terms dwc:country or dwc:stateProvince are EMPTY; COMPLIANT if the value of dwc:stateProvince occurs as an administrative entity that is a child to the entity matching the value of dwc:country in the bdq:sourceAuthority, and the match to dwc:country is an ISO country-like entity in the bdq:sourceAuthority; otherwise NOT_COMPLIANT
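
The practical effect of the "and" to "or" change can be sketched as follows. The helper names are hypothetical, and `is_empty` is a simplified reading of EMPTY (None or whitespace only).

```python
def is_empty(value):
    """Simplified reading of EMPTY: None or a whitespace-only string."""
    return value is None or value.strip() == ""


def prerequisites_not_met_and(country, state_province):
    """Earlier wording: not met only when both terms are empty."""
    return is_empty(country) and is_empty(state_province)


def prerequisites_not_met_or(country, state_province):
    """Current wording: not met when either term is empty."""
    return is_empty(country) or is_empty(state_province)
```

A record with only dwc:stateProvince="WA" would have been evaluated under the "and" wording, but returns INTERNAL_PREREQUISITES_NOT_MET under the "or" wording.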

@ArthurChapman
Collaborator

In the Notes, the reference "See table #95 (comment)" will need to be updated, but I am not sure how we can reference the comment.

#95 can be changed to "VALIDATION_GEOGRAPHY_CONSISTENT (78640f09-8353-411a-800e-9b6d498fb1c9)" but the comment and table won't appear there without us putting it somewhere we can reference it.

@Tasilee
Collaborator Author

Tasilee commented Jul 11, 2023

Post Zoom 11/7/2023, I have aligned the Source Authority with the suggested syntax:

bdq:sourceAuthority default = "The Getty Thesaurus of Geographic Names (TGN)" [https://www.getty.edu/research/tools/vocabularies/tgn/index.html]

to

bdq:sourceAuthority default = "The Getty Thesaurus of Geographic Names (TGN)" {[https://www.getty.edu/research/tools/vocabularies/tgn/index.html]}

@Tasilee
Collaborator Author

Tasilee commented Sep 7, 2023

This test should have Data Quality Dimension "Consistency" rather than "Conformance". Edited.

@Tasilee
Collaborator Author

Tasilee commented Sep 18, 2023

Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted".

Also changed "Field" to "TestField", "Output Type" to "TestType" and updated "Specification Last Updated"

@chicoreus chicoreus added the CORE TG2 CORE tests label Sep 18, 2023
@chicoreus
Collaborator

Edited "fail" text to indicate that this complex comparison should not be affected by leading or trailing whitespace.
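
One possible way to make a comparison insensitive to leading or trailing whitespace and non-printing characters is sketched below. `trim_edges` is a hypothetical helper, and treating Unicode categories Cc and Cf as "non-printing" is an assumption, not something the specification defines.

```python
import unicodedata


def _ignorable(ch):
    # Whitespace, control characters (Cc) and format characters (Cf).
    return ch.isspace() or unicodedata.category(ch) in {"Cc", "Cf"}


def trim_edges(value):
    """Strip ignorable characters from both ends, leaving the interior intact."""
    start, end = 0, len(value)
    while start < end and _ignorable(value[start]):
        start += 1
    while end > start and _ignorable(value[end - 1]):
        end -= 1
    return value[start:end]
```

Applying such a helper to both dwc:country and dwc:stateProvince before the lookup would let a value like " Australia\t" match "Australia" in the source authority.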

@Tasilee
Collaborator Author

Tasilee commented Aug 12, 2024

Updated Notes from @tucotuco's Comment #21 (comment) which I thought was needed here.

chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Aug 15, 2024
…STATEPROVINCE_CONSISTENT, tweaks to GettyLookup need cleanup and caching, implementation runs but is ugly. Passes some, but not all validation tests, and should be able to pass more.
chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Aug 16, 2024
…ation of tdwg/bdq#200 with some support for caching of responses from Getty TGN.  Adding a minimal implementation of tdwg/bdq#32 with backing method to interpret a few common forms of verbatim latitudes and longitudes.
@ArthurChapman
Collaborator

Updated the Expected Response to add "administrative" entities

@ArthurChapman ArthurChapman added Supplementary Tests supplementary to the core test suite. These are tests that the team regarded as not CORE. and removed CORE TG2 CORE tests labels Aug 19, 2024
@chicoreus
Collaborator

Discussion in the TG2 working group meeting in Seattle: this test lacks sufficient distinct power from other tests to include as core. The parent/child relationship of state/province and country is covered in #201, and we have validations for whether state/province and country exist, effectively covering the and/or difference in the INTERNAL_PREREQUISITES_NOT_MET clauses of #200 and #201. The implementations of the differently worded COMPLIANT clauses in #200 and #201 ended up being identical, so we are placing this test as Supplementary and retaining #201 as CORE.

@Tasilee
Collaborator Author

Tasilee commented Sep 18, 2024

Added 3166 qualifier to the ISO ref in the Expected Response and added two ISO 3166 references

chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Feb 21, 2025
…OVINCE_CONSISTENT testing some more country-stateProvince relationships with failover lookup of country from stateProvince added.