Usage of core:Facet sub-subclasses creates RDFS-based confusion #445
Comments
This issue has nothing to do specifically with Facets. The issue here is simply to do with rdfs subclassing combined with SHACL property shapes and how they interact with subclassing hierarchies.
could just as easily be
The Facet classes simply apply to the There seem to be two potential issues raised here:
The first issue is very simple to address. While redundant property shape assertions are completely valid in SHACL they are not required and can simply be removed from the subclasses leaving only the shape at the top (ROOT) of the hierarchy. The large majority of users will be looking to the autogenerated documentation (rather than directly at the turtle) to understand UCO. This documentation clearly shows the relevant properties for each class including those "inherited" at each level of its position in a class hierarchy. So, with the property shape for addressValue asserted at the DigitalAddressFacet level it will still clearly be shown as a single occurrence when looking at the documentation for WiFiAddressFacet. The documentation also shows (right at the top of the page) where a particular class sits in a class hierarchy including not only superclasses but also subclasses which should make it straightforward for a user to find the most appropriate class for what they are trying to express. This documentation should mean that there is little potential for confusion on number 1 once redundant shapes are removed.
I do not believe this is true. As long as the differential constraints are more rather than less restrictive as you descend the class hierarchy then there should be no problems and this is exactly as SHACL is designed to work.
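A minimal Python sketch of the point above, assuming that shapes attached to a class also apply to instances of its subclasses, so constraints only accumulate as you descend the hierarchy. The class names come from this issue; the `SUBCLASS_OF` table, the `ADDRESS_VALUE_CHECKS` table, the MAC address pattern, and both helper functions are hypothetical stand-ins for illustration and are not taken from `observable.ttl`.

```python
import re

# Hypothetical, simplified stand-in for rdfs:subClassOf statements.
SUBCLASS_OF = {
    "WifiAddressFacet": "MACAddressFacet",
    "MACAddressFacet": "DigitalAddressFacet",
    "DigitalAddressFacet": "Facet",
}

# Assumed pattern for a colon-separated MAC address (illustration only).
MAC_PATTERN = r"[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}"

# Hypothetical checks on addressValue, one per class that declares a shape.
ADDRESS_VALUE_CHECKS = {
    # Root shape: any string is acceptable.
    "DigitalAddressFacet": lambda v: isinstance(v, str),
    # Differential shape: strictly narrower than the root shape.
    "MACAddressFacet": lambda v: isinstance(v, str)
    and re.fullmatch(MAC_PATTERN, v) is not None,
}


def class_chain(cls):
    """Return cls followed by each of its superclasses, leaf first."""
    chain = [cls]
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def conforms(facet_type, address_value):
    """Apply every check declared on facet_type or any of its superclasses."""
    return all(
        ADDRESS_VALUE_CHECKS[c](address_value)
        for c in class_chain(facet_type)
        if c in ADDRESS_VALUE_CHECKS
    )
```

Under these assumptions, an email-like string conforms when the facet is typed as `DigitalAddressFacet` but not when it is typed as `WifiAddressFacet`, because the leaf picks up the narrower `MACAddressFacet` check as well as the root one.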
{
"@id": "kb:wifi-address-3",
"@type": "observable:WifiAddress",
"core:hasFacet": {
"@type": "observable:DigitalAddressFacet",
"observable:addressValue": "[email protected]"
}
}
This decoupling is fundamentally necessary to support Duck typing. In Duck typing you may start with an ObservableObject object with one or more observable facet subclasses on it to characterize the details you know, even if you are unsure yet what object(s) you are describing. You could then analyze what facets and properties are on the generic object, reach some conclusion as to what type of object or objects there are, and then define these specific object types (e.g., MACAddress) and attach the relevant facet(s) to them. This is the entire concept of Duck typing. The above JSON example would be perfectly legal if the property shape for addressValue at the DigitalAddressFacet provided no further constraint than xsd:string. And this is as it should be, as it is the basis for RDFS subclassing and applying SHACL to it. The use of an email address string in a DigitalAddressFacet on a WiFiAddress ObservableObject would be a perfect example of a user drawing poor conclusions from a Duck typing situation, where they looked at the email address string in the DigitalAddressFacet and thought "yeah, that looks like a WiFiAddress." They could have easily made better conclusions.
{
"@id": "kb:trip-1",
"@type": "Trip",
"methodOfTransport": {
"@type": "Vehicle",
"engineCylinderCount": 4
}
}
when it would be more appropriate to express as:
{
"@id": "kb:trip-1",
"@type": "Trip",
"methodOfTransport": {
"@type": "Sedan",
"engineCylinderCount": 4
}
}
If
{
"@id": "kb:trip-1",
"@type": "Trip",
"methodOfTransport": [
{
"@type": "Vehicle",
"engineCylinderCount": 4
},
{
"@type": "Automobile",
"engineCylinderCount": 4
},
{
"@type": "Sedan",
"engineCylinderCount": 4
}
]
}
This is simply a matter of users understanding subclasses and the fact that they should use the one most appropriate to their needs. I do not think we can do much about number 2 above. Understanding how subclassing works is a conceptual issue at a far broader scope than specifying UCO.
Net-Net: The core issues in play are that
@sbarnum I believe you and I are at an impasse. I am not asserting UCO users would not understand subclassing. I am asserting that Facets are incompatible with subclassing for
Once again, the "parallel" nature of UcoObjects such as observable:File and a facet of properties such as observable:FileFacet is only relevant to Duck typing and not to facets in general. And in reality the vast majority of observable facets to support Duck typing do NOT involve any "parallel" hierarchies. ContactFacet, DefinedEffectFacet and DigitalAddressFacet are the only such hierarchies. Duck typing utilizes the facet pattern but facets are useful and necessary beyond just Duck typing.
In response to discussion from Tuesday's meeting, this Issue is being re-scoped. Several of the See for example: https://ontology.unifiedcyberontology.org/uco/documentation/class-observablewifiaddressfacet.html
Requirement 1, as written, is being stricken. Requirement 2 is being put in as a replacement, accomplishing some of the same goals:
Implementation happened to need to incorporate the resolution for PR 417 due to dropping constraints for one of the redundancies, so the merge of the resolution of this Issue has to wait for 417's merge (and also vote) to happen first. Another light side-effect is I came across some A less-light side-effect that came out of removing the redundancies is that there are now several new empty
These property shapes were identified for removal by visual review of the results of this query:

```sparql
SELECT DISTINCT ?nClass
WHERE {
  {
    ?nClass rdfs:subClassOf/rdfs:subClassOf+ core:Facet .
  }
  UNION
  {
    ?nClass rdfs:subClassOf core:Facet .
    ?nSubClass rdfs:subClassOf ?nClass .
  }
}
ORDER BY ?nClass
```

The same shapes were identified for removal by visual review of the results of this query:

```sparql
SELECT DISTINCT ?nClass ?nSubClass ?nProperty
WHERE {
  ?nClass
    rdfs:subClassOf* core:Facet ;
    sh:property/sh:path ?nProperty ;
    .
  ?nSubClass
    rdfs:subClassOf+ ?nClass ;
    sh:property/sh:path ?nProperty ;
    .
}
ORDER BY ?nClass ?nSubClass ?nProperty
```

References:
* #445

Signed-off-by: Alex Nelson <[email protected]>
These property shapes were identified by visual review of the results of this query:

```sparql
SELECT DISTINCT ?nClass ?nSubClass ?nProperty
WHERE {
  ?nClass
    sh:property/sh:path ?nProperty ;
    .
  ?nSubClass
    rdfs:subClassOf+ ?nClass ;
    sh:property/sh:path ?nProperty ;
    .
}
ORDER BY ?nClass
```

(The difference since the last patch is `?nClass` is no longer tied to `Facet`s.)

This patch alone will trigger a CI failure from the SHIR code base, due to SHIR 0.2.0 flagging dropped constraints as errors. A follow-on patch will merge in the resolution to PR 417 in order to resolve this bug without needing to work through addressing semi-open vocabulary issues when subclasses are involved.

Incidentally, Issue 442 is now mooted.

References:
* #417
* #442
* #445

Signed-off-by: Alex Nelson <[email protected]>
References:
* #445

Signed-off-by: Alex Nelson <[email protected]>
I would suggest that we strive to not add any NEW empty facets without a clear rationale to do so but that we do NOT remove any existing empty facets.
So, for the
Yes
No effects were observed on Make-managed files.

References:
* ucoProject/UCO#445

Signed-off-by: Alex Nelson <[email protected]>

No effects were observed on Make-managed files.

References:
* ucoProject/UCO#445

Signed-off-by: Alex Nelson <[email protected]>

A follow-on patch will regenerate Make-managed files.

References:
* ucoProject/UCO#445

Signed-off-by: Alex Nelson <[email protected]>

References:
* ucoProject/UCO#445

Signed-off-by: Alex Nelson <[email protected]>
Note from OC Chair: The risk of this proposal is between low and moderate, so normally would not merit a Fast-Track review. However, the usage pattern being addressed may be more harmful than good to leave in 1.0.0. If OC members object to fast-tracking this, it will be staged as a 2.0.0 proposal.
Background
There are a few `core:Facet` subclasses that have a parent class between the class and `core:Facet`. One example is `observable:WifiAddressFacet`, an eventual subclass of `observable:DigitalAddressFacet`. In that instance, and if I recall correctly some others as well, property shapes are defined redundantly on `DigitalAddressFacet`, `MACAddressFacet`, and `WifiAddressFacet`. Their total definitions in today's `develop` branch `observable.ttl` are:
The property shape for `observable:addressValue` is repeated across all three of those subclass layers. This creates a point of user confusion, potential incompatibility under RDFS expansion, and further potential incompatibility under OWL expansion.
Take this snippet of data:
If a user performed RDFS expansion, the following classes would be added by merit of `rdfs:subClassOf` statements (spelled here in leaf-to-`owl:Thing` class order):
If a user is unaware of RDFS expansion, they might feel obligated to create all of the intermediary `Facet`s, and redundantly store the data:
This would be a nuisance data behavior for UCO to require (causing much data bloat). It is also unnecessary because of something not yet encoded in UCO, but intended according to several conversations over the years, that there should only be one instance of class `XFacet` attached to each instance of class `X`. This can be requested in a way with a `owl:qualifiedCardinality 1` on `core:hasFacet` on each `UcoObject` subclass, but it would be a supreme nuisance to users to enforce in SHACL.
Under RDFS expansion, we end up with three independent instances of `observable:DigitalAddressFacet` attached to `kb:wifi-address-2`. This situation gets worse if a user doesn't realize the RDFS expansion, and further thinks they can use the extra `addressValue` slots in "Independent" parent `Facet` classes to stash other data, like a second MAC Address on a two-NIC server:
The `Facet` subclasses might be helpful for visualizing subclass hierarchies. However, the redundant storage of properties outside of the "Leaf" `Facet` subclasses ends up, through the above illustrations, being harmful to users.
These significantly confusing patterns are one of the motivations for doing away with `Facet`s altogether. However, in the event that Issue 438 does not pass, this proposal is offered for consideration.
Requirements
For reasons explained in the Risks section, there is no technical solution possible while `Facet`s exist as a parallel hierarchical class structure to the `UcoObject` class hierarchy.
Requirement 1
Requirement 1 has been stricken. It was previously:
Requirement 2
UCO must not wholly repeat property sh:PropertyShapes between subclasses - that is, define a completely matching set of SHACL constraints in a class C and subclass D.
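One way to picture Requirement 2 is as a check that flags any subclass whose property shape for a path matches an ancestor's shape for that path exactly, while leaving differential (narrower) shapes alone. The Python sketch below assumes a toy in-memory model: the `SUBCLASS_OF` and `PROPERTY_SHAPES` tables, their constraint values, and the function names are all hypothetical and do not reproduce the contents of `observable.ttl`.

```python
# Hypothetical class hierarchy: subclass -> direct superclass.
SUBCLASS_OF = {
    "WifiAddressFacet": "MACAddressFacet",
    "MACAddressFacet": "DigitalAddressFacet",
    "DigitalAddressFacet": "Facet",
}

# Hypothetical property shapes: class -> {property path -> constraints}.
PROPERTY_SHAPES = {
    "DigitalAddressFacet": {
        "addressValue": {"datatype": "xsd:string", "maxCount": 1},
    },
    # Same constraints as the parent: a wholly repeated shape.
    "MACAddressFacet": {
        "addressValue": {"datatype": "xsd:string", "maxCount": 1},
    },
    # Adds a constraint, so it is differential rather than a repeat.
    "WifiAddressFacet": {
        "addressValue": {
            "datatype": "xsd:string",
            "maxCount": 1,
            "pattern": "^[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}$",
        },
    },
}


def superclasses(cls):
    """Return every transitive superclass of cls, nearest first."""
    result = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result


def wholly_repeated_shapes():
    """List (class C, subclass D, path) where D's shape exactly equals C's."""
    findings = []
    for sub, shapes in PROPERTY_SHAPES.items():
        for sup in superclasses(sub):
            for path, constraints in shapes.items():
                if PROPERTY_SHAPES.get(sup, {}).get(path) == constraints:
                    findings.append((sup, sub, path))
    return sorted(findings)
```

With this toy data, only the `DigitalAddressFacet` / `MACAddressFacet` pair is reported; the `WifiAddressFacet` shape adds a constraint and so would not count as a whole repeat under this reading of the requirement.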
Risk / Benefit analysis
Benefits
`Facet` subclass hierarchy, or even use the `Facet` subclass hierarchy if there is no need.
`Facet`s, users that follow the guidance would be less likely to run into RDFS expansion issues that violate the not-yet-encoded design intent of "An instance of `UcoObject` class `X` should have exactly one `XFacet`."
Risks
The risks pertain to continued usage of `Facet`s, and not to any implementation changes due to this proposal.
The risks stem from not being able to recommend de-duplication of properties on `Facet`s in UCO's current implementation.
Putting all of the property shapes into just one "End" of the `Facet` hierarchy has potential for confusion, whether the end is the leaf subclasses, or the "root" subclass.
If the properties are in the "Leaf" end, it is difficult to encode in SHACL how to say what each leaf must have in terms of property shapes without inducing the usage confusion above. ("When do I create a `WifiAddressFacet` AND a `DigitalAddressFacet` on one node?" will arise if users miss language in the class documentation comments.) This would be a breakage with most benefits of class hierarchies.
If the properties are in the "Root" end, different problems may arise from UCO inability to enforce constraints at multiple different levels of the `Facet` subclass hierarchy. Suppose `addressValue` was given a property shape in `DigitalAddressFacet`, and that shape was not copied to the leaf nodes. Then suppose `MACAddressFacet` defines an independent, supplementary property shape that constrains `addressValue` to follow a certain regex (hex-byte, colon, hex-byte, colon, and so on). A `WifiAddress` (not `Facet`) instance could be defined like this and pass SHACL validation:
This would pass with an email address being the value "of" a `WifiAddress` because the subclass hierarchy of `DigitalAddress`--`MACAddress`--`WifiAddress` is completely decoupled from `DigitalAddressFacet`--`MACAddressFacet`--`WifiAddressFacet`.
Last, it is not clear how to reconcile these issues with an object that is rightly one of the `UcoObject` classes corresponding to one of the intermediary `Facet` subclasses. (And likewise for any extension subclass anywhere along either class hierarchy.) If the corresponding `Facet` has no property shapes, the user has no guidance or validation available for metadata meant to be attached to the object. This is also a reason that the intermediary subclass hierarchy of `Facet`s probably can't just be deleted outright.
The proposer believes this is a design flaw with `Facet`s, and encourages further consideration of Issue 438. None of this proposal would be necessary if properties were directly on `UcoObject` subclasses.
Competencies demonstrated
The potential errors we wish to prevent are demonstrated in the background and risks.
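The RDFS expansion problem from the Background can be sketched in a few lines of Python. This is an illustrative model only: the `SUBCLASS_OF` table is a simplified, hypothetical stand-in for the `rdfs:subClassOf` statements in `observable.ttl` (it stops at `core:Facet` rather than continuing to `owl:Thing`), and `rdfs_expand` and `count_instances` are names invented for this example.

```python
# Hypothetical, simplified rdfs:subClassOf table (stops at core:Facet).
SUBCLASS_OF = {
    "observable:WifiAddressFacet": "observable:MACAddressFacet",
    "observable:MACAddressFacet": "observable:DigitalAddressFacet",
    "observable:DigitalAddressFacet": "core:Facet",
}


def rdfs_expand(cls):
    """Return cls plus every class it is a transitive subclass of."""
    types = [cls]
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        types.append(cls)
    return types


def count_instances(facet_types, target):
    """Count facets that are instances of target after RDFS expansion."""
    return sum(1 for t in facet_types if target in rdfs_expand(t))


# A user who redundantly attaches every intermediary Facet to one object.
redundant_facets = [
    "observable:WifiAddressFacet",
    "observable:MACAddressFacet",
    "observable:DigitalAddressFacet",
]

# A user who attaches only the most specific ("Leaf") Facet.
leaf_only_facets = ["observable:WifiAddressFacet"]
```

In this model, `count_instances(redundant_facets, "observable:DigitalAddressFacet")` counts three facets, which is the situation that conflicts with the "exactly one `XFacet`" design intent, while the leaf-only attachment counts one.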
Solution suggestion
`Facet` subclasses. Append this string statement to their documentation strings, as a standalone line: "The most specific subclass of Facet should be used for the corresponding UcoObject. No parent class of this Facet should also be instantiated on this object."
The multi-level `Facet` subclasses can be identified with this query:
From UCO's current `develop` state (`ba9da19`), these classes would need that annotation:
Coordination
`develop`