We need to lay out some clear parameters for what we will be considering a "valid" JSON-LD object, and also possibly make some adjustments with respect to the Psych-DS specification as laid out in the Google Doc.
Here are some of the official requirements of JSON-LD, as found here:
A JSON-LD document MUST be able to express a linked data graph* (elaborated below) []
A JSON-LD document MUST be a valid JSON document [X]
All JSON constructs MUST have semantic meaning in a JSON-LD document: [X]
JSON arrays MUST NOT be interpreted as defining an object ordering. [X]
(There are other bullets in their list, but they are all SHOULDs and MAYs, whereas we are most interested in the MUSTs.)
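Of the MUSTs above, "valid JSON" is the one that is straightforward to check mechanically. A minimal sketch of that check, assuming a hypothetical helper name `check_valid_json` and a plain list of issue strings (neither comes from the Psych-DS codebase):

```python
import json


def check_valid_json(text):
    """Return (document, issues); document is None when the check fails."""
    try:
        document = json.loads(text)
    except json.JSONDecodeError as err:
        # lineno/colno/msg are standard attributes of JSONDecodeError.
        return None, [
            f"error: invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"
        ]
    # A JSON-LD document is an object or an array at the top level,
    # so a bare string or number is rejected here as well.
    if not isinstance(document, (dict, list)):
        return None, ["error: top-level value must be a JSON object or array"]
    return document, []
```

Everything beyond this point (linked data graph, semantic meaning, unordered arrays) needs either a JSON-LD processor or human judgment.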
Guidelines for a linked data graph: (in the same doc as above)
Subject, objects and edges all SHOULD be identified with IRIs
^^^Part of my issue with this combination of requirements is that they seem to bottom out at "being valid JSON", because:
Even though a JSON-LD document must be expressible as a linked data graph, none of the requirements for a linked data graph are mandatory; they are all SHOULDs.
The requirement that JSON constructs must have meaning refers to something intentional rather than technical. That is, from what I can tell, it's not saying that all JSON constructs must be linked to some informative IRI; it's saying that the user must not use values that don't mean anything.
The requirement that arrays must be interpreted as unordered is a matter of interpretation, not computer validation.
In the world of JSON-LD there is an abundance of SHOULDs and barely any MUSTs. What we have to decide is whether to codify our own set of MUSTs for Psych-DS-specific JSON-LD, or to just keep valid JSON format as the only MUST and implement the variety of SHOULDs as warnings.
For instance, we allow users to include non-schema.org keys (or rather, string keys that don't link to any IRI) within their metadata, which is allowed according to strict JSON-LD rules, but recommended against. Here are some questions:
Do we want to require that schema.org context MUST be included, and that the required terms of our spec such as "name" and "variableMeasured" MUST expand to their full schema.org IRIs?
Do we want to allow expanded, contextless JSON-LD documents as valid metadata files?
If we do choose to implement the full gamut of JSON-LD SHOULDs, are we prepared to present those recommendations to the user, at risk of overwhelming them?
Do we want to allow for namespaces other than schema.org in the context?
Do we want the validator to check that JSON-LD IRIs actually point to real web pages? [This has implications for our eventual Python version, for which offline functionality is a desideratum]
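To make the first two questions concrete, here is a rough sketch of what a "schema.org context is required" rule could look like, with an optional allowance for contextless documents that use full IRIs as keys. All function names are hypothetical, the check is purely string-based (no context is fetched, so it works offline), and it only looks at a single top-level object rather than true expanded form, which is an array:

```python
# Spellings of the schema.org context we would accept (an assumption).
SCHEMA_ORG = {"http://schema.org", "http://schema.org/",
              "https://schema.org", "https://schema.org/"}
# The two required terms named in this issue.
REQUIRED_TERMS = ("name", "variableMeasured")


def has_schema_org_context(document):
    """True if @context names schema.org directly or via an inline @vocab."""
    context = document.get("@context")
    entries = context if isinstance(context, list) else [context]
    for entry in entries:
        if isinstance(entry, str) and entry in SCHEMA_ORG:
            return True
        if isinstance(entry, dict):
            vocab = entry.get("@vocab")
            if isinstance(vocab, str) and vocab in SCHEMA_ORG:
                return True
    return False


def check_required_terms(document, allow_expanded=True):
    """Return an error string for each required term that is not linked."""
    issues = []
    with_context = has_schema_org_context(document)
    for term in REQUIRED_TERMS:
        compact_ok = with_context and term in document
        expanded_ok = allow_expanded and any(
            prefix + term in document
            for prefix in ("http://schema.org/", "https://schema.org/")
        )
        if not (compact_ok or expanded_ok):
            issues.append(
                f"error: required term '{term}' does not resolve to a schema.org IRI"
            )
    return issues
```

Setting `allow_expanded=False` corresponds to answering "no" to the second question; whether these should be errors or warnings is exactly what we still need to decide.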
There are other questions, but this set covers the gist of it. I'm including some misc. references below, such as Best Practices and the official "JSON-LD grammar":
(Interesting point from the above grammar: unlinked keys in the JSON-LD MUST be ignored when processed. We may want to remind users that adding unlinked keys to their metadata does not technically add to its richness, since it will be ignored during any official processing on the web)
Additional MUSTs that we can glean from the grammar:
A term MUST NOT equal any of the JSON-LD keywords, other than @type.
When used as the prefix in a Compact IRI, to avoid the potential ambiguity of a prefix being confused with an IRI scheme, terms SHOULD NOT come from the list of URI schemes as defined in [IANA-URI-SCHEMES]. Similarly, to avoid confusion between a Compact IRI and a term, terms SHOULD NOT include a colon (:) and SHOULD be restricted to the form of isegment-nz-nc as defined in [RFC3987].
To avoid forward-compatibility issues, a term SHOULD NOT start with an @ character followed exclusively by one or more ALPHA characters (see [RFC5234]) as future versions of JSON-LD may introduce additional keywords. Furthermore, the term MUST NOT be an empty string ("") as not all programming languages are able to handle empty JSON keys.
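The term rules quoted above translate fairly directly into code. The sketch below treats the MUST NOTs as errors and the SHOULD NOTs as warnings; it applies to terms defined in a context, not to arbitrary document keys. The function name `check_term` is hypothetical, the keyword set is my transcription of the JSON-LD 1.1 keyword list and should be checked against the spec, and the IANA URI scheme and `isegment-nz-nc` checks are left out:

```python
import re

# JSON-LD 1.1 keywords as I understand them; verify against the spec.
JSONLD_KEYWORDS = {
    "@base", "@container", "@context", "@direction", "@graph", "@id",
    "@import", "@included", "@index", "@json", "@language", "@list",
    "@nest", "@none", "@prefix", "@propagate", "@protected", "@reverse",
    "@set", "@type", "@value", "@version", "@vocab",
}


def check_term(term):
    """Return a list of (severity, message) pairs for one context term."""
    if term == "":
        return [("error", "a term must not be an empty string")]
    if term in JSONLD_KEYWORDS:
        if term == "@type":
            return []  # the one keyword a term may equal
        return [("error", f"term '{term}' is a JSON-LD keyword")]
    issues = []
    # '@' followed only by letters may collide with future keywords.
    if re.fullmatch(r"@[A-Za-z]+", term):
        issues.append(("warning", f"term '{term}' looks like a keyword"))
    # A colon makes the term easy to confuse with a compact IRI.
    if ":" in term:
        issues.append(("warning", f"term '{term}' contains a colon"))
    return issues
```

For example, `check_term("@id")` yields one error, `check_term("schema:name")` yields one warning, and `check_term("name")` yields nothing.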
After doing a deeper dive into the jsonld.js package, I can see that it does produce error messages that correspond directly to a lot of the MUSTs from the JSON-LD Grammar. These mostly seem to revolve around restricted usages for the various "@" keywords.
This is great, because it means we can offload a lot of this fine-grained syntactic validation of JSON-LD objects to the official package itself, funneling its error messages into our app's validation "issues" that get presented to the user. One nice thing about these error cases is that they only really arise when you begin to use some of JSON-LD's more complex features, so there's less worry about these checks being prohibitive to beginners.
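As a rough illustration of the "funneling" idea, here is a generic wrapper that runs a processor call and turns anything it raises into an issue record. The processor is passed in as a callable (`process`), standing in for whatever expand/compact call we end up using; the issue shape (`severity`, `code`, `message`) is hypothetical, and the lookup of a `code` attribute on the error is an assumption about the processor's error object rather than a documented guarantee:

```python
def collect_processor_issues(process, document):
    """Run process(document); convert any raised exception into an issue."""
    try:
        process(document)
    except Exception as err:
        return [{
            "severity": "error",
            # Prefer a processor-supplied code if one exists (assumption),
            # otherwise fall back to the exception class name.
            "code": getattr(err, "code", None) or type(err).__name__,
            "message": str(err),
        }]
    return []
```

The same wrapper shape should carry over to the JavaScript app with a try/catch around the jsonld.js call.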
There's another category of JSON-LD MUSTs that result in ignored content rather than an error message. For instance, in the JSON-LD playground, using a key that resolves to a string instead of an IRI results in that key being dropped. We have to decide whether such violations ought to be errors or warnings.
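Since the processor stays silent about dropped content, we would have to detect it ourselves. The sketch below covers only the simplest case: a top-level key that has no mapping in an inline `@context` object. The function name `find_unlinked_keys` is hypothetical; it does not fetch remote contexts (it returns `None` when it cannot tell), does not recurse into nested objects, and is no substitute for comparing the document against a real processor's expanded output:

```python
def find_unlinked_keys(document):
    """List top-level keys that an inline context gives no IRI mapping.

    Returns None when the context is not a single inline object, since
    the answer would then require resolving a remote or composite context.
    """
    context = document.get("@context")
    if not isinstance(context, dict):
        return None
    # With an @vocab, every plain key expands relative to that vocabulary.
    if isinstance(context.get("@vocab"), str):
        return []
    unlinked = []
    for key in document:
        if key.startswith("@"):
            continue  # keywords such as @context, @type, @id
        if context.get(key) is not None:
            continue  # term is defined (a null mapping counts as unlinked)
        if ":" in key:
            continue  # compact or absolute IRI used directly as the key
        unlinked.append(key)
    return unlinked
```

Whether each key reported here becomes an error or a warning is the open decision described above.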
Here are some "best practices" put forth by W3C:
Best Practice 1: Publish data using developer friendly JSON
Best Practice 2: Use a top-level object
Best Practice 3: Use native values
Best Practice 4: Assume arrays are unordered
Best Practice 5: Use well-known identifiers when describing data
Best Practice 6: Provide one or more types for JSON objects
Best Practice 7: Identify objects with a unique identifier
Best Practice 8: Things not strings
Best Practice 9: Nest referenced inline objects
Best Practice 10: When describing an inverse relationship, use a referenced property
Best Practice 11: External references SHOULD use typed term
Best Practice 12: Ordering of array elements
Best Practice 13: Provide a representation of the entity related by URL
Best Practice 14: Cache JSON-LD Contexts
JSON-LD Grammar
note:
[Screenshot 2023-11-29 at 1 29 26 PM]
This refers to the eventual deprecation of non-IRI keys in JSON-LD