CONSTRUCT FRAMED #48
Comments
how is this different from sorting the where solution set by whatever variable(s) serve as subject(s)? |
Hi @RickMoynihan! What I miss in your proposal is a reference to JSONLD Frames. When you say "framed" do you mean it in the same sense, or some other sense? I believe you need to describe the shape of the result you want, as it's not obvious how far to traverse the graph and how to lay out the data. Suppose that s1 and s2 share a sub-object x (eg a nomenclature entry): should it be emitted in the frame of s1, or s2, or at top-level and only referenced from s1 and s2? |
If the client doesn't have an RDF library, then why not stick to |
@dydra You're right that it's very similar to doing that, but I think there's a big difference in practice. N-Triples as a format makes no promise that this sorting/grouping/framing has happened, so consumers can't rely on it; i.e. all N-Triples parsers are designed to give you data one triple at a time, not a subject at a time. If you know your triples because you also wrote the query, then you can certainly implement a grouping parser that will frame resources for your application. However, the only framed format I know of right now is JSON-LD frames, though one could imagine many others. The motivation for this proposal goes beyond just SPARQL to the rest of the RDF ecosystem, as I think there would be huge benefit in preserving the fact that an RDF graph is already framed, i.e. in JSON-LD it would be a serialised file with a I feel that for this sort of thing to be maximally useful it would need to be expressible in the query itself, and not just as an |
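To make the "grouping parser" idea concrete, here is a minimal sketch in Python. It assumes the N-Triples input is already ordered so that all triples for one subject are adjacent, which is exactly the promise the format does not make. The regular expression is a simplification that handles only the simplest lines and is not a full N-Triples parser; the names `parse_line` and `group_by_subject` and the example.org data are illustrative.

```python
import re
from itertools import groupby

# Handles only simple lines: an IRI or blank-node subject, an IRI predicate,
# and an object kept as raw text. This is NOT a complete N-Triples grammar.
TRIPLE = re.compile(r'^(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+(.+?)\s*\.\s*$')

def parse_line(line):
    """Split one simple N-Triples line into (subject, predicate, object)."""
    match = TRIPLE.match(line)
    if match is None:
        raise ValueError(f"unsupported line: {line!r}")
    return match.groups()

def group_by_subject(lines):
    """Yield (subject, [(predicate, object), ...]) for each run of equal subjects.

    Only correct when the input is already grouped by subject; an unsorted
    file would yield the same subject more than once.
    """
    triples = (parse_line(line) for line in lines if line.strip())
    for subject, run in groupby(triples, key=lambda t: t[0]):
        yield subject, [(p, o) for _, p, o in run]

# Hypothetical sample data, already grouped by subject.
data = """\
<http://example.org/s1> <http://example.org/p> "a" .
<http://example.org/s1> <http://example.org/q> <http://example.org/x> .
<http://example.org/s2> <http://example.org/p> "b" .
"""
resources = list(group_by_subject(data.splitlines()))
```

The consumer gets one resource at a time, but only because the producer happened to sort the output; nothing in the media type lets the consumer verify that.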
Yes I co-opted frame from JSONLD Frame; but also I guess it comes from the classical KR "Frame Language" view of the world; i.e. viewing objects as collections of properties, rather than sets of triples describing objects. I'm not claiming frame-oriented is better than triple oriented, just that it also has its uses, and that RDF currently completely ignores this view with the sole exception of JSON-LD framing.
Well I wasn't wanting to design a detailed proposal; that would be the job of the standards committee :-) I just want to express what I believe is an endemic usability problem in the RDF world, and highlight a relatively simple potential solution. However, I would imagine that framing one level deep would be enough at least 90% of the time to make people's lives easier, so the following:
Would return something like (pseudo jsonld):
One could imagine a much deeper integration that could map variables directly into JSON object templates. Indeed I've written a sparql-like query library that will map variables from BGPs into a JSON like tree template, though I'm not proposing anything quite that wild as it would be a much much bigger change. |
@cygri No, it says I want the results to be RDF and framed into objects. Is framed JSON-LD no longer RDF? |
@RickMoynihan I think what you're asking for is already standard part of JSONLD
If you read the JSONLD spec, I'm sure you can find a profile header or something to request one or another representation of the query result. Please share your findings here. |
Yes JSONLD can certainly frame things into json objects like I want. However I think the intent for this kind of representation needs to be expressed in the query; and not just as a content-negotiated header. Though currently I can ask an endpoint to return a Hence I believe this would need to be a SPARQL feature. |
JSON-LD frames require the client to specify a specific frame, which is like a program by example for structuring a flattened JSON-LD document, which is effectively a simple quads serialization. The concept might be extended to work in a query engine to construct the triples/quads to be framed, but that would require a bit more work. We don’t (currently) have any notion of automatic framing, such as performed by many Turtle serializers, and strictly specifying that might be challenging. |
I still have trouble understanding what problem we are trying to address here. According to the issue description: “Handling raw RDF triples can at times be awkward.” Then why not use There is also Jena's
|
@gkellogg Thanks for chipping in. I only have a limited knowledge of the intricacies of JSON-LD etc, which is why I didn't in the initial issue proposal try to specify how it worked or could use it. I'd assumed that the frame would not be provided by the user, but instead be an internal detail of the database implementation. However if a frame cannot express the kind of grouping a turtle processor might do it clearly rules out JSON-LD as a suitable technology for implementing this. @cygri: Thanks for pointing me at Jena's
Several reasons:
The use cases I had in mind are lightweight transformations where you want to pull some data out of the database and perform some simple processing without having to use a Model/MemoryStore. In these situations I still want the RDF types and their properties/subjects, and still want to have a representation that describes an RDF graph that could in principle be loaded back into the store. Anyway, the fact I'm arguing so much over this one means it's clearly controversial, or deemed not worth the effort. And of all the 1.2 features it's not the one I'd like to see most. I'd much rather have #31 #49 and I definitely find handling labels and priorities to be a pain #13 (but I can't think of a realistic implementation). |
I run a You mention that you'd like to "pass this burden onto the database". This would make sense if the database was better suited for framing than your client application, so that various performance savings (e.g., streaming response, lower bandwidth) can be achieved. What kinds of savings do you think can be made? I believe JSON-LD requires data in memory and the performance characteristics within a database and in a client application. Regarding usability improvements, apart from the "burden" passed onto the database you'd have to pass more information about how the framing should be done; i.e. communication overhead compromising any usability improvements. |
Well firstly the database often runs with more hardware resources than your app. Admittedly often you also want to take load off your database. As always there are tradeoffs. Secondly a database can often sort and group efficiently; it may also have mechanisms for terminating resource-hungry operations (timeouts, limits on memory usage for a single query, etc.), as well as for spilling to disk.
I would've hoped it would always just group on s, and then group on p within each resource, like a beautifying Turtle serializer might. Clearly we would not want to pass framing information in or out of band, unless we were to extend CONSTRUCT with much more complex templating features. |
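The "group on s, then on p" behaviour can be sketched in a few lines of Python. This is only an illustration of the grouping itself, not a proposal for the wire format: triples are plain tuples of strings, the `ex:` terms are made up, and the output is pseudo JSON-LD (values are raw strings rather than proper JSON-LD value objects). The function name `frame_one_level` is hypothetical.

```python
from collections import defaultdict

def frame_one_level(triples):
    """Group (s, p, o) tuples into one object per subject.

    Each object has an "@id" plus one key per predicate, holding the list
    of objects for that predicate. Subjects and predicates are sorted only
    to make the output deterministic.
    """
    frames = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        frames[s][p].append(o)
    return [
        {"@id": s, **{p: frames[s][p] for p in sorted(frames[s])}}
        for s in sorted(frames)
    ]

# Hypothetical, deliberately unsorted input.
triples = [
    ("ex:s2", "ex:p", "b"),
    ("ex:s1", "ex:p", "a"),
    ("ex:s1", "ex:q", "ex:x"),
    ("ex:s1", "ex:p", "c"),
]
framed = frame_one_level(triples)
```

Note that this version has to hold every triple in memory before it can emit the first resource, which is the cost a store with an existing sort or index could avoid.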
@RickMoynihan I think you missed part of my reply.
I think that's false, have you checked the jsonld spec? https://w3c.github.io/json-ld-syntax/#example-142-http-request-with-profile-requesting-a-compacted-document |
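For reference, the content-negotiation route from the linked example can be sketched as follows. The profile URI `http://www.w3.org/ns/json-ld#compacted` is the one used in the JSON-LD syntax document; the endpoint URL and query are placeholders, and whether any given SPARQL endpoint honours the profile parameter is an open assumption. The request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Profile URI for compacted documents, as used in the JSON-LD syntax document.
COMPACTED = "http://www.w3.org/ns/json-ld#compacted"

def build_construct_request(endpoint, query, profile=COMPACTED):
    """Build (but do not send) a SPARQL protocol GET request for a CONSTRUCT
    query that asks for JSON-LD with the given profile."""
    accept = f"application/ld+json;profile={profile}"
    url = f"{endpoint}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": accept})

# Hypothetical endpoint and query, for illustration only.
req = build_construct_request(
    "http://example.org/sparql",
    "CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o } LIMIT 10",
)
```

A server is free to ignore the profile and return any valid JSON-LD, so a client using this still has to cope with an ungrouped response.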
@RickMoynihan What effect should Another tangent: Did you look at all into querying RDF with GraphQL? Here's a survey by @rubensworks. I don't know what the state of the art is with regard to updating RDF via GraphQL, but in theory, once you have a mapping between the two models established, it should be possible to send a query, get back vanilla JSON, modify the JSON, and send it back where it leads to RDF updates. |
with respect to "... a standardised way to group/frame...", a significant phrase in that passage in the syntax document is
the json-ld 1.1 process deferred the issue, how to specify that in a request. |
Yes, when requesting a framed result, it’s up to the service to determine the framing. We haven’t specified a mechanism for a client to specify a frame, but would likely use a Link and/or Profile header. Also, it's outside our scope to specify a service behavior. |
@cygri Yes, I spent a lot of time with colleagues designing, building and experimenting with that approach. However it's a hard circle to square given GraphQL's limited type system, lack of namespaces, and that GraphQL schemas need to be generated prior to query time (or you break conformance with GraphQL as Stardog does by dropping schemas altogether -- which in my mind removes a lot of the point in doing it).
It's a good question, there would be iff consumers, libraries and APIs know by the query type to group resources. However I was imagining that a framed RDF format (or JSONLD), which admittedly might be beyond the scope of this CG, would be the most useful. |
Thanks. Yes that looks sufficient.
Ok, but we're talking about the SPARQL protocol here right? I'd be surprised if any endpoint honored that. Mandating that endpoints did honor that in SPARQL 1.2 might be sufficient for an implementation of this, though I can't shake the feeling that this needs to be represented in the query itself, as otherwise memory/in-process stores etc won't be able to honor it in a standard way. |
yes, but if an endpoint accepts the response media type, it is bound to follow its definition. |
@dydra except honoring the profile parameter is not required:
|
which is followed by
|
You earlier said the main use case for this is clients that don't include an RDF library. If you have an entire SPARQL engine in-process, then surely there is a straightforward way to access the CONSTRUCT result as, for example, a Jena model, and serialise it to, for example, compacted JSON-LD.
Oooh, that looks very cool. Thanks for the pointer. (At TopQuadrant we have SHACL models for all our data, and use annotations on the SHACL constructs to define the GraphQL schema. This works quite well for our use cases.) |
Yes, though I don't believe it's contradictory to expect a feature which I'd originally proposed might warrant a SPARQL 1.2 syntax, would also work with the same semantics, when you were operating in process. If only to honor the abstraction.
Yes, we've been thinking about adopting that same approach for a long time. However when we started SHACL had only just been finalised and IIRC there were very few implementations at that time which we could easily leverage. That project was largely just an experiment, though I'd like to rebuild in a much more robust manner. |
@dydra ... but this is a largely moot point as JSONLD isn't mandated as a response format for constructs by SPARQL 1.1 anyway. I suppose this issue could perhaps be fixed enough for my purposes if SPARQL 1.2 implementers were required to implement JSONLD, and they were required to group the objects such that all subjects and properties were normalised into the tree. Though looking some more at the specs for compacting/flattening documents I'm not convinced implementers are required to group like this... though the playground examples appear to. |
if this is a moot point, then i have lost track of what you intend this feature to accomplish. |
@dydra forgive me if I'm misunderstanding you, but I read your point as suggesting the feature request is unnecessary because the JSONLD spec says if you support JSONLD an implementer should support profiles, and that a profile such as compact/flattened would let a consumer ask for the results to be grouped/framed into resource objects as I'd like. I suggested this point on profiles is somewhat moot, because JSONLD isn't a required response format in SPARQL 1.1, therefore even if a JSONLD profile would solve this issue; framing responses in the manner I suggested is not standardised as I thought you were suggesting. I do however agree with you that a combination of JSONLD and a compact/flattened profile may be what I'm after. However though the JSON in the playground looks similar to what I'd like I've not had time to digest the standard documents to confirm that compaction groups jsonld resource maps together by subject/predicate id. |
if a combination of JSONLD and a compact/flattened profile does provide this capability, then would this be a matter of a protocol change rather than a language change? |
Note that JSON-LD compact and framed profiles might be treated equivalently by a service in the absence of any explicit context or frame. The JSON-LD CG may be the group to work on protocol mechanisms to specify the context or frame along with the request, but some group such as a successor to the Linked Data Platform might be best for creating normative requirements. The fact that JSON-LD is not a requirement for a CONSTRUCT representation is likely due to the fact that the SPARQL 1.1 spec predates JSON-LD 1.0. Personally, I’d like to see better integration of JSON-LD in SPARQL, perhaps with a frame-like representation within something like CONSTRUCT. (BTW, my attempt to join this CG is held up due to affiliation issues, which I should resolve in a week or so). |
That's good to know @gkellogg. Were you thinking something similar to Jena's [JSON template extension](https://jena.apache.org/documentation/query/generate-json-from-sparql.html#query-syntax) that Richard shared? I had thought such a thing would be useful too, and I've experimented with implementing a similar mechanism to map triples into arbitrary Clojure data structures, so I understand the desire for such a thing. It would be much more powerful than what I'm suggesting, though I felt it might be much too big a change for SPARQL 1.2, so I offered this as a partial solution. |
That's simple, they do Concise Bounded Description: collect all statements of a Subject, and embed blank nodes. Most DESCRIBE implementations do the same. Compare #39 which asks for more sophisticated ways to define a DESCRIBE response. |
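A simplified sketch of that "collect all statements of a subject, and embed blank nodes" step, in Python. It assumes triples are plain string tuples and that blank nodes are recognisable by a `_:` prefix, which is a convention of this example rather than of any library. It also omits the reification statements that the full Concise Bounded Description definition includes. The function name and the `ex:` data are illustrative.

```python
def concise_bounded_description(triples, subject):
    """Return the triples about `subject`, following blank-node objects.

    Simplified: starts from the subject, keeps every triple whose subject is
    the current node, and recurses only into objects that are blank nodes
    (strings starting with "_:" in this sketch). Reification is ignored.
    """
    result = []
    seen = set()
    pending = [subject]
    while pending:
        node = pending.pop()
        if node in seen:
            continue
        seen.add(node)
        for s, p, o in triples:
            if s == node:
                result.append((s, p, o))
                if o.startswith("_:"):
                    pending.append(o)
    return result

# Hypothetical data: ex:s1 has a blank-node address and links to ex:s2.
triples = [
    ("ex:s1", "ex:addr", "_:b0"),
    ("_:b0", "ex:city", "Leeds"),
    ("ex:s1", "ex:knows", "ex:s2"),
    ("ex:s2", "ex:name", "Bob"),
]
cbd = concise_bounded_description(triples, "ex:s1")
```

The description of `ex:s1` stops at the IRI `ex:s2` but pulls in the blank node's properties, which is what gives each resource a self-contained block in the output.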
@RickMoynihan wrote:
Yes, the preservation of RDF details has come up before. Similarly in GraphQL, sometimes getting the details without losing details would make the client's work easier and other times it's "just give me JSON". There is a converse issue as well: SPARQL JSON results as "plain JSON", not the RDF terms in all their details, cf. the CSV format but for JSON. Obviously not to require full JSON-LD processing, but some integration of JSON-LD would be good, which could be a "practice and experience" note, if the machinery already exists and what is needed is roll-out. |
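One way to picture the "plain JSON" converse is a client-side flattening of the standard SPARQL 1.1 JSON results format, sketched below in Python. The input layout (`head.vars`, `results.bindings`, and a `value` key per term) is the standard one; the decision to drop type, datatype and language information and to use `None` for unbound variables is an assumption of this sketch, and `plain_rows` is a hypothetical name.

```python
import json

def plain_rows(sparql_json):
    """Flatten SPARQL 1.1 JSON results into a list of {var: value} dicts.

    Term type, datatype and language tag are deliberately discarded, in the
    same spirit as the CSV results format. Unbound variables become None.
    """
    doc = json.loads(sparql_json)
    variables = doc["head"]["vars"]
    return [
        {v: row[v]["value"] if v in row else None for v in variables}
        for row in doc["results"]["bindings"]
    ]

# Hypothetical response body for illustration.
body = json.dumps({
    "head": {"vars": ["s", "label"]},
    "results": {"bindings": [
        {"s": {"type": "uri", "value": "http://example.org/s1"},
         "label": {"type": "literal", "xml:lang": "en", "value": "One"}},
        {"s": {"type": "uri", "value": "http://example.org/s2"}},
    ]},
})
rows = plain_rows(body)
```

The flattened rows are easier to consume but can no longer be turned back into RDF terms, which is the trade-off the comment above describes.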
Why?
I don't really understand the ideas raised in #39, but a perhaps smaller but seemingly related problem I've encountered many times is that handling raw RDF triples can at times be awkward. Often you want them framed into objects so you can process resource objects one at a time, and know you have all the requested properties for each object. Often you don't really want to have to process the entire stream of results to group all the triples yourself. If you have a library like RDF4J or Jena you can load your triples into a Model, or memory store; but you may not always have such tools available.
Whilst databases are often already under load, they are frequently better placed to consume results for framing than their clients.
Proposed solution
It would be nice to be able to pass this burden onto the database in some circumstances, i.e. a query of something like:
would return all matches framed into resource objects... e.g. a JSONLD result stream of results grouped into resource objects:
[{,,,}, {,,,}, {,,,}, {,,,}]
Such a proposal would require CONSTRUCT FRAMED queries to use a response format that can handle the framing, i.e. they would require a frame-oriented format (something like JSON(LD)/XML). Technically, fully beautified Turtle could also fulfil the requirement; however, Turtle is typically read in a triple-oriented manner, not a resource-oriented one, and the point is to guarantee consumers can process each subject one at a time.
Previous work
Considerations for backward compatibility
It requires an additive change to syntax.