SPARQL-friendly lists #46

dbooth-boston · 2018-12-07T03:44:41Z

It is very hard[7] to query RDF
lists, using standard SPARQL, while returning item ordering.
This inability to conveniently handle such a basic data
construct seems brain-dead to developers who have grown to
take lists for granted.

"On my wish list are . . . generic structures like nested lists as first class citizens"
https://lists.w3.org/Archives/Public/semantic-web/2018Nov/0170.html

IDEA: Jena's list:index property

Apache Jena offers one potential (though non-standard)
way to ease this pain, by defining a list:index property:
https://jena.apache.org/documentation/query/rdf_lists.html

IDEA: Add lists as a fundamental concept in RDF

As proposed by David Wood and James Leigh
prior to the RDF 1.1 work.[8]
https://www.w3.org/2009/12/rdf-ws/papers/ws14

william-vw · 2018-12-10T13:12:49Z

+1M. See also here (issue 3): http://manu.sporny.org/2014/json-ld-origins-2/ ...

Note that it could be straightforward to add extra semantics, i.e., on top of a triple-based representation, to implement these kinds of list predicates.

VladimirAlexiev · 2019-01-30T17:42:06Z

+1 . cc @azaroth42

RickMoynihan · 2019-04-04T11:21:39Z

SHACL also makes use of lists to express paths, so improving list support might make SHACL processing easier too.

jaw111 · 2019-04-08T07:48:18Z

Would the VALUES OF syntax proposed by @cygri in #6 be appropriate here?

Example:

VALUES (?item ?idx) OF splitList(("travel" "iceland" "winter"))

The returned results are equivalent to:

VALUES (?item ?idx) {
    ("travel" 1)
    ("iceland" 2)
    ("winter" 3)
}

cygri · 2019-04-08T08:34:53Z

@jaw111 I don't quite understand the syntax you're using here. The proposal for VALUES OF only allows normal SPARQL expressions as arguments of the multi-value function, so a list wouldn't be allowed there.

Is the intention to use it like splitList(?x) where ?x would have been earlier bound to the first blank node of a list in the active graph? So, data:

<articles/1234> ex:tagList ("travel" "iceland" "winter").

And query:

SELECT ?tag ?idx {
    <articles/1234> ex:tagList ?tags
    VALUES (?tag ?idx) OF listMembers(?tags)
}

With the result you gave. This would cover the functionality provided by Jena's list:member and list:index property functions.
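To make the intended result concrete, here is a minimal Python sketch of what a listMembers-style function would compute over the data above. It is not Jena's implementation or any engine's: triples are modelled as plain tuples, the blank-node labels and the helper names (`objects`, `list_members`, `tags_with_index`) are made up for illustration, and the index is 1-based only to match the example result in this thread.

```python
# Sketch only: RDF triples modelled as plain Python tuples, not a real RDF library.
FIRST, REST, NIL = "rdf:first", "rdf:rest", "rdf:nil"

# <articles/1234> ex:tagList ("travel" "iceland" "winter") expanded into triples.
# The blank-node labels _:b1 .. _:b3 are hypothetical.
TRIPLES = [
    ("<articles/1234>", "ex:tagList", "_:b1"),
    ("_:b1", FIRST, "travel"), ("_:b1", REST, "_:b2"),
    ("_:b2", FIRST, "iceland"), ("_:b2", REST, "_:b3"),
    ("_:b3", FIRST, "winter"), ("_:b3", REST, NIL),
]

def objects(triples, subject, predicate):
    """Return all objects of triples matching (subject, predicate, ?o)."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def list_members(triples, head):
    """Yield (member, index) rows for the well-formed list starting at `head`.

    Assumes every list node has exactly one rdf:first and one rdf:rest.
    """
    node, idx = head, 1
    while node != NIL:
        yield objects(triples, node, FIRST)[0], idx
        node = objects(triples, node, REST)[0]
        idx += 1

def tags_with_index(triples, article):
    """Emulate: { article ex:tagList ?tags  VALUES (?tag ?idx) OF listMembers(?tags) }."""
    rows = []
    for tags in objects(triples, article, "ex:tagList"):
        rows.extend(list_members(triples, tags))
    return rows

print(tags_with_index(TRIPLES, "<articles/1234>"))
```

The point of the sketch is that ordering falls out of a sequential walk along rdf:rest, which is exactly the step that standard SPARQL has no direct construct for.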

tayloj · 2019-04-08T15:38:50Z

Some discussion on the mailing list about length-bounded property paths seems relevant too, since a path like ?list rdf:rest{n}/rdf:first ?item returns the nth element of a list (with zero-based indexing).
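As a rough illustration of what such a fixed-length path would evaluate to, the following Python sketch follows rdf:rest exactly n times and then takes rdf:first. The dictionary representation, the blank-node labels, and the function name `nth_item` are assumptions made for the example, and it assumes a well-formed list with a single rdf:rest per node.

```python
FIRST, REST, NIL = "rdf:first", "rdf:rest", "rdf:nil"

# The list ("travel" "iceland" "winter") as a node -> {predicate: object} map.
# Blank-node labels are hypothetical.
LIST = {
    "_:b1": {FIRST: "travel", REST: "_:b2"},
    "_:b2": {FIRST: "iceland", REST: "_:b3"},
    "_:b3": {FIRST: "winter", REST: NIL},
}

def nth_item(graph, head, n):
    """Emulate ?list rdf:rest{n}/rdf:first ?item; return None if no such path exists."""
    node = head
    for _ in range(n):
        node = graph.get(node, {}).get(REST)
        if node is None:
            return None
    # rdf:nil has no rdf:first, so running off the end also yields None.
    return graph.get(node, {}).get(FIRST)

print([nth_item(LIST, "_:b1", n) for n in range(4)])
```

With n = 0 the path reduces to plain rdf:first, which is why the indexing comes out zero-based.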

jaw111 · 2019-04-08T18:30:29Z

@cygri you are correct, using a list there does not make much sense. Must have missed a trick earlier.

@tayloj expanding on your suggestion, how about using a variable instead of an integer for the path length? So a path like ?list rdf:rest{?n}/rdf:first ?item returns a set of solutions where the ?n variable is bound to the index.

tayloj · 2019-04-08T18:58:18Z

@jaw111 That'd certainly be useful, but I have no idea how feasible it is. I think there were already difficulties in implementing {n,m} quantifiers efficiently even with fixed values. Moving to a variable is probably even more complicated. But I'd definitely use it if it were available.

TallTed · 2019-04-30T18:59:07Z

> I think there were already difficulties in implementing {n,m} quantifiers efficiently even with fixed values.

FWIW, Virtuoso still supports the {n,m} property path quantifiers. (This is not a comment on the rdf:rest{?n} suggestion from @jaw111.)

kasei · 2019-05-01T15:33:57Z

@TallTed does Virtuoso use the bag semantics of expanding that to a BGP/union equivalent, or the set semantics of just limiting the length of a + path?

jaw111 · 2019-05-01T16:06:21Z

I know @dydra also supports that syntax, including {n,} and {,m} cases.


TallTed · 2019-05-01T16:09:13Z

> does Virtuoso use the bag semantics of expanding [the {n,m} property path quantifiers] to a BGP/union equivalent, or the set semantics of just limiting the length of a + path?

@kasei - Good question, to which I don't immediately have the answer. @IvanMikhailov or @kidehen may be able to shed some light.

kidehen · 2019-05-01T16:52:37Z

@kasei ,

Are we talking about what's exemplified by the following query?

SELECT DISTINCT *
WHERE {
    ?s a <http://dbpedia.org/ontology/AcademicJournal> ;
       rdf:type{1,3} ?o
}
LIMIT 50

Live Results Link.

/cc @TallTed

kasei · 2019-05-01T19:37:45Z

@kidehen Yes, except for the DISTINCT which will mask the difference. It seems that it's using the bag semantics of BGP/union expansion, which can have some challenges with cardinality for larger values of the path quantifiers (and as I recall was one of the big issues that prevented this from being included in SPARQL 1.1).
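The difference between the two readings can be seen on a toy graph. The Python sketch below is only an illustration of the distinction as described here, not how Virtuoso or any other engine evaluates paths; the edge list and the function names are invented, and it assumes a lower bound of at least 1.

```python
# Toy graph for a single predicate p: two distinct two-step paths lead from "a" to "d".
EDGES = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]

def step(nodes, edges):
    """Apply the predicate once, keeping one result per matching path."""
    return [o for n in nodes for s, o in edges if s == n]

def bag_semantics(start, edges, lo, hi):
    """p{lo,hi} read as a union of fixed-length paths: duplicates are preserved."""
    results, frontier = [], [start]
    for length in range(1, hi + 1):
        frontier = step(frontier, edges)
        if length >= lo:
            results.extend(frontier)
    return results

def set_semantics(start, edges, lo, hi):
    """p{lo,hi} read as length-limited reachability: each node appears at most once.

    Implemented on top of the bag version purely for brevity.
    """
    return set(bag_semantics(start, edges, lo, hi))

print(bag_semantics("a", EDGES, 1, 2))
print(sorted(set_semantics("a", EDGES, 1, 2)))
```

Under the bag reading "d" is returned twice, once per path, and the number of such duplicates can grow quickly with the upper bound; DISTINCT collapses them, which is why it hides the difference.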

kidehen · 2019-05-01T20:11:42Z

> @kidehen Yes, except for the DISTINCT which will mask the difference. It seems that it's using the bag semantics of BGP/union expansion, which can have some challenges with cardinality for larger values of the path quantifiers (and as I recall was one of the big issues that prevented this from being included in SPARQL 1.1).

Okay, here's the query solution link without DISTINCT :)

ktk · 2019-10-14T09:23:24Z

We use sh:in for validation of data cubes in RDF. Unfortunately it is pretty much impossible to generate such a list in SPARQL; at least I could not figure out how.

The list functions in Jena seem to only access lists, not manipulate or create them. Is there any prior work on what creating and manipulating lists could look like?

I don't have much know-how in designing such things, but what I tried (and failed) to do was:

CONSTRUCT {
  <something> sh:in ( ?listMembers ) .
}

So pretty much using the Turtle collection syntax. ?listMembers could be a normal set; with a SELECT subquery it could also be ordered before being used in the CONSTRUCT. I would also imagine being able to add more variables, just as I can add more entries in Turtle syntax.

Am I completely missing something here that prevents this approach from working?

There is obviously more missing, like removing an entry and adding a new entry but I'm not sure how much of it is realistic in a language like SPARQL.
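For reference, this is roughly what an engine (or the "coded step" outside SPARQL) has to do to turn an ordered sequence of bindings into a collection: mint one fresh blank node per member and chain them with rdf:first/rdf:rest. The Python sketch below is an illustration under assumptions, not a proposal for the syntax; `build_list`, the `_:l` blank-node labels, and the member IRIs are hypothetical.

```python
from itertools import count

FIRST, REST, NIL = "rdf:first", "rdf:rest", "rdf:nil"

def build_list(members, fresh=None):
    """Return (head, triples) encoding `members` as an rdf:first/rdf:rest chain.

    An empty sequence yields rdf:nil and no triples.
    """
    members = list(members)
    if fresh is None:
        fresh = count(1)  # source of fresh blank-node numbers
    nodes = [f"_:l{next(fresh)}" for _ in members]
    triples = []
    for i, (node, member) in enumerate(zip(nodes, members)):
        rest = nodes[i + 1] if i + 1 < len(nodes) else NIL
        triples.append((node, FIRST, member))
        triples.append((node, REST, rest))
    head = nodes[0] if nodes else NIL
    return head, triples

# Members are assumed to be ordered already, e.g. by ORDER BY in a SELECT subquery.
head, triples = build_list(["ex:a", "ex:b"])
triples.insert(0, ("<something>", "sh:in", head))
for t in triples:
    print(t)
```

The awkward part for CONSTRUCT is visible here: the template is normally instantiated once per solution, whereas a list needs all solutions for ?listMembers to be gathered into one chain.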

By the way why is this called collection in Turtle and not list?

ktk · 2019-10-14T09:40:41Z

Bob DuCharme had a blog post that showed some standard manipulations. It works to some extent but is not really nice from a syntactic-sugar point of view: http://www.snee.com/bobdc.blog/2014/04/rdf-lists-and-sparql.html

ktk · 2019-10-15T11:15:39Z

Some attempts by @jaw111 (via Twitter): https://gist.github.com/jaw111/1b149fd1111f774a3613f10955686617

afs · 2019-10-15T12:25:30Z

From some time ago: https://afs.github.io/rdf-lists-sparql . Lesson - it's painful.

One avenue is to add to the basic SPARQL data model - lists and sets (and paths) - beyond RDF terms. This is a large change, including result set formats, but I think it is worth exploring.

ericprud · 2019-10-21T10:49:23Z

In SWObjects, I extended triple pattern matching with some generators. One of those was "MEMBERS(?var)" (example use) which joined the current binding with the argument (?var above) bound to each member of the list.

I mentioned it to Lee F during the SPARQL 1.1 WG and he said the syntax gave him hives. I used this a lot, especially from the command line to e.g. sequentially walk test manifest entries, with no skin conditions that couldn't be explained by prolonged puberty.

TallTed · 2019-10-21T14:35:26Z

@ktk

> By the way why is this called collection in Turtle and not list?

List is most commonly understood to mean ordered list, while collection is most commonly understood to mean unordered list. (Yes, both list and collection may have both ordered and unordered variants, but the most common intuitive default is as I said.) Unordered membership is far easier to handle due to various other aspects of RDF and DBMS, and for many reasons (not least being WG time constraints) that ease was important in the development of these specs.

JervenBolleman · 2019-10-23T09:00:04Z

Stardog talked about a possible extension that would at least provide a list equivalent to GROUP_CONCAT, which would affect the result formats more than anything else.

ktk · 2019-10-25T08:51:35Z

@JervenBolleman that is an interesting one, thanks. We have a workaround where we create lists in a coded step after concatenating them with GROUP_CONCAT in SPARQL so that feels very natural to me. Some questions based on that proposal:

How would this be handled in CONSTRUCT, simply as a collection?
Could I "concat" arrays?
Wouldn't it make sense to have something like slice() (see MDN) instead of get()?

albertmeronyo · 2019-10-26T06:36:14Z

As per @ktk 's suggestion I'm linking here the slides I used today at ISWC to talk about our work on RDF Lists: https://www.slideshare.net/albertmeronyo/modelling-and-querying-lists-in-rdf-a-pragmatic-study

I went into the presentation unaware of this thread :-) so I just subscribed. cc @enridaga

ktk · 2021-12-11T17:37:31Z

Just noticed that Stardog provides nice basic member functions for lists, I like what I see https://docs.stardog.com/query-stardog/#rdf-list-functions

ericprud · 2021-12-12T15:23:10Z

> Just noticed that Stardog provides nice basic member functions for lists, I like what I see https://docs.stardog.com/query-stardog/#rdf-list-functions

It seems to me that if you have the freedom to extend SPARQL, there are good reasons to write these as operators in the query language rather than as magic predicates embedded in triple patterns:

leverage syntax and function composition, e.g. BIND (LENGTH(:literalList) AS ?length) instead of :literalList stardog:list:length ?length. The former can be combined with any other function available in SPARQL 1.2.
separate SPARQL operations from asserted triples. The magic triple representation is shorter, but it can be easily missed when nestled in with a bunch of triple constraints which correspond to asserted triples. In addition to aiding human recognition, it will be easier to verify completeness of query re-writers (e.g. SPARQL to SQL) if these operations have their own syntactic constructs.
reject unsupported queries. A SPARQL 1.1 engine will reject a query with a LENGTH operator while it would silently fail to match a query with a stardog:list:length predicate.

One advantage to magic predicates is that such a query can pass seamlessly through a naive SPARQL pipeline processor (e.g. a tool which parses the query for bound variables, issues it verbatim, and renders the results in a nice HTML table). Unless SPARQL 1.2 were committed to being syntactically compatible with SPARQL 1.1, I don't think syntactic compatibility of list features compensates for the advantages of SPARQL list operators.
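The "reject unsupported queries" point can be illustrated with a deliberately tiny model of an engine that has no list support. This Python sketch is a toy, not a description of how any SPARQL processor is built (a real SPARQL 1.1 parser would most likely reject an unknown LENGTH keyword already at parse time); the triple, the function table, and the helper names are all invented for the example.

```python
# Toy engine that knows nothing about lists: one asserted triple, one built-in function.
TRIPLES = [(":doc", ":tags", ":literalList")]
FUNCTIONS = {"STRLEN": len}

def match(triples, subject, predicate):
    """Triple-pattern matching: an unrecognised predicate is not an error, it matches nothing."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def call(name, *args):
    """Function call: an unrecognised operator is rejected explicitly."""
    if name not in FUNCTIONS:
        raise ValueError(f"unknown function: {name}")
    return FUNCTIONS[name](*args)

# Magic predicate: the engine treats it as an ordinary predicate and finds no triples.
silent = match(TRIPLES, ":literalList", "stardog:list:length")

# Operator: the lack of support is visible to the caller.
try:
    call("LENGTH", ":literalList")
    rejected = False
except ValueError:
    rejected = True

print(silent, rejected)
```

An empty result is indistinguishable from "the list has no length triple", whereas the raised error tells the user that the feature itself is missing.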

namedgraph · 2022-09-04T10:47:13Z

Pat Hayes on first-class list semantics (or the lack of it):

https://lists.w3.org/Archives/Public/semantic-web/2022Sep/0001.html

VladimirAlexiev · 2022-12-09T15:00:20Z

Use case: convert SHACL property attachments to domain/range.

Very easy to do for schema:domainIncludes, schema:rangeIncludes because these are polymorphic (multivalued):

insert {
  ?prop schema:domainIncludes ?domain; schema:rangeIncludes ?range
} where {
  {[a sh:NodeShape; sh:property/sh:path ?prop; sh:targetClass ?domain]} union
  {[a sh:PropertyShape; sh:path ?prop; sh:class|sh:datatype ?range]} 
}

Much harder to do for RDFS+OWL because one needs to construct lists, eg

:propP rdfs:domain [a owl:Class; owl:unionOf (:classX :classY :classZ)].

@jaw111's example https://gist.github.com/jaw111/1b149fd1111f774a3613f10955686617 shows how to do a similar thing (but produces SHACL as final result, and I think it's a bit erroneous).

dbooth-boston transferred this issue from w3c/EasierRDF Apr 3, 2019

JervenBolleman added the "protocol" (improving sending queries over the wire) and "query" (Extends the Query spec) labels Apr 4, 2019

afs removed the "protocol" (improving sending queries over the wire) label May 1, 2019

laurentlefort mentioned this issue Oct 26, 2019

Decouple membership between classification items and levels from the levels and/or items themselves linked-statistics/xkos#129

Open

rubensworks mentioned this issue Mar 25, 2020

support for rdf:List rubensworks/GraphQL-LD.js#9

Open

dbooth-boston mentioned this issue May 18, 2020

Arrays as a built-in type (was SPARQL-friendly lists) w3c/EasierRDF#74

Open

namedgraph mentioned this issue Oct 6, 2021

Enhanced support for lists in queries, results, and graph mutations #151

Open

relu91 mentioned this issue Aug 24, 2022

Fix shacl, context and ontology w3c/wot-thing-description#1684

Merged
