turtle/sparql should allow empty comma (and dot?) just as it allows empty semicolon #91

VladimirAlexiev · 2022-04-01T15:38:25Z

Empty semicolon

This is valid:

<s>
  :p1 <o1>;
  :p2 <o2>;
.

It is useful because you don't need to tweak the last statement before you add a new one.

It's widely used, e.g. by TQ (right @HolgerKnublauch?).
It's especially useful when you're generating Turtle, because you don't need to keep track of which statement is the last one.

Empty comma

But this is not valid:

<s> :p
  <o1>,
  <o2>,
.

Here's a real example where I want it (processed a list of prefixes to make sh:declare):

@base         <https://transparency.ontotext.com/resource/>.
@prefix tr:   <https://transparency.ontotext.com/resource/tr/>.
@prefix dash: <http://datashapes.org/dash#> .

<shape> a owl:Ontology; rdfs:label 'TEKG Shapes';
  sh:declare
    [sh:prefix 'tr'; sh:namespace 'https://transparency.ontotext.com/resource/tr/'],
    [sh:prefix 'dash'; sh:namespace 'http://datashapes.org/dash#'],
.

Empty dot

Similarly, this is not valid:

.
<s> :p <o> 
.

Why would I want an empty dot? Because when generating, it's often easier to know when a block starts than when it ends.

@afs , @ericprud what do you think?

HughGlaser · 2022-04-01T15:56:49Z

Ah, sweet memories. https://lists.w3.org/Archives/Public/semantic-web/2008Jan/0073.html Sorry, nothing more to offer, other than Tim & my beards are presumably a lot greyer or even whiter 14 years later :-)


TallTed · 2022-04-01T15:57:41Z

I support this change for consistency, if nothing else.

It was very confusing to me when I learned that empty comma was not supported (by trying to use it) while empty semicolon was (which I knew because it was already in use in the Turtle file I was editing). I would find the empty comma quite useful.

I have less strong feelings for, but no strong feelings against, making the empty dot acceptable. I can imagine that it would make things easier in some future situations.

afs · 2022-04-01T16:03:59Z

Hence <s> :p <o1>, ; .

ericprud · 2022-04-01T16:08:13Z

Ditto on @TallTed's lack of strong feelings; ';'s come up a lot more often. @gkellogg wanted to make sure that the Turtle grammar was LL(1) as well as LALR(1) (without re-writing it). If you can branch TurtleAwesome (I think that's pretty close to the spec) and make it parse both ways, that'd be fab. Unfortunately, yacker won't help you 'cause its only working parsers are LALR.

afs · 2022-04-01T16:28:48Z

There is some argument in favour of trailing , -- I don't have a strong opinion on , because I try to avoid using it at all in favour of predicate-object pairs unless it is a long enumeration or other case that suggests it.

When generating:

   sh:declare [sh:prefix 'tr'; sh:namespace 'https://transparency.ontotext.com/resource/tr/'] ;
   sh:declare [sh:prefix 'dash'; sh:namespace 'http://datashapes.org/dash#'];

not perfect but not that bad. You can always feed it through a pretty printer to sort it out.
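A minimal sketch of that generation pattern in Python, for illustration only. The helper name `declare_block` and its inputs are hypothetical, and it assumes at least one prefix and no quotes that would need escaping. Every pair is emitted the same way, with a trailing `;`, so the generator never has to know which one is last.

```python
def declare_block(subject: str, prefixes: dict[str, str]) -> str:
    """Emit one sh:declare predicate-object pair per prefix, each ending in ';'.

    Relies on Turtle accepting a trailing ';' before the closing '.'.
    Hypothetical helper: assumes `prefixes` is non-empty and does no escaping.
    """
    lines = [subject]
    for prefix, namespace in prefixes.items():
        # Same shape for every entry; no special case for the last one.
        lines.append(
            f"  sh:declare [sh:prefix '{prefix}'; sh:namespace '{namespace}'] ;"
        )
    lines.append(".")
    return "\n".join(lines)


if __name__ == "__main__":
    print(declare_block("<shape>", {
        "tr": "https://transparency.ontotext.com/resource/tr/",
        "dash": "http://datashapes.org/dash#",
    }))
```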

; and , are separators. DOT is a terminator, with the oddity that it is optional at the end of a block of triples in SPARQL. It is one terminator per triple, not (N-1), which is what makes generating RDF messy.

DOT is also in N-triples. Having it behave differently in NT and TTL is not good.

What about ( ) . or [ ] . which also come up in the same use case of generating RDF?

The use case is authoring RDF, and then the "new" RDF cannot be consumed by old parsers. The issue is then whether a small change like this is worth the effort on the web.

And one more thing: JSON.

gkellogg · 2022-04-01T18:46:16Z

If I update the objectList production (already modified for RDF-star) to allow the second object to be optional, my LL(1) parser continues to parse it.

[8] objectList ::= object annotation? ( "," object? annotation? )*

As I recall, @pchampin also called for this change, although probably in the context of the N3 grammar.

afs · 2022-04-01T19:20:22Z

That seems to match both <obj> , annotation and also <obj> , , , , ,

To stay LL(1), it might have to be recursive:

objectList ::= object annotation? ("," objectList?)?

(If the parser generator supports setting a local lookahead of 2, it is easy to unbundle)

The recursive step only happens if you have seen "," and then if you see objectList the whole thing moves along one step in a way that * does not.

See [55] in SPARQL:

 [55]  	TriplesBlock	  ::=  	TriplesSameSubjectPath ( '.' TriplesBlock? )?
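To illustrate why the recursive form needs only one token of lookahead, here is a rough recursive-descent recognizer in Python for `objectList ::= object annotation? ("," objectList?)?`. It is a sketch under simplifying assumptions: the input is a pre-tokenized list of the hypothetical symbols `"object"`, `"annotation"` and `","`, and nothing else from the Turtle grammar is modelled.

```python
def parse_object_list(tokens: list[str], pos: int = 0) -> int:
    """Recognize objectList ::= object annotation? ("," objectList?)?

    Returns the position just past the recognized objectList,
    or raises SyntaxError if no objectList starts at `pos`.
    """
    if pos >= len(tokens) or tokens[pos] != "object":
        raise SyntaxError(f"expected object at token {pos}")
    pos += 1
    if pos < len(tokens) and tokens[pos] == "annotation":
        pos += 1
    if pos < len(tokens) and tokens[pos] == ",":
        pos += 1
        # After the comma, a single lookahead token decides whether the
        # optional nested objectList is present.
        if pos < len(tokens) and tokens[pos] == "object":
            pos = parse_object_list(tokens, pos)
    return pos


def accepts(tokens: list[str]) -> bool:
    """True if the whole token list is exactly one objectList."""
    try:
        return parse_object_list(tokens) == len(tokens)
    except SyntaxError:
        return False
```

Under these assumptions a single trailing comma is accepted (`object , object ,`), while repeated commas (`object , ,`) and a comma followed directly by an annotation are rejected.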

gkellogg · 2022-04-11T21:10:17Z

Sorry, I'm still on holiday, so haven't gotten back to this. But, I did get a chance to try the following production for Turtle:

[8] objectList ::= object annotation? ( "," objectList? )*

Unfortunately, there is still an LL(1) parsing problem. My parser generates the following error:

[rdf-turtle] ebnf --ll1 turtleDoc etc/turtle.bnf
[1]First/Follow Conflict: "," is both first and follow of _objectList_2

There may be some further indirection that can be added to resolve such an error. We are not bound to have context-free grammars, but it is a nice-to-have.
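The conflict can be reproduced on a small fragment with a textbook FIRST/FOLLOW computation. The Python sketch below is not the `ebnf` tool's algorithm, and the nonterminal names (`annOpt`, `tail`, `olOpt`) are made up for illustration rather than taken from the tool's generated names. It hand-expands the `*` production into BNF, does the same for a variant where the repetition is replaced by an optional group, and reports nullable nonterminals whose FIRST and FOLLOW sets overlap. First/First conflicts are not checked.

```python
EPS = "<eps>"  # marker for the empty string


def first_of_seq(seq, first, nonterminals):
    """FIRST of a symbol sequence; contains EPS if the whole sequence is nullable."""
    out = set()
    for sym in seq:
        sym_first = first[sym] if sym in nonterminals else {sym}
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)
    return out


def first_follow(grammar, start, end="$"):
    """Fixed-point computation of FIRST and FOLLOW for a BNF grammar."""
    nts = set(grammar)
    first = {nt: set() for nt in nts}
    follow = {nt: set() for nt in nts}
    follow[start].add(end)
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                f = first_of_seq(alt, first, nts)
                if not f <= first[nt]:
                    first[nt] |= f
                    changed = True
                for i, sym in enumerate(alt):
                    if sym not in nts:
                        continue
                    rest = first_of_seq(alt[i + 1:], first, nts)
                    new = rest - {EPS}
                    if EPS in rest:  # everything after sym can vanish
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return first, follow


def conflicts(grammar, start):
    """Nullable nonterminals whose FIRST and FOLLOW sets share a terminal."""
    first, follow = first_follow(grammar, start)
    result = {}
    for nt in grammar:
        overlap = (first[nt] - {EPS}) & follow[nt]
        if EPS in first[nt] and overlap:
            result[nt] = overlap
    return result


# objectList ::= object annotation? ( "," objectList? )*
STAR_GRAMMAR = {
    "objectList": [["object", "annOpt", "tail"]],
    "annOpt": [["annotation"], []],
    "tail": [[",", "olOpt", "tail"], []],
    "olOpt": [["objectList"], []],
}

# objectList ::= object annotation? ( "," objectList? )?
OPT_GRAMMAR = {
    "objectList": [["object", "annOpt", "tail"]],
    "annOpt": [["annotation"], []],
    "tail": [[",", "olOpt"], []],
    "olOpt": [["objectList"], []],
}

if __name__ == "__main__":
    print(conflicts(STAR_GRAMMAR, "objectList"))
    print(conflicts(OPT_GRAMMAR, "objectList"))
```

On this fragment, the `*` form reports `","` in both FIRST and FOLLOW of the repetition nonterminal, while the optional form reports no conflict.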

ericprud · 2022-04-11T21:16:55Z

It might come down to whether the spec grammar favors LALR or LL (i.e. can you just drop the BNF into a parser generator or do you have to turn the left-reduces into right reduces). My vote would be for LALR since it's still pretty much the gold standard parsing tech.

afs · 2022-04-12T08:30:48Z

[8] objectList ::= object annotation? ( "," objectList? )*

The * should be a ?.

SPARQL rule [55] works in javacc which is LL(1).

gkellogg · 2022-04-12T22:02:16Z

Yes, of course, that makes it a valid LL(1) grammar.

Another alternative is

[8] objectList ::= object annotation? ( "," (object annotation?)? )*
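As a rough way to compare which token sequences each proposed production admits, the candidates can be approximated as regular expressions over a one-letter token alphabet. This is a sketch, not a parser: the encoding (`O` for object, `A` for annotation, `C` for the comma) and the labels in `PRODUCTIONS` are assumptions made for illustration, and the recursive production is written in its equivalent iterative form.

```python
import re

# O = object, A = annotation, C = ","
PRODUCTIONS = {
    # object annotation? ( "," object? annotation? )*
    "optional-object": re.compile(r"OA?(CO?A?)*"),
    # object annotation? ( "," objectList? )?   (unrolled recursion)
    "recursive": re.compile(r"OA?(COA?)*C?"),
    # object annotation? ( "," (object annotation?)? )*
    "optional-group": re.compile(r"OA?(C(OA?)?)*"),
}


def accepted_by(tokens: str) -> dict[str, bool]:
    """Report, per production, whether the whole token string matches."""
    return {
        name: pattern.fullmatch(tokens) is not None
        for name, pattern in PRODUCTIONS.items()
    }


if __name__ == "__main__":
    for sample in ["OCO", "OCOC", "OCCC", "OCA"]:
        print(sample, accepted_by(sample))
```

Under this encoding all three accept a single trailing comma (`OCOC`). Only the recursive form rejects repeated commas (`OCCC`), and only the first form accepts a comma followed directly by an annotation (`OCA`).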

VladimirAlexiev · 2022-04-26T20:24:48Z

@afs
I agree that empty DOT is not useful.

What about ( ) . or [ ] .

These are nodes not triples? Guess these are not complete examples?

Btw I didn't know these nodes (rdf:nil and bnode() respectively) cannot be used in SPARQL expressions and binds.
