potential ambiguity for empty values and flexspace #3

Open
CosmicToast opened this issue Mar 17, 2021 · 7 comments


@CosmicToast
Member

consider the following text:

a = b
c = d

so far so good
however, let's remove the "b" (intending to make the value of a empty)

a =
c = d

this actually becomes a => "c = d"

However, if we turn this into:

a = # comment
c = d

this becomes a => "", c => "d"

While we do discourage people from using flexspace in this way, it may be useful to explicitly recommend using raw values for empty strings.

@Johann150
Contributor

Johann150 commented Mar 17, 2021

How the second example is interpreted seems like a bad idea in the first place TBH. I would suggest that there has to be at least one non-whitespace character on the same line, otherwise it should count as an empty value. In other words: A value can contain vertical whitespace, but not at the beginning or end.

@CosmicToast
Member Author

> In other words: A value can contain vertical whitespace, but not at the beginning or end.

This physically can't work.
If a value can contain vertical whitespace, it will never finish.
You might also notice that there's no vertical whitespace anywhere here.

The reason this happens is that we count \s (whitespace in general) as insignificant (i.e. flexspace).
The tokens in a kv (raw or not) are key, = and value.
However, we can have arbitrary whitespace in between each token.
In other words, we can't have an empty value (without a comment, which is not whitespace and so ends the search for a value), since the parser will just consume the next line.
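
A minimal Python sketch of this behavior, for illustration only. It is not the project's parser: the regexes for keys, values, and comments are simplified assumptions, and `parse` is a hypothetical name. The point is the `\s*` skip after the `=`, which crosses the newline.

```python
import re

# Hypothetical, simplified token patterns (assumptions, not the real grammar).
FLEXSPACE = re.compile(r"\s*")      # any whitespace, newlines included
KEY = re.compile(r"[^\s=#]+")
VALUE = re.compile(r"[^\n#]*")      # runs to end of line or start of a comment
COMMENT = re.compile(r"#[^\n]*")


def parse(text):
    """Toy kv parser: key, '=', value, with flexspace allowed between tokens."""
    pairs = {}
    pos = 0
    while True:
        pos = FLEXSPACE.match(text, pos).end()
        if pos == len(text):
            return pairs
        comment = COMMENT.match(text, pos)
        if comment:
            pos = comment.end()
            continue
        key = KEY.match(text, pos)
        if key is None:
            raise ValueError(f"expected a key at offset {pos}")
        pos = FLEXSPACE.match(text, key.end()).end()
        if text[pos:pos + 1] != "=":
            raise ValueError(f"expected '=' at offset {pos}")
        # This skip is the culprit: it also swallows the newline after '='.
        pos = FLEXSPACE.match(text, pos + 1).end()
        value = VALUE.match(text, pos)
        pairs[key.group()] = value.group().rstrip()
        pos = value.end()


print(parse("a = b\nc = d\n"))          # {'a': 'b', 'c': 'd'}
print(parse("a =\nc = d\n"))            # {'a': 'c = d'}
print(parse("a = # comment\nc = d\n"))  # {'a': '', 'c': 'd'}
```

In the second call the skip after `=` runs straight onto the next line, so `c = d` is read as the value of `a`. In the third, the `#` is not whitespace, so the skip stops and the value is empty.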

> I would suggest that there has to be at least one non-whitespace character on the same line, otherwise it should count as an empty value.

But again, that's not the problem.
The problem is that when we skip whitespace after the =, we include skipping over vertical whitespace.

One way of fixing this would be to specify that whitespace (that may be skipped over) must be horizontal only, and use vertical whitespace for delineation.
However, this means that behaviors such as:

[
    cursed.section
]

are now impossible.

It also complicates the parser significantly, since we have to explicitly throw a bunch of \v around.
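
To make the tradeoff concrete, here is a rough Python sketch of the horizontal-only alternative. The patterns and the names `parse_horizontal`, `SECTION_FLEX`, and `SECTION_HORIZONTAL` are assumptions made up for this example; the real grammar may differ.

```python
import re

HSPACE = r"[ \t]*"  # horizontal whitespace only

# Hypothetical kv pattern: the newline now ends the entry.
KV_LINE = re.compile(rf"{HSPACE}([^\s=#]+){HSPACE}={HSPACE}([^\n#]*)")

# Hypothetical section-header patterns, with and without vertical whitespace.
SECTION_FLEX = re.compile(r"\[\s*([^\]\s]+)\s*\]")
SECTION_HORIZONTAL = re.compile(r"\[[ \t]*([^\]\s]+)[ \t]*\]")

CURSED = "[\n    cursed.section\n]"


def parse_horizontal(text):
    """Toy line-based kv parser; lines that are not kvs are ignored."""
    pairs = {}
    for line in text.splitlines():
        m = KV_LINE.match(line)
        if m:
            pairs[m.group(1)] = m.group(2).rstrip()
    return pairs


print(parse_horizontal("a =\nc = d\n"))             # {'a': '', 'c': 'd'}
print(SECTION_FLEX.fullmatch(CURSED).group(1))      # cursed.section
print(SECTION_HORIZONTAL.fullmatch(CURSED))         # None
```

The empty value now works, but the multi-line section header only matches under the flexspace pattern.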

@Johann150
Contributor

Another option would be to disallow = in values. This would make it a syntax error:

a = 
c = d

Since whitespace is insignificant, this is identical to:

a = c
= d

which looks like a syntax error to me.
(side note: could meaningfully be interpreted as assigning the value to both keys)

If you really wanted an equal sign in your value you could still use raw values:

a =
`c = d`
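
A small Python sketch of what this option could look like, assuming raw values are delimited by backticks and cannot contain a backtick. The function name `parse_strict` and all patterns are hypothetical, not taken from the spec.

```python
import re

# Hypothetical, simplified token patterns (assumptions, not the real grammar).
FLEXSPACE = re.compile(r"\s*")
TRIVIA = re.compile(r"(?:\s|#[^\n]*)*")   # whitespace and comments between entries
KEY = re.compile(r"[^\s=#`]+")
RAW = re.compile(r"`([^`]*)`")            # assumed raw-value syntax
PLAIN = re.compile(r"[^\n#]*")


def parse_strict(text):
    """Toy parser that rejects '=' inside plain (non-raw) values."""
    pairs = {}
    pos = TRIVIA.match(text).end()
    while pos < len(text):
        key = KEY.match(text, pos)
        if key is None:
            raise ValueError(f"expected a key at offset {pos}")
        pos = FLEXSPACE.match(text, key.end()).end()
        if text[pos:pos + 1] != "=":
            raise ValueError(f"expected '=' at offset {pos}")
        pos = FLEXSPACE.match(text, pos + 1).end()
        raw = RAW.match(text, pos)
        if raw:
            value, pos = raw.group(1), raw.end()
        else:
            plain = PLAIN.match(text, pos)
            value, pos = plain.group().rstrip(), plain.end()
            if "=" in value:
                raise ValueError(f"'=' in plain value {value!r}; use a raw value")
        pairs[key.group()] = value
        pos = TRIVIA.match(text, pos).end()
    return pairs


print(parse_strict("a =\n`c = d`\n"))   # {'a': 'c = d'}
try:
    parse_strict("a =\nc = d\n")
except ValueError as err:
    print("rejected:", err)
```

Under this rule the accidental `a =` followed by `c = d` fails loudly instead of silently producing a surprising value.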

@CosmicToast
Member Author

that's a way of handling it, but (imo) it's more common to have an = in a value than it is to have explicitly empty values (which is the tradeoff here)
=s in values are actually really common, from what I've seen around
I mean look at any systemd thing, eh?

on the other hand, values are empty "by default", so a comment effectively replaces it, and you only need one whenever you need to override a previous value (which isn't never, but it's not nearly as common)
users might want to add them in pre-emptively though...

another way would be to change the kv (not rkv) type to a token (i.e. whitespace-significant) and require only horizontal whitespace around the =

I'm thinking about other potential solutions, still

@Johann150
Contributor

Makes me think of how people often write out empty Yacc or bison grammar rules: You can write nothing (the drawback being that people, not computers in this case, might misunderstand you), or you can write a comment /* empty */, or even use a special piece of syntax %empty. See https://www.gnu.org/software/bison/manual/html_node/Empty-Rules.html.

We could take that same recommendation and use #empty. The difference from Yacc/bison is that without it, the text might be perfectly readable for humans, but not for computers. It feels about as hackish as changing kv but not rkv.

@CosmicToast
Member Author

That sounds like a good idea.
The thing is, we already actively discourage people from throwing random vertical whitespace into KVs (it's that way purely to make the grammar smaller/simpler).
So I'm thinking I'm going to add a "Gotchas" section to the README and SPEC files, listing any irregularities like this (along with recommendations on how to deal with them).
It would go something like this:


Gotchas

Due to the focus on grammar simplicity, a few minor things may be unintuitive.
Here is the current list, as well as recommendations on dealing with them.

Empty Values

An empty kv may not do what you want it to do.
Consider:

a =
b = c

This will actually set the value of "a" to "b = c" verbatim.

Here are the recommended ways of dealing with (explicitly) empty values.
Using a raw value:

a = ``
b = c

This makes your intent explicit.

Adding a comment:

a = # empty
b = c

This works around the parser, and the comment may explain why the value should be empty.

Another possibility to consider is simply not setting the value to begin with, i.e.

# a =
b = c

Not setting a value at all means you cannot override previous values, but should otherwise (for most intents and purposes) be equivalent.


How does that sound?
This way we can keep the simpler grammar, and for most use-cases this is ultimately fine.
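
Since the grammar itself stays as it is, a separate check could catch the gotcha before it bites. Below is a rough Python sketch of such a lint, purely hypothetical (`find_dangling_keys` and its regex are invented for this example, and the key syntax is an assumption). It flags any line where the `=` is followed by nothing, and leaves the three recommended forms alone.

```python
import re

# Hypothetical pattern: a key, '=', then only horizontal whitespace to end of line.
DANGLING = re.compile(r"^[ \t]*([^\s=#]+)[ \t]*=[ \t]*$")


def find_dangling_keys(text):
    """Return (line_number, key) for each kv whose '=' has nothing after it."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        m = DANGLING.match(line)
        if m:
            hits.append((number, m.group(1)))
    return hits


print(find_dangling_keys("a =\nb = c\n"))          # [(1, 'a')]
print(find_dangling_keys("a = ``\nb = c\n"))       # []
print(find_dangling_keys("a = # empty\nb = c\n"))  # []
print(find_dangling_keys("# a =\nb = c\n"))        # []
```

A line-based check like this is only a heuristic: it would also flag someone who deliberately put the value on the next line.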

@Johann150
Contributor

Johann150 commented Mar 21, 2021 via email
