** within regular expression #3

Open
inkytonik opened this issue Jun 14, 2020 · 1 comment
Labels
enhancement New feature or request

Comments

@inkytonik
Owner

inkytonik commented Jun 14, 2020

Peter Höfner reported:

from a theoretical point of view it should be possible to compress the grammar

Test =     
    FN '(' Exp ** ',' ')'    {test1} 
    | FN                       {test2}.

to

Test = FN ('(' Exp ** ',' ')' )? .

This is currently not possible: it seems that ** cannot appear inside another regular expression.
PS: Of course one can drop the parentheses '(' and ')'. I kept them to explain why I want such a grammar.

PPS.: For the above grammar one can set FN = 'a'. and Exp = 'b'., or anything else.

@inkytonik
Owner Author

It’s the option (?) that causes the problem. You can only apply optionality to a single thing, not a sequence as in your compressed grammar.

You are correct that there is no theoretical reason to disallow this. sbt-rats does disallow it, however, because allowing sequences in this position complicates building the AST automatically.

In the compressed grammar a Test node would need to have an FN child and an Option child. But what is the type of the content of the option? In this case it could be a Vector of Exps, but in general it could be made up of multiple non-terminals. E.g., something like (A B C)? would have to have three things inside the Option. Since it needs to be Option[T] for some T, it is not clear what T should be. Maybe the tuple type (A, B, C)? But allowing arbitrary tuples like this makes the tree less strongly typed than is desirable. Particularly when defining further processing, it is desirable to have a “real” type for A x B x C, not just a tuple.
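
To make the typing question concrete, here is a minimal Scala sketch of the tree shape that (A B C)? would imply. It is hand-written for illustration and is not output from sbt-rats; the names TupleOption, Node, middle and the text fields are all assumptions.

```scala
object TupleOption {
  // Hypothetical node types standing in for the non-terminals A, B and C.
  case class A(text: String)
  case class B(text: String)
  case class C(text: String)

  // What a rule such as `Node = (A B C)?` would force on the tree:
  // the Option has to wrap an anonymous tuple.
  case class Node(opt: Option[(A, B, C)])

  // Later processing can only reach the parts by position (_1, _2, _3);
  // there is no named type that describes the group as a whole.
  def middle(n: Node): Option[B] = n.opt.map(_._2)

  val present = Node(Some((A("a"), B("b"), C("c"))))
  val absent = Node(None)
}
```

Nothing in the type Option[(A, B, C)] says what the group means, so every function that consumes it has to repeat the tuple shape.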

Instead, the sbt-rats rule is that sequences can only occur at the top level of an alternative, which is why FN '(' Exp ** ',' ')' is ok. For all other positions you need to introduce another non-terminal and that non-terminal gives a well-defined (i.e., non-tuple) type for the child. So, the following is the recommended style:

Test = FN ArgList? .
ArgList = '(' Exp ** ',' ')'.
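
As a rough illustration of why the extra non-terminal helps, the tree for this grammar could be modelled along the following lines. This is a hand-written sketch, not the classes sbt-rats actually generates: the field names, the render helper, and the string-carrying FN and Exp nodes (matching FN = 'a'. and Exp = 'b'.) are assumptions.

```scala
object TestTree {
  // Assumed leaf nodes; in the issue FN and Exp can be any non-terminals.
  case class FN(name: String)
  case class Exp(text: String)

  // ArgList = '(' Exp ** ',' ')'. gives the optional part a named type.
  case class ArgList(exps: Vector[Exp])

  // Test = FN ArgList? . so the option is Option[ArgList], not a tuple.
  case class Test(fn: FN, args: Option[ArgList])

  // Further processing can match on the named type directly.
  def render(t: Test): String = t.args match {
    case Some(ArgList(exps)) =>
      t.fn.name + exps.map(_.text).mkString("(", ",", ")")
    case None => t.fn.name
  }

  val bare = Test(FN("a"), None)
  val call = Test(FN("a"), Some(ArgList(Vector(Exp("b"), Exp("b")))))
}
```

Here render(TestTree.bare) yields "a" and render(TestTree.call) yields "a(b,b)"; the Option[ArgList] field carries a type with a name that other code can refer to.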

@inkytonik inkytonik added the enhancement New feature or request label Jun 14, 2020