I have hacked up a change (emphasis on the "hack" -- it is not at all PR-quality code right now) which enables garbage collection during the course of parsing. In particular, it leaves the GC enabled while the shell is collecting input to be parsed. This should, at least in theory, make it possible to run user-defined es code during input, enabling things like programmable completions (or, more extremely, I'd love to see if it were possible to define an all-es readline alternative.) The catch is that implementing this required changing parser generator engines entirely: I swapped out the current yacc/bison parser for the lemon parser generator.
Lemon is the parser generator used in the SQLite project. It is an LALR(1) parser generator like yacc and friends, and its grammar syntax is largely the same as well (I had to change a lot of lines of the grammar file, but those changes were generally mechanical). It is also public domain, written in mostly-library-less C89, and intended to be included directly in the source tree of other projects, which means it shouldn't add any meaningful portability burden.
The features which make lemon compelling are these:
1. Generated lemon parsers are "push-style", instead of "pull-style". This means that rather than calling yyparse() and letting it "pull" tokens from the input as it needs, the caller instead takes tokens from the input itself, and then "pushes" those tokens into the parser in a loop until either the tokens run out or the parser signals that a full valid statement has been given. This means that input happens without any live parser code in the stack -- all parser state is encapsulated into a single parser object.
2. Lemon enables a high degree of control of the generated code. Generated logic from the grammar file is injected into a template file lempar.c which can be supplied by the caller. This means that the parser object from the first item can be inspected and even augmented with application-specific things -- for instance, garbage collection routines, which allow the parser to be Ref'd.
These two things together make it so the parser-calling bit of code in input.c can look, roughly, like
    Ref(Parser *, parser, mkparser());
    do {
        token = yylex();
        yyparse(parser, token);
    } while (parser->state == PARSE_CONTINUE);
    parsetree = parser->parsetree;
    RefEnd(parser);
and yylex() (and the fill functions that it calls to fetch input) can run with the GC enabled!
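To make the push-style shape concrete outside of es, here is a minimal, self-contained sketch in C. It uses neither lemon's generated code nor es's real lexer: every name in it (toy_parser, toy_lexer, toy_lex, toy_push, toy_parse, the TOK_* tokens) is hypothetical, and the "grammar" is only "words and balanced braces, ended by a newline at depth zero". The point is just that the driver loop fetches each token while no parser frame is on the stack, because everything the parser knows lives in one struct.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not lemon's or es's API. */
enum toy_token { TOK_WORD, TOK_LBRACE, TOK_RBRACE, TOK_NEWLINE, TOK_EOF };
enum toy_status { PARSE_CONTINUE, PARSE_DONE, PARSE_ERROR };

/* All parser state lives in this object, none of it in C stack frames. */
struct toy_parser {
    enum toy_status state;
    int depth; /* braces currently open */
    int words; /* words accepted so far */
};

/* The lexer owns the input cursor and knows nothing about the parser. */
struct toy_lexer {
    const char *cursor;
};

void toy_parser_init(struct toy_parser *p) {
    p->state = PARSE_CONTINUE;
    p->depth = 0;
    p->words = 0;
}

/* Fetch one token. In a shell, this is where input would be read. */
enum toy_token toy_lex(struct toy_lexer *lx) {
    while (*lx->cursor == ' ' || *lx->cursor == '\t')
        lx->cursor++;
    switch (*lx->cursor) {
    case '\0': return TOK_EOF;
    case '\n': lx->cursor++; return TOK_NEWLINE;
    case '{':  lx->cursor++; return TOK_LBRACE;
    case '}':  lx->cursor++; return TOK_RBRACE;
    default:
        /* Consume a run of ordinary characters as one word. */
        while (*lx->cursor != '\0' && *lx->cursor != ' ' &&
               *lx->cursor != '\t' && *lx->cursor != '\n' &&
               *lx->cursor != '{' && *lx->cursor != '}')
            lx->cursor++;
        return TOK_WORD;
    }
}

/* Push one token into the parser; it updates its own state and returns. */
void toy_push(struct toy_parser *p, enum toy_token tok) {
    assert(p != NULL && p->state == PARSE_CONTINUE);
    switch (tok) {
    case TOK_WORD:   p->words++; break;
    case TOK_LBRACE: p->depth++; break;
    case TOK_RBRACE:
        if (p->depth == 0) p->state = PARSE_ERROR;
        else p->depth--;
        break;
    case TOK_NEWLINE:
        /* A newline only ends the statement outside of braces. */
        if (p->depth == 0) p->state = PARSE_DONE;
        break;
    case TOK_EOF:
        p->state = (p->depth == 0) ? PARSE_DONE : PARSE_ERROR;
        break;
    }
}

/* Driver loop: the caller, not the parser, decides when to read input. */
enum toy_status toy_parse(const char *input, int *words_out) {
    struct toy_parser parser;
    struct toy_lexer lexer = { input };
    toy_parser_init(&parser);
    do {
        enum toy_token tok = toy_lex(&lexer); /* no parser frame is live here */
        toy_push(&parser, tok);
    } while (parser.state == PARSE_CONTINUE);
    if (words_out != NULL)
        *words_out = parser.words;
    return parser.state;
}
```

Calling toy_parse("if {a\nb}\n", &n) keeps looping past the newline inside the braces and only finishes at the final newline, which is the "keep pushing until a full statement has been given" behavior described above. In the real change, the parser object would additionally need to be visible to the GC, which this sketch does not attempt.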
I think this is very exciting. Interactive features in general have become a major Achilles' heel for es in the age of friendly, interactive shells. However, this is a big change, and switching from what is essentially The Standard parser generator to one that is significantly more obscure is a decision that should be made carefully. So, I'm curious what people think about it.