Change inner loops to use int not YY_CHAR, removing need for separate NUL table #370
I am interested in creating scanners with %option noecs nometa-ecs. This removes several lookups from the scanner inner loop, so it may be a good compromise between compressed and full tables.
Also, I'm studying the table format closely, in order to teach myself how flex works (this was originally why I turned off equivalence classes, i.e. to simplify the tables to make them more human-readable).
In the process I noticed several things about the generated code.
Firstly, the "jam" state, i.e. the last N entries in the "yy_nxt" and "yy_chk" tables, contained an unused entry: it is generated with 257 transitions to itself rather than only 256, and the 257th could never be accessed (according to the inner loop code I saw). I wanted to remove this, not because it is a big space issue, but mainly because I found it a bit confusing and wanted to tighten things up.
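To illustrate why the 257th entry is unreachable, here is a minimal sketch. It assumes the inner loop indexes the tables with an 8-bit unsigned character type; `toy_char`, `TOY_JAM_ENTRIES` and `toy_reachable_entries()` are hypothetical names for illustration, not anything flex generates.

```c
#include <assert.h>
#include <stdint.h>

enum { TOY_JAM_ENTRIES = 257 };   /* self-transitions generated for the jam state */
typedef uint8_t toy_char;         /* stand-in for YY_CHAR, assumed to be 8 bits wide */

/* Count how many of the jam state's entries an 8-bit index can select. */
static int toy_reachable_entries(void)
{
    int seen[TOY_JAM_ENTRIES] = { 0 };
    int count = 0;

    for (int i = 0; i < TOY_JAM_ENTRIES; i++) {
        toy_char c = (toy_char)i;   /* 256 narrows to 0, so it aliases entry 0 */
        assert(count <= i);         /* cannot have seen more entries than tried */
        if (!seen[c]) {
            seen[c] = 1;
            count++;
        }
    }
    return count;
}
```

Under that assumption `toy_reachable_entries()` returns 256, so one of the 257 generated entries is never selected.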
Secondly, the code of the function yy_try_NUL_trans() was slightly unfortunate, as shown here:
We can see that the yy_is_jam variable is completely unnecessary in this particular combination. But looking at the code in "gen.c" which generates this routine, it is clear why it generates such code (since there are various options for the first and second block of code, interfaced by the yy_is_jam variable).
The code in yy_get_previous_state() is also not ideal, as it basically combines the code from the ordinary inner loop with the code from yy_try_NUL_trans(), via an if/else that is executed on each character.
All of these things occur because of a decision made in "nfa.c" about whether to generate the NUL transition table, which it wouldn't normally do when equivalence classes are in use, but it does in this case, to accommodate the fact that there are 0x101 characters including the end-of-buffer character.
In my opinion a better way is to make the inner loop able to use 0x101 characters directly, so that the NUL transitions can be stored in the ordinary transition tables (in the 0x100 spot, which is not ideal but there are justifiable reasons for it; perhaps a future pull request could add options to remove the end-of-buffer character and associated optimizations, for applications where simplicity is better than speed).
Indeed the comments in "nfa.c" suggest the same possibility, although it wasn't implemented at the time.
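As a rough sketch of the idea (not the code that "gen.c" actually emits), the toy example below uses a compressed base/def/nxt/chk lookup whose symbol argument is an int, so that one extra slot past the byte range can hold the NUL transitions. All names, the 5-symbol alphabet and the table contents are hypothetical; slot 4 stands in for the 0x100 slot, and mapping a real NUL byte to that slot is my assumption about how the lookup would be driven.

```c
#include <assert.h>

/* Hypothetical toy tables, not flex output: 3 states, 5 symbols.
 * Symbol 4 stands in for the extra 0x100 slot that holds NUL transitions. */
enum { TOY_NSTATES = 3, TOY_NSYM = 5, TOY_NUL_SLOT = 4, TOY_JAM = 0 };

static const int toy_base[TOY_NSTATES] = { 0, 5, 6 };
static const int toy_def[TOY_NSTATES]  = { TOY_JAM, TOY_JAM, TOY_JAM };
static const int toy_nxt[11] = { 0, 0, 0, 0, 0,  0, 2, 2,  0, 2,  0 };
static const int toy_chk[11] = { 0, 0, 0, 0, 0, -1, 1, 2, -1, 1, -1 };

/* Map an input byte to a table symbol (assumption: NUL uses the extra slot).
 * The toy alphabet only covers bytes 0..3. */
static int toy_symbol(unsigned char ch)
{
    assert(ch < TOY_NUL_SLOT);
    return ch ? (int)ch : TOY_NUL_SLOT;
}

/* One compressed-table step. Because c is an int, the extra slot needs no
 * special case and no separate NUL table. */
static int toy_next_state(int state, int c)
{
    assert(state >= 0 && state < TOY_NSTATES);
    assert(c >= 0 && c < TOY_NSYM);

    /* Fall back through default states until an entry owned by this state
     * is found; the jam state owns an entry for every symbol, so this ends. */
    while (toy_chk[toy_base[state] + c] != state)
        state = toy_def[state];
    return toy_nxt[toy_base[state] + c];
}
```

In this toy, state 1 moves to state 2 on both `toy_symbol(1)` and `toy_symbol(0)`, while state 2 jams on `toy_symbol(0)`. The NUL case goes through the same lookup as any other symbol, which is the point of widening the index type.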
So I went ahead and made the changes and it appears to work. I also checked the generated assembly code for the scanner inner loop before and after the change and it appears to be a slight improvement (whether or not equivalence classes are in use). I've listed the reason for this in a comment in "gen.c".
I've attached an example lex.yy.c before and after the change. You can "diff" them to see what has changed, but the main changes are in the yy_try_NUL_trans() and yy_get_previous_state() routines. I did this test on "scan.l" using the current release version of flex, not the latest development version, because I don't have the latest autotools handy. Please do re-test the pull request if it is accepted.
In the attached example files, we can see that the "jam" state is now at yy_chk[] location 30271 rather than 30174, reflecting that the states are now slightly larger, which costs about 100 words. On the other hand, the yy_NUL_trans[] array, which was previously 1113 words, is no longer required, which is a significant saving. The downside is that recovering from NULs in the input now requires a compressed table lookup, which is slower. If this were an issue, I'd suggest removing the end-of-buffer optimizations altogether.
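The size trade-off can be checked with a small calculation. This is only a sketch: `toy_net_saving_words()` is a hypothetical helper, and it uses the shift in the jam state's offset as the estimate of table growth, counted once.

```c
#include <assert.h>

/* Net words saved: the dropped NUL table minus the growth in the main
 * tables, with the growth estimated from how far the jam offset moved. */
static int toy_net_saving_words(int jam_before, int jam_after, int nul_table_words)
{
    assert(jam_after >= jam_before);
    int growth = jam_after - jam_before;
    return nul_table_words - growth;
}
```

With the figures above, `toy_net_saving_words(30174, 30271, 1113)` gives a growth of 97 words and a net saving of 1016 words. If the growth applies to both yy_nxt[] and yy_chk[], the cost would be roughly double, which is still well under the 1113 words removed.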
lex.yy.c.zip