This repository has been archived by the owner on Oct 17, 2024. It is now read-only.

Scanner and contiguous sequence readers for symbolic expression lexer #29

Merged
merged 24 commits into from
Nov 14, 2018


@tealeg tealeg commented Nov 9, 2018

This PR defines the scanner for the symbolic expression language to be deployed in the Regula UI.

There are essentially 3 layers to this:

  • A Scanner struct that wraps an io.Reader in a bufio.Reader and uses it to read the stream one rune at a time (supporting backtracking as needed).
  • The scanner provides readRune and unreadRune primitives, which wrap the bufio.Reader methods ReadRune and UnreadRune - maintaining positional information as they go.
  • The Scanner.Scan method grabs the next rune and identifies it. If it is a simple control character (e.g. a parenthesis), it returns the token for that control character. If it implies a contiguous block, it hands the work off to a specialised scan function.
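For readers skimming the thread, the three layers above can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: the Token type, the token kind strings, scanSymbol and scanAll are hypothetical names, and the position is simplified to a rune offset.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
	"unicode"
)

// Token is a hypothetical token type; the PR's real one may differ.
type Token struct {
	Kind string // "LPAREN", "RPAREN", "SYMBOL" or "EOF"
	Text string
	Pos  int // 0-based rune offset of the token's first rune
}

// Scanner wraps an io.Reader in a bufio.Reader and tracks the rune offset.
type Scanner struct {
	r   *bufio.Reader
	pos int // number of runes consumed so far
}

func NewScanner(r io.Reader) *Scanner { return &Scanner{r: bufio.NewReader(r)} }

// readRune wraps bufio.Reader.ReadRune and advances the position.
func (s *Scanner) readRune() (rune, error) {
	rn, _, err := s.r.ReadRune()
	if err == nil {
		s.pos++
	}
	return rn, err
}

// unreadRune wraps bufio.Reader.UnreadRune (one level of backtracking).
func (s *Scanner) unreadRune() error {
	err := s.r.UnreadRune()
	if err == nil {
		s.pos--
	}
	return err
}

// Scan returns control characters directly and delegates contiguous blocks.
func (s *Scanner) Scan() (Token, error) {
	for {
		rn, err := s.readRune()
		if err == io.EOF {
			return Token{Kind: "EOF", Pos: s.pos}, nil
		}
		if err != nil {
			return Token{}, err
		}
		switch {
		case unicode.IsSpace(rn):
			continue // skip whitespace between tokens
		case rn == '(':
			return Token{Kind: "LPAREN", Text: "(", Pos: s.pos - 1}, nil
		case rn == ')':
			return Token{Kind: "RPAREN", Text: ")", Pos: s.pos - 1}, nil
		}
		// Anything else starts a contiguous block: push the rune back
		// and let the specialised function read the whole sequence.
		if err := s.unreadRune(); err != nil {
			return Token{}, err
		}
		return s.scanSymbol()
	}
}

// scanSymbol loops until it meets a rune that terminates the sequence.
func (s *Scanner) scanSymbol() (Token, error) {
	var b strings.Builder
	start := s.pos
	for {
		rn, err := s.readRune()
		if err == io.EOF {
			break
		}
		if err != nil {
			return Token{}, err
		}
		if unicode.IsSpace(rn) || rn == '(' || rn == ')' {
			// The terminator belongs to the next token.
			if err := s.unreadRune(); err != nil {
				return Token{}, err
			}
			break
		}
		b.WriteRune(rn)
	}
	return Token{Kind: "SYMBOL", Text: b.String(), Pos: start}, nil
}

// scanAll is a small helper that formats every token in src.
func scanAll(src string) ([]string, error) {
	s := NewScanner(strings.NewReader(src))
	var out []string
	for {
		tok, err := s.Scan()
		if err != nil || tok.Kind == "EOF" {
			return out, err
		}
		out = append(out, fmt.Sprintf("%s %q @%d", tok.Kind, tok.Text, tok.Pos))
	}
}

func main() {
	toks, err := scanAll("(add x)")
	fmt.Println(toks, err)
}
```

Note that bufio.Reader.UnreadRune only undoes the most recent ReadRune, so a scanner built this way gets exactly one rune of backtracking.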

The specialised scan functions make up the bulk of the work. Mostly they simply loop until they find a character that terminates the sequence. scanNumber, however, is somewhat more complex. I have chosen to handle the conflict between the use of - to indicate a negative number and its use as an operator in this part of the code. This decision makes the scanner more complicated, but simplifies things in the parser we will add later. We have to make one of these two layers somewhat "impure", so I chose to do it at the more fundamental of the two, meaning the parser can be implemented without this wart.
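One way such a disambiguation could look, as a sketch only: after consuming a -, peek at the next rune and treat the - as the start of a number when a digit or . follows, otherwise as an operator. The names scanMinus and firstToken and the token kind strings are assumptions for illustration (isNumber appears in the diff under review, but its body here is a guess), and the actual rule used in this PR may differ.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// isNumber reports whether rn is an ASCII digit (assumed definition).
func isNumber(rn rune) bool { return rn >= '0' && rn <= '9' }

// scanNumber collects digits and dots after an optional prefix.
// It deliberately does not check that the result is a valid number.
func scanNumber(r *bufio.Reader, prefix string) (string, error) {
	var b strings.Builder
	b.WriteString(prefix)
	for {
		rn, _, err := r.ReadRune()
		if err == io.EOF {
			return b.String(), nil
		}
		if err != nil {
			return "", err
		}
		if !(isNumber(rn) || rn == '.') {
			// The terminating rune belongs to the next token.
			return b.String(), r.UnreadRune()
		}
		b.WriteRune(rn)
	}
}

// scanMinus is called once a '-' has been consumed. It peeks at the
// next rune to decide between a negative number and an operator.
func scanMinus(r *bufio.Reader) (string, string, error) {
	rn, _, err := r.ReadRune()
	if err == io.EOF {
		return "OPERATOR", "-", nil
	}
	if err != nil {
		return "", "", err
	}
	if err := r.UnreadRune(); err != nil {
		return "", "", err
	}
	if isNumber(rn) || rn == '.' {
		text, err := scanNumber(r, "-")
		return "NUMBER", text, err
	}
	return "OPERATOR", "-", nil
}

// firstToken classifies the first token of src (numbers and '-' only).
func firstToken(src string) (string, string, error) {
	r := bufio.NewReader(strings.NewReader(src))
	rn, _, err := r.ReadRune()
	if err != nil {
		return "", "", err
	}
	if rn == '-' {
		return scanMinus(r)
	}
	if !(isNumber(rn) || rn == '.') {
		return "", "", fmt.Errorf("unexpected rune %q", rn)
	}
	if err := r.UnreadRune(); err != nil {
		return "", "", err
	}
	text, err := scanNumber(r, "")
	return "NUMBER", text, err
}

func main() {
	for _, src := range []string{"-12.5)", "- 3 4", "-x"} {
		kind, text, err := firstToken(src)
		fmt.Println(kind, text, err)
	}
}
```

With a rule like this, the parser only ever sees a NUMBER token or an OPERATOR token and never has to inspect a bare - itself.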

Fixes: #16

@tealeg tealeg added the enhancement New feature or request label Nov 9, 2018
@tealeg tealeg added this to the v0.6.0 milestone Nov 9, 2018
@tealeg tealeg self-assigned this Nov 9, 2018
@tealeg tealeg requested review from asdine and yaziine November 9, 2018 17:38
@asdine asdine left a comment

Awesome! Just a few comments

}

// Valid number parts are written to the buffer
if isNumber(rn) || rn == '.' {

If I understand correctly, this .3..14...1....5.....9..... won't trigger an error?


@asdine you do understand it correctly - we'll validate that it really is a valid number when we do type checking in the parser.
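As an illustration of what that later check might look like (the parser is not part of this PR, so this is purely an assumption), the scanned text could be handed to strconv.ParseFloat, which rejects input such as the example above. validateNumber is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strconv"
)

// validateNumber is a hypothetical parser-side check: the scanner passes
// along any run of digits and dots, and the text is validated here.
func validateNumber(text string) (float64, error) {
	f, err := strconv.ParseFloat(text, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid number literal %q: %w", text, err)
	}
	return f, nil
}

func main() {
	for _, text := range []string{"-12.5", ".3..14...1....5.....9....."} {
		f, err := validateNumber(text)
		fmt.Println(f, err)
	}
}
```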

@asdine asdine removed this from the v0.6.0 milestone Nov 12, 2018
@yaziine yaziine left a comment

Very nice! 💪👏👍

@tealeg tealeg merged commit 89cc00c into release-v0.6.0 Nov 14, 2018
@tealeg tealeg deleted the lexical-scanner branch November 14, 2018 08:41