Introduce `BadNode` and `MultiError` #230

makenowjust · 2024-12-18T07:18:30Z

This PR introduces BadNode and MultiError.

BadNode is a placeholder node for a source fragment containing syntax errors.
MultiError is an error consisting of a list of *memefish.Error.
With this change, the parser can report multiple errors at once and still return an AST node when errors are reported.

Example:

$ go run ./tools/parse/main.go 'select (1 +) * (% 2)'
--- Error
syntax error: :1:12: unexpected token: )
  1|  select (1 +) * (% 2)
   |             ^
syntax error: :1:17: unexpected token: %
  1|  select (1 +) * (% 2)
   |                  ^

--- AST
&ast.QueryStatement{
  Query: &ast.Select{
    Results: []ast.SelectItem{
      &ast.ExprSelectItem{
        Expr: &ast.BinaryExpr{
          Op:   "*",
          Left: &ast.ParenExpr{
            Lparen: 7,
            Rparen: 11,
            Expr:   &ast.BadExpr{
              BadNode: &ast.BadNode{
                NodePos: 8,
                NodeEnd: 11,
                Tokens:  []*token.Token{
                  &token.Token{
                    Kind: "<int>",
                    Raw:  "1",
                    Base: 10,
                    Pos:  8,
                    End:  9,
                  },
                  &token.Token{
                    Kind:  "+",
                    Space: " ",
                    Raw:   "+",
                    Pos:   10,
                    End:   11,
                  },
                },
              },
            },
          },
          Right: &ast.ParenExpr{
            Lparen: 15,
            Rparen: 19,
            Expr:   &ast.BadExpr{
              BadNode: &ast.BadNode{
                NodePos: 16,
                NodeEnd: 19,
                Tokens:  []*token.Token{
                  &token.Token{
                    Kind: "%",
                    Raw:  "%",
                    Pos:  16,
                    End:  17,
                  },
                  &token.Token{
                    Kind:  "<int>",
                    Space: " ",
                    Raw:   "2",
                    Base:  10,
                    Pos:   18,
                    End:   19,
                  },
                },
              },
            },
          },
        },
      },
    },
  },
}

--- SQL
SELECT (1 +) * (% 2)
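
The two `syntax error` entries in the output above are reported together as one error value. As a rough illustration of that idea, here is a minimal sketch of an aggregating error type. The `Error` struct, its fields, and the message format are assumptions modeled on the output shown, not memefish's actual definitions.

```go
package main

import (
	"fmt"
	"strings"
)

// Error is a hypothetical stand-in for *memefish.Error:
// a single syntax error with its position.
type Error struct {
	Line, Column int
	Message      string
}

func (e *Error) Error() string {
	return fmt.Sprintf("syntax error: :%d:%d: %s", e.Line, e.Column, e.Message)
}

// MultiError is a list of errors that itself satisfies the error interface.
// This is an illustrative sketch, not the type introduced by the PR.
type MultiError []*Error

func (m MultiError) Error() string {
	msgs := make([]string, len(m))
	for i, e := range m {
		msgs[i] = e.Error()
	}
	return strings.Join(msgs, "\n")
}

func main() {
	var err error = MultiError{
		{Line: 1, Column: 12, Message: "unexpected token: )"},
		{Line: 1, Column: 17, Message: "unexpected token: %"},
	}
	fmt.Println(err)
}
```

Because the list implements `error`, a caller that only checks `err != nil` keeps working, while a caller that wants every error can type-assert to the list type and iterate.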

apstndb · 2024-12-18T08:53:17Z

parser.go

+	l := p.Lexer.Clone()
+	defer func() {
+		if r := recover(); r != nil {
+			// When parsing is failed on tryParseHint or tryParseWith, the result of these methods are discarded


I'm concerned that discarding hints and CTEs means the input statement can't be recovered by SQL(). I don't think that satisfies users' needs. I believe users expect parse-unparse to be a safe operation even when the result contains a BadNode.

Parsing results are discarded, but the contents (tokens) are preserved. Therefore, your concern is not a problem.

I mean, parse-unparse is still safe after this PR.

Thank you for the explanation.
I understand now that every handleParse*Error consumes the tokens between l.Token.Pos and its stop position.
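
To make the round-trip argument concrete: the AST dump in the PR description shows each token in a BadNode carrying its `Raw` text and the `Space` that preceded it. A minimal sketch of how a fragment could be re-emitted from those tokens follows. The types are trimmed-down stand-ins, and dropping the first token's leading whitespace is an assumption, not necessarily what memefish's `SQL()` does.

```go
package main

import (
	"fmt"
	"strings"
)

// Token keeps only the two fields from the dump that matter for unparsing.
type Token struct {
	Space string // whitespace that preceded the token in the source
	Raw   string // token text exactly as written
}

// BadNode is a hypothetical, simplified placeholder for an unparsable fragment.
type BadNode struct {
	Tokens []*Token
}

// SQL rebuilds the fragment from its tokens.
// Assumption: the leading whitespace of the first token is not emitted.
func (b *BadNode) SQL() string {
	var sb strings.Builder
	for i, t := range b.Tokens {
		if i > 0 {
			sb.WriteString(t.Space)
		}
		sb.WriteString(t.Raw)
	}
	return sb.String()
}

func main() {
	bad := &BadNode{Tokens: []*Token{
		{Raw: "1"},
		{Space: " ", Raw: "+"},
	}}
	fmt.Println("(" + bad.SQL() + ")") // prints "(1 +)"
}
```

Since nothing in the token list is thrown away, the text of a discarded hint or CTE can still be reproduced even though no structured node was built for it.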

apstndb · 2024-12-20T05:50:54Z

I feel like this specification hasn't been fully discussed, and I have some personal concerns about the current behavior.

As a memefish contributor, it's clear that this change introduces new complexity. I believe there should be a strong enough benefit to justify introducing this complexity.

Furthermore, as a developer of tools that use memefish, I don't find the current specification sufficiently useful. Because BadNode implements too many interfaces, it cannot be used effectively for type assertions. In fact, it has the potential to introduce bugs, so I would likely choose to avoid using BadNode altogether.

My suggestion is to break BadNode down into smaller, more specific types, each implementing only the interface it stands in for. If we did this, it would be genuinely more useful than before this PR, and I might actually use it going forward.
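
The type-assertion concern can be illustrated with a small sketch. The marker interfaces and method names below are hypothetical and much simpler than memefish's `ast` package. The point is only that a wrapper implementing a single interface lets a type switch tell a bad expression from a bad statement.

```go
package main

import "fmt"

// Hypothetical marker interfaces, standing in for the AST node categories.
type Node interface{ isNode() }

type Expr interface {
	Node
	isExpr()
}

type Statement interface {
	Node
	isStatement()
}

// BadNode holds the shared payload and is only a generic Node.
type BadNode struct{ Raw string }

func (*BadNode) isNode() {}

// BadExpr wraps BadNode and implements Expr, but not Statement.
type BadExpr struct{ *BadNode }

func (*BadExpr) isExpr() {}

// BadStatement wraps BadNode and implements Statement, but not Expr.
type BadStatement struct{ *BadNode }

func (*BadStatement) isStatement() {}

// describe reports which category a node belongs to.
func describe(n Node) string {
	switch n.(type) {
	case Statement:
		return "statement"
	case Expr:
		return "expression"
	default:
		return "node"
	}
}

func main() {
	fmt.Println(describe(&BadExpr{&BadNode{Raw: "1 +"}}))
	fmt.Println(describe(&BadStatement{&BadNode{Raw: "selec 1"}}))
}
```

If a single placeholder type implemented both `Expr` and `Statement`, the first `case` would match every bad node regardless of where it appeared, which is the kind of bug described above.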

PoC branch: feature/bad-node...apstndb:memefish:feature/poc-bad-nodes

Typical code for my use case:
feature/bad-node HEAD: https://go.dev/play/p/iOLRu_m1q3F
apstndb/feature/bad-nodes HEAD: https://go.dev/play/p/lsXS1D-S8Q3
master HEAD: https://go.dev/play/p/OydF15UWANp

apstndb · 2024-12-20T06:26:29Z

Yeah, I may prefer ParseStatements() rather than ParseStatement() for this use case.

feature/bad-node HEAD: https://go.dev/play/p/P0SgHl6vUGg
apstndb/feature/poc-bad-node HEAD: https://go.dev/play/p/jnujP1DM2vo

makenowjust force-pushed the feature/bad-node branch from 393232a to d14fe1b Compare December 18, 2024 07:34

apstndb reviewed Dec 18, 2024

makenowjust force-pushed the feature/bad-node branch from ec86532 to 20e2973 Compare December 18, 2024 10:46

makenowjust added a commit that referenced this pull request Dec 20, 2024

Add Bad{Statement,QueryExpr,Expr,Type,DDL,DML} ASTs

2c0d969

#230 (comment) Co-Authored-By: apstndb <[email protected]>

makenowjust force-pushed the feature/bad-node branch from 20e2973 to 2c0d969 Compare December 20, 2024 10:52

makenowjust added a commit that referenced this pull request Dec 20, 2024

Add Bad{Statement,QueryExpr,Expr,Type,DDL,DML} ASTs

1f8072c

#230 (comment) Co-Authored-By: apstndb <[email protected]>

makenowjust force-pushed the feature/bad-node branch from 8dbc8c8 to 748f3e0 Compare December 20, 2024 12:30

makenowjust changed the title ~~Introduce BadNode and ErrorList~~ Introduce BadNode and MultiError Dec 21, 2024

makenowjust and others added 11 commits December 21, 2024 09:38

Introduce BadNode and ErrorList

b1a61cb

Fix ErrorList.String

e58aab4

Fix test to expect parsing errors for testdata/*/bad_*.sql

7c55114

Use direct returns instead

255fbd3

#189 (comment)

Make the bad_ prefix more evil

b13dd34

Store tokens in BadNode instead of a raw string

2cc7998

Rename noError to noPanic

f2ffbaf

Add Bad{Statement,QueryExpr,Expr,Type,DDL,DML} ASTs

5ee624d

#230 (comment) Co-Authored-By: apstndb <[email protected]>

Add the doc about error recovering

3fc924f

Rename ErrorList to MultiError and improve error messages

085b0cd

Shorten error message more

3954dc9

makenowjust force-pushed the feature/bad-node branch from c162227 to 3954dc9 Compare December 21, 2024 00:39

makenowjust merged commit 6f345ac into main Dec 21, 2024
4 checks passed

makenowjust deleted the feature/bad-node branch December 21, 2024 00:42
