Javascript: Implement ExceptionCollectorListener and make it default behaviour. #38

Merged 4 commits on Jul 8, 2024.
192 changes: 181 additions & 11 deletions cratedb_sqlparse_js/README.md
@@ -9,12 +9,12 @@
![NPM Unpacked Size](https://img.shields.io/npm/unpacked-size/@cratedb/cratedb-sqlparse)
![NPM Type Definitions](https://img.shields.io/npm/types/@cratedb/cratedb-sqlparse)


CrateDB SQL parser for JavaScript, generated with the ANTLR4 JavaScript target.

### Simple usage

```javascript
import {sqlparse} from "@cratedb/cratedb-sqlparse";

const query = `
SELECT * FROM SYS.SHARDS;
@@ -37,27 +37,197 @@ console.log(queries[0].original_query)
```

### CrateDB version

You can programmatically check the CrateDB version the package was compiled for; it is defined in `index.js`:

```javascript
import {__cratedb_version__} from "@cratedb/cratedb-sqlparse";

console.log(__cratedb_version__)
// 5.7.2
```

### Features
Currently, the parser supports a subset of the features of CrateDB's Java/ANTLR parser:

- First class CrateDB SQL dialect support.
- Input is case-insensitive.
- Native errors as exceptions or as objects.
- Dollar strings.
- Tables.
- Properties and parameterized properties.

### Exceptions and errors

By default, exceptions are stored in `statement.exception`.

```javascript
import {sqlparse} from "@cratedb/cratedb-sqlparse";

const query = `
SELECT COUNT(*) FROM doc.tbl f HERE f.id = 1;

INSERT INTO doc.tbl VALUES (1, 23, 4);
`
const statements = sqlparse(query)
const stmt = statements[0]

if (stmt.exception) {
    console.log(stmt.exception.errorMessage)
    // [line 2:31 mismatched input 'HERE' expecting {<EOF>, ';'}]

    console.log(stmt.exception.errorMessageVerbose)
    // SELECT COUNT(*) FROM doc.tbl f HERE f.id = 1;
    //                                ^^^^
    // INSERT INTO doc.tbl VALUES (1, 23, 4);
}

console.log(stmt.exception)

// ParseError: mismatched input 'HERE' expecting {<EOF>, ';'}
// at ExceptionCollectorListener.syntaxError (file:///home/surister/PycharmProjects/cratedb-sqlparse/cratedb_sqlparse_js/cratedb_sqlparse/parser.js:115:23)
// at file:///home/surister/PycharmProjects/cratedb-sqlparse/cratedb_sqlparse_js/node_modules/antlr4/dist/antlr4.node.mjs:1:42125
// at Array.map (<anonymous>)
// at wt.syntaxError (file:///home/surister/PycharmProjects/cratedb-sqlparse/cratedb_sqlparse_js/node_modules/antlr4/dist/antlr4.node.mjs:1:42115)
// at SqlBaseParser.notifyErrorListeners (file:///home/surister/PycharmProjects/cratedb-sqlparse/cratedb_sqlparse_js/node_modules/antlr4/dist/antlr4.node.mjs:1:102085)
// at Ce.reportInputMismatch (file:///home/surister/PycharmProjects/cratedb-sqlparse/cratedb_sqlparse_js/node_modules/antlr4/dist/antlr4.node.mjs:1:90577)
// at Ce.reportError (file:///home/surister/PycharmProjects/cratedb-sqlparse/cratedb_sqlparse_js/node_modules/antlr4/dist/antlr4.node.mjs:1:88813)
// at SqlBaseParser.statements (file:///home/surister/PycharmProjects/cratedb-sqlparse/cratedb_sqlparse_js/cratedb_sqlparse/generated_parser/SqlBaseParser.js:1345:28)
// at sqlparse (file:///home/surister/PycharmProjects/cratedb-sqlparse/cratedb_sqlparse_js/cratedb_sqlparse/parser.js:207:25)
// at file:///home/surister/PycharmProjects/cratedb-sqlparse/cratedb_sqlparse_js/t.js:4:14 {
// query: 'SELECT COUNT(*) FROM doc.tbl f HERE',
// msg: "mismatched input 'HERE' expecting {<EOF>, ';'}",
// offendingToken: bt {
// source: [ [SqlBaseLexer], [CaseInsensitiveStream] ],
// type: 322,
// channel: 0,
// start: 32,
// stop: 35,
// tokenIndex: 16,
// line: 2,
// column: 31,
// _text: null
// },
// line: 2,
// column: 31,
// errorMessage: "[line 2:31 mismatched input 'HERE' expecting {<EOF>, ';'}]",
// errorMessageVerbose: '\n' +
// 'SELECT COUNT(*) FROM doc.tbl f HERE f.id = 1;\n' +
// ' ^^^^\n' +
// '\n' +
// 'INSERT INTO doc.tbl VALUES (1, 23, 4);\n'
// }
```
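The caret line in `errorMessageVerbose` can be derived from the token position shown in the dump above. The following is a minimal sketch of that idea, not the package's implementation; `underlineToken` is a hypothetical helper, and the column and length values are taken from the example output (`column: 31`, `start: 32`, `stop: 35`).

```javascript
// Hypothetical helper (not part of @cratedb/cratedb-sqlparse): returns the
// query line followed by a line of carets under the offending token.
function underlineToken(queryLine, column, tokenLength) {
    const carets = " ".repeat(column) + "^".repeat(tokenLength);
    return queryLine + "\n" + carets;
}

const failingLine = "SELECT COUNT(*) FROM doc.tbl f HERE f.id = 1;";

// Column 31 and a 4-character token match the dump above.
const excerpt = underlineToken(failingLine, 31, 4);
console.log(excerpt);
```

The token length is `stop - start + 1`, which is why the four characters of `HERE` produce four carets.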

In some situations, you might want `sqlparse` to throw an error instead.

To do so, set `raise_exception` to `true`:

```javascript
import {sqlparse} from "@cratedb/cratedb-sqlparse";

let stmt = sqlparse('SELECT COUNT(*) FROM doc.tbl f WHERE .id = 1;', true);

// throw new ParseError(
// ^
//
// ParseError: no viable alternative at input 'SELECT COUNT(*) FROM doc.tbl f WHERE .'
```

Catch the exception:

```javascript
import {sqlparse} from "@cratedb/cratedb-sqlparse";

try {
    sqlparse('SELECT COUNT(*) FROM doc.tbl f WHERE .id = 1;', true)
} catch (e) {
    console.log(e)
}
```

> [!NOTE]
> `sqlparse` only raises the first exception it finds, even if you pass in several statements.
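
If throwing is enabled, callers may want to turn the exception back into a value. The sketch below shows one way to do that under stated assumptions: `parseOrError` is a hypothetical wrapper, and `fakeParse` is a stub that only imitates the throwing behaviour described above so the example runs without the package installed. In real code you would pass `sqlparse` instead.

```javascript
// Hypothetical wrapper: calls the given parse function with throwing enabled
// and returns either the statements or the first error that was raised.
function parseOrError(parseFn, query) {
    try {
        return {statements: parseFn(query, true), error: null};
    } catch (e) {
        return {statements: [], error: e};
    }
}

// Stub for illustration only; it is NOT the real parser.
function fakeParse(query, raiseException) {
    if (raiseException && query.includes("WHERE .")) {
        throw new Error("no viable alternative at input");
    }
    return [{query}];
}

const goodResult = parseOrError(fakeParse, "SELECT 1;");
const badResult = parseOrError(fakeParse, "SELECT COUNT(*) FROM doc.tbl f WHERE .id = 1;");

console.log(goodResult.error, badResult.error.message);
```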

### Query metadata

Query metadata can be read from `statement.metadata`:

```javascript
import {sqlparse} from "@cratedb/cratedb-sqlparse";

const stmt = sqlparse("SELECT A, B FROM doc.tbl12")[0]

console.log(stmt.metadata);

// Metadata {
// tables: [ Table { name: 'tbl12', schema: 'doc' } ],
// parameterizedProperties: {},
// withProperties: {}
// }

```

#### Query properties

Properties defined within a `WITH` clause are available in `statement.metadata.withProperties`:

```javascript
import {sqlparse} from "@cratedb/cratedb-sqlparse";


const stmt = sqlparse(`
CREATE TABLE doc.tbl12 (A TEXT) WITH (
"allocation.max_retries" = 5,
"blocks.metadata" = false
);
`)[0]

console.log(stmt.metadata);

// Metadata {
// tables: [ Table { name: 'tbl12', schema: 'doc' } ],
// parameterizedProperties: {},
// withProperties: { 'allocation.max_retries': '5', 'blocks.metadata': 'false' }
// }
```
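
In the output above, property values are shown as strings (`'5'`, `'false'`). If typed values are needed, a caller could convert them afterwards. This is a minimal sketch of such a conversion and an assumption about what a caller might want; `coercePropertyValue` is a hypothetical helper, not part of the package.

```javascript
// Hypothetical helper: converts a string property value to a boolean or
// number where the text clearly represents one, otherwise returns it as-is.
function coercePropertyValue(value) {
    if (value === "true") return true;
    if (value === "false") return false;
    if (/^-?\d+(\.\d+)?$/.test(value)) return Number(value);
    return value;
}

// Same shape as `withProperties` in the example output above.
const rawProperties = {"allocation.max_retries": "5", "blocks.metadata": "false"};

const typedProperties = Object.fromEntries(
    Object.entries(rawProperties).map(([key, value]) => [key, coercePropertyValue(value)])
);

console.log(typedProperties);
```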

#### Table name
```javascript
console.log(stmt.metadata.tables)
// [ Table { name: 'tbl12', schema: 'doc' } ]

const table = stmt.metadata.tables[0]

console.log(table.schema, table.name, table.fqn)
// doc tbl12 "doc"."tbl12"
```
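
The relationship between `schema`, `name` and `fqn` can be sketched as follows. This mirrors the splitting done in `visitTableName` in this PR, but it is an illustration rather than the package's `Table` class; `splitQualifiedName` is a hypothetical name, and the `fqn` for a table without a schema is an assumption.

```javascript
// Hypothetical helper: splits "schema.table" into its parts and rebuilds a
// quoted fully qualified name such as "doc"."tbl12".
function splitQualifiedName(qualifiedName) {
    const parts = qualifiedName.replaceAll('"', "").split(".");
    const schema = parts.length === 1 ? null : parts[0];
    const name = parts.length === 1 ? parts[0] : parts[1];
    // Assumption: without a schema, only the quoted table name is used.
    const fqn = (schema ? `"${schema}".` : "") + `"${name}"`;
    return {schema, name, fqn};
}

const qualified = splitQualifiedName("doc.tbl12");
const unqualified = splitQualifiedName("tbl12");

console.log(qualified.schema, qualified.name, qualified.fqn);
console.log(unqualified.schema, unqualified.name, unqualified.fqn);
```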

#### Parameterized properties

Parameterized properties are properties without a concrete value, marked with a dollar placeholder such as `$1`. They are available in `statement.metadata.parameterizedProperties`:

```javascript
import {sqlparse} from "@cratedb/cratedb-sqlparse";

const stmt = sqlparse(`
CREATE TABLE doc.tbl12 (A TEXT) WITH (
"allocation.max_retries" = 5,
"blocks.metadata" = $1
);
`)[0]

console.log(stmt.metadata)

// Metadata {
// tables: [ Table { name: 'tbl12', schema: 'doc', fqn: '"doc"."tbl12"' } ],
// parameterizedProperties: { 'blocks.metadata': '$1' },
// withProperties: { 'allocation.max_retries': '5', 'blocks.metadata': '$1' }
// }
```

In this case, `blocks.metadata` appears in both `withProperties` and `parameterizedProperties`.

For a value to be picked up as a parameter, it must start with a dollar sign `'$'` followed only by digits, e.g. `'$1'` or `'$123'`.
`'$123abc'` would not be valid.
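
The rule above can be expressed as a single regular expression. This is a minimal sketch of the rule, not the code used by the package; `isParameterPlaceholder` is a hypothetical name. Note that the pattern requires at least one digit after the dollar sign, so a bare `'$'` is rejected.

```javascript
// Hypothetical helper: true for '$' followed by one or more digits only.
function isParameterPlaceholder(value) {
    return /^\$\d+$/.test(value);
}

const placeholderChecks = ["$1", "$123", "$123abc", "$", "5"].map(
    candidate => [candidate, isParameterPlaceholder(candidate)]
);

console.log(placeholderChecks);
```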
98 changes: 98 additions & 0 deletions cratedb_sqlparse_js/cratedb_sqlparse/AstBuilder.js
@@ -0,0 +1,98 @@
import SqlBaseParserVisitor from "./generated_parser/SqlBaseParserVisitor.js";
import SqlBaseParser from "./generated_parser/SqlBaseParser.js";
import {Statement} from "./parser.js"
import {Table} from "./models.js"


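/**
 * Checks whether `text` is non-empty and consists only of ASCII digits.
 *
 * @param {string} text
 * @returns {boolean}
 */
function isDigit(text) {
    // Reject the empty string, so that a bare '$' is not treated as a parameter.
    return text.length > 0 && text.split('').every(char => char >= '0' && char <= '9');
}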


export class AstBuilder extends SqlBaseParserVisitor {
    // The class implements the antlr4 visitor pattern, similar to how we do it in CrateDB:
    // https://github.com/crate/crate/blob/master/libs/sql-parser/src/main/java/io/crate/sql/parser/AstBuilder.java
    //
    // The biggest difference is that in CrateDB's `AstBuilder`, visitor methods
    // return a specialized Statement visitor.
    //
    // Sqlparse just extracts whatever data it needs from the context and injects it into the
    // currently visited statement, enriching its metadata.

    /**
     * Returns the text of a node with single and double quotes removed.
     *
     * @param {Object} node
     * @returns {(string|null)}
     */
    getText(node) {
        if (node) {
            return node.getText().replaceAll("'", "").replaceAll('"', "")
        }
        return null
    }

    /**
     * Visits the statement's parse tree and fills in its metadata.
     *
     * @param {Statement} stmt
     */
    enrich(stmt) {
        this.stmt = stmt
        this.visit(this.stmt.ctx)
    }

    /**
     * Records the visited table (and its schema, if present) in the metadata.
     *
     * @param {SqlBaseParser.TableNameContext} ctx
     */
    visitTableName(ctx) {
        const fqn = ctx.qname()
        const parts = this.getText(fqn).split(".")

        let schema = null
        let name = null
        if (parts.length === 1) {
            name = parts[0]
        } else {
            schema = parts[0]
            name = parts[1]
        }

        this.stmt.metadata.tables.push(
            new Table(name, schema)
        )
    }

    /**
     * Records `WITH` properties and, separately, those with parameterized values.
     *
     * @param {SqlBaseParser.GenericPropertiesContext} ctx
     */
    visitGenericProperties(ctx) {
        const nodeProperties = ctx.genericProperty()
        const properties = {}
        const parameterizedProperties = {}

        for (const property of nodeProperties) {
            const key = this.getText(property.ident())
            const value = this.getText(property.expr())

            properties[key] = value

            if (value && value[0] === '$') {
                // It might be a parameterized value, e.g. '$1'
                if (isDigit(value.slice(1))) {
                    parameterizedProperties[key] = value
                }
            }
        }

        // Assign once, after all properties have been collected.
        this.stmt.metadata.withProperties = properties
        this.stmt.metadata.parameterizedProperties = parameterizedProperties
    }

}