Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add related source code locations to errors #13664

Merged
merged 39 commits into from
Feb 1, 2025
Merged
Changes from 1 commit
Commits
Show all changes
39 commits
Select commit Hold shift + click to select a range
13ad9db
Reset branch to no changes other than SqlRel
eliaperantoni Jan 16, 2025
4603b35
feat: improve `Diagnostic` ergonomics
eliaperantoni Jan 16, 2025
86e1722
feat: 'table not found' diagnostic in `SqlToRel`
eliaperantoni Jan 16, 2025
c3532c1
feat: unresolved fields
eliaperantoni Jan 17, 2025
91f4e3c
feat: union with different number of columns
eliaperantoni Jan 17, 2025
85db8af
feat: `Spans`
eliaperantoni Jan 23, 2025
3f4ddd2
feat: adjust field not found error
eliaperantoni Jan 23, 2025
b4df859
feat: incompatible types in binary expr
eliaperantoni Jan 23, 2025
e4652d7
fix: tests not compiling
eliaperantoni Jan 23, 2025
d2e70ad
feat: ambiguous column
eliaperantoni Jan 23, 2025
e44cb9f
feat: possible references
eliaperantoni Jan 23, 2025
31772ed
feat: group by
eliaperantoni Jan 23, 2025
c15d8f5
chore: fmt
eliaperantoni Jan 23, 2025
09d7065
feat: relay `Spans` in `create_col_from_scalar_expr`
eliaperantoni Jan 23, 2025
1d5c647
chore: cargo check
eliaperantoni Jan 23, 2025
5fdc10e
chore: cargo fmt
eliaperantoni Jan 23, 2025
f63a493
Merge branch 'main' into eper/diagnostics
eliaperantoni Jan 23, 2025
6f6722d
feat: diagnostic can only be either error or warning
eliaperantoni Jan 24, 2025
9a8ef78
feat: simplify to one `Diagnostic` per error
eliaperantoni Jan 24, 2025
fd46dce
test: add tests
eliaperantoni Jan 24, 2025
360ac20
chore: add license headers
eliaperantoni Jan 24, 2025
3dd66fc
chore: update CLI lockfile
eliaperantoni Jan 24, 2025
13c4c9d
chore: taplo format
eliaperantoni Jan 24, 2025
fe74dce
fix: doctest
eliaperantoni Jan 24, 2025
829430a
chore: configs.md
eliaperantoni Jan 24, 2025
2772c6b
fix: test
eliaperantoni Jan 24, 2025
1b35207
fix: clippy, by removing smallvec
eliaperantoni Jan 24, 2025
243b788
fix: update slt
eliaperantoni Jan 24, 2025
e98694b
Merge branch 'main' into eper/diagnostics
eliaperantoni Jan 24, 2025
e847477
feat: support all binary expressions
eliaperantoni Jan 24, 2025
60a9f8b
refactor: move diagnostic tests to datafusion-sql
eliaperantoni Jan 27, 2025
f178db0
chore: format
eliaperantoni Jan 27, 2025
1308b85
Merge branch 'main' into eper/diagnostics
eliaperantoni Jan 27, 2025
51dd141
fix: tests
eliaperantoni Jan 27, 2025
b5e326c
feat: pr feedback
eliaperantoni Jan 28, 2025
1cd0f3e
fix: ci checks
eliaperantoni Jan 28, 2025
20e9f61
Merge remote-tracking branch 'apache/main' into eper/diagnostics
alamb Jan 29, 2025
e1e6ac5
Merge remote-tracking branch 'apache/main' into eper/diagnostics
alamb Feb 1, 2025
7265bd2
Merge remote-tracking branch 'apache/main' into eper/diagnostics
alamb Feb 1, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
fix: update slt
eliaperantoni committed Jan 24, 2025
commit 243b7883c8f25e14e43cda77c5f2a556c2efa44a
14 changes: 7 additions & 7 deletions datafusion/sqllogictest/test_files/create_external_table.slt
Original file line number Diff line number Diff line change
@@ -33,23 +33,23 @@ statement error DataFusion error: SQL error: ParserError\("Missing LOCATION clau
CREATE EXTERNAL TABLE t STORED AS CSV

# Option value is missing
statement error DataFusion error: SQL error: ParserError\("Expected: string or numeric value, found: \)"\)
statement error DataFusion error: SQL error: ParserError\("Expected: string or numeric value, found: \) at Line: 1, Column: 66"\)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats really cool, does that work for multiline statements?

Copy link
Contributor

@comphead comphead Jan 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking should be that parser responsibility rather than the runtime engine's 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it does work for multiline statements, and it's the parser that outputs that error.

WITH cte1 AS (
    SELECT 1 AS id
), cte2 AS (
    SELECT 1 AS id
),
SELECT id FROM cte1, cte2, users
-- SQL error: ParserError("Expected: AS, found: id at Line: 6, Column: 8")

CREATE EXTERNAL TABLE t STORED AS x OPTIONS ('k1' 'v1', k2 v2, k3) LOCATION 'blahblah'

# Missing `(` in WITH ORDER clause
statement error DataFusion error: SQL error: ParserError\("Expected: \(, found: c1"\)
statement error DataFusion error: SQL error: ParserError\("Expected: \(, found: c1 at Line: 1, Column: 58"\)
CREATE EXTERNAL TABLE t(c1 int) STORED AS CSV WITH ORDER c1 LOCATION 'foo.csv'

# Missing `)` in WITH ORDER clause
statement error DataFusion error: SQL error: ParserError\("Expected: \), found: LOCATION"\)
statement error DataFusion error: SQL error: ParserError\("Expected: \), found: LOCATION at Line: 1, Column: 62"\)
CREATE EXTERNAL TABLE t(c1 int) STORED AS CSV WITH ORDER (c1 LOCATION 'foo.csv'

# Missing `ROW` in WITH HEADER clause
statement error DataFusion error: SQL error: ParserError\("Expected: ROW, found: LOCATION"\)
statement error DataFusion error: SQL error: ParserError\("Expected: ROW, found: LOCATION at Line: 1, Column: 51"\)
CREATE EXTERNAL TABLE t STORED AS CSV WITH HEADER LOCATION 'abc'

# Missing `BY` in PARTITIONED clause
statement error DataFusion error: SQL error: ParserError\("Expected: BY, found: LOCATION"\)
statement error DataFusion error: SQL error: ParserError\("Expected: BY, found: LOCATION at Line: 1, Column: 51"\)
CREATE EXTERNAL TABLE t STORED AS CSV PARTITIONED LOCATION 'abc'

# Duplicate `STORED AS` clause
@@ -69,11 +69,11 @@ statement error DataFusion error: SQL error: ParserError\("OPTIONS specified mor
CREATE EXTERNAL TABLE t STORED AS CSV OPTIONS ('k1' 'v1', 'k2' 'v2') OPTIONS ('k3' 'v3') LOCATION 'foo.csv'

# With typo error
statement error DataFusion error: SQL error: ParserError\("Expected: HEADER, found: HEAD"\)
statement error DataFusion error: SQL error: ParserError\("Expected: HEADER, found: HEAD at Line: 1, Column: 52"\)
CREATE EXTERNAL TABLE t(c1 int) STORED AS CSV WITH HEAD ROW LOCATION 'foo.csv';

# Missing `anything` in WITH clause
statement error DataFusion error: SQL error: ParserError\("Expected: HEADER, found: LOCATION"\)
statement error DataFusion error: SQL error: ParserError\("Expected: HEADER, found: LOCATION at Line: 1, Column: 52"\)
CREATE EXTERNAL TABLE t(c1 int) STORED AS CSV WITH LOCATION 'foo.csv';

# Unrecognized random clause
2 changes: 1 addition & 1 deletion datafusion/sqllogictest/test_files/distinct_on.slt
Original file line number Diff line number Diff line change
@@ -153,7 +153,7 @@ b 1 29 -18218 994303988 5983957848665088916 204 9489 3275293996 1485709125918647
c 2 1 18109 2033001162 -6513304855495910254 25 43062 1491205016 5863949479783605708 0.110830784 0.929409733247 6WfVFBVGJSQb7FhA7E0lBwdvjfZnSW

# can't distinct on *
query error DataFusion error: SQL error: ParserError\("Expected: an expression, found: \*"\)
query error DataFusion error: SQL error: ParserError\("Expected: an expression, found: \* at Line: 1, Column: 21"\)
SELECT DISTINCT ON (*) c1 FROM aggregate_test_100 ORDER BY c1 LIMIT 3;


2 changes: 2 additions & 0 deletions datafusion/sqllogictest/test_files/information_schema.slt
Original file line number Diff line number Diff line change
@@ -257,6 +257,7 @@ datafusion.optimizer.repartition_sorts true
datafusion.optimizer.repartition_windows true
datafusion.optimizer.skip_failed_rules false
datafusion.optimizer.top_down_join_key_reordering true
datafusion.sql_parser.collect_spans false
datafusion.sql_parser.dialect generic
datafusion.sql_parser.enable_ident_normalization true
datafusion.sql_parser.enable_options_value_normalization false
@@ -351,6 +352,7 @@ datafusion.optimizer.repartition_sorts true Should DataFusion execute sorts in a
datafusion.optimizer.repartition_windows true Should DataFusion repartition data using the partitions keys to execute window functions in parallel using the provided `target_partitions` level
datafusion.optimizer.skip_failed_rules false When set to true, the logical plan optimizer will produce warning messages if any optimization rules produce errors and then proceed to the next rule. When set to false, any rules that produce errors will cause the query to fail
datafusion.optimizer.top_down_join_key_reordering true When set to true, the physical plan optimizer will run a top down process to reorder the join keys
datafusion.sql_parser.collect_spans false When set to true, the source locations relative to the original SQL query (i.e. [`Span`](sqlparser::tokenizer::Span)) will be collected and recorded in the logical plan nodes.
datafusion.sql_parser.dialect generic Configure the SQL dialect used by DataFusion's parser; supported values include: Generic, MySQL, PostgreSQL, Hive, SQLite, Snowflake, Redshift, MsSQL, ClickHouse, BigQuery, and Ansi.
datafusion.sql_parser.enable_ident_normalization true When set to true, SQL parser will normalize ident (convert ident to lowercase when not quoted)
datafusion.sql_parser.enable_options_value_normalization false When set to true, SQL parser will normalize options value (convert value to lowercase). Note that this option is ignored and will be removed in the future. All case-insensitive values are normalized automatically.
2 changes: 1 addition & 1 deletion datafusion/sqllogictest/test_files/insert.slt
Original file line number Diff line number Diff line change
@@ -431,5 +431,5 @@ NULL 20 500 default_text
statement ok
drop table test_column_defaults

statement error DataFusion error: Error during planning: Column reference is not allowed in the DEFAULT expression : Schema error: No field named a.
statement error DataFusion error: Schema error: No field named a\.
create table test_column_defaults(a int, b int default a+1)
2 changes: 1 addition & 1 deletion datafusion/sqllogictest/test_files/insert_to_external.slt
Original file line number Diff line number Diff line change
@@ -620,7 +620,7 @@ statement ok
drop table test_column_defaults

# test invalid default value
statement error DataFusion error: Error during planning: Column reference is not allowed in the DEFAULT expression : Schema error: No field named a.
statement error DataFusion error: Schema error: No field named a\.
CREATE EXTERNAL TABLE test_column_defaults(
a int,
b int default a+1
8 changes: 4 additions & 4 deletions datafusion/sqllogictest/test_files/joins.slt
Original file line number Diff line number Diff line change
@@ -4064,12 +4064,12 @@ logical_plan
09)------------Unnest: lists[__unnest_placeholder(generate_series(Int64(1),outer_ref(t1.t1_int)))|depth=1] structs[]
10)--------------Projection: generate_series(Int64(1), CAST(outer_ref(t1.t1_int) AS Int64)) AS __unnest_placeholder(generate_series(Int64(1),outer_ref(t1.t1_int)))
11)----------------EmptyRelation
physical_plan_error This feature is not implemented: Physical plan does not support logical expression OuterReferenceColumn(UInt32, Column { relation: Some(Bare { table: "t1" }), name: "t1_int" })
physical_plan_error This feature is not implemented: Physical plan does not support logical expression OuterReferenceColumn(UInt32, Column { relation: Some(Bare { table: "t1" }), name: "t1_int", spans: Spans([]) })


# Test CROSS JOIN LATERAL syntax (execution)
# TODO: https://github.com/apache/datafusion/issues/10048
query error DataFusion error: This feature is not implemented: Physical plan does not support logical expression OuterReferenceColumn\(UInt32, Column \{ relation: Some\(Bare \{ table: "t1" \}\), name: "t1_int" \}\)
query error DataFusion error: This feature is not implemented: Physical plan does not support logical expression OuterReferenceColumn\(UInt32, Column \{ relation: Some\(Bare \{ table: "t1" \}\), name: "t1_int", spans: Spans\(\[\]\) \}\)
select t1_id, t1_name, i from join_t1 t1 cross join lateral (select * from unnest(generate_series(1, t1_int))) as series(i);


@@ -4089,12 +4089,12 @@ logical_plan
09)------------Unnest: lists[__unnest_placeholder(generate_series(Int64(1),outer_ref(t2.t1_int)))|depth=1] structs[]
10)--------------Projection: generate_series(Int64(1), CAST(outer_ref(t2.t1_int) AS Int64)) AS __unnest_placeholder(generate_series(Int64(1),outer_ref(t2.t1_int)))
11)----------------EmptyRelation
physical_plan_error This feature is not implemented: Physical plan does not support logical expression OuterReferenceColumn(UInt32, Column { relation: Some(Bare { table: "t2" }), name: "t1_int" })
physical_plan_error This feature is not implemented: Physical plan does not support logical expression OuterReferenceColumn(UInt32, Column { relation: Some(Bare { table: "t2" }), name: "t1_int", spans: Spans([]) })


# Test INNER JOIN LATERAL syntax (execution)
# TODO: https://github.com/apache/datafusion/issues/10048
query error DataFusion error: This feature is not implemented: Physical plan does not support logical expression OuterReferenceColumn\(UInt32, Column \{ relation: Some\(Bare \{ table: "t2" \}\), name: "t1_int" \}\)
query error DataFusion error: This feature is not implemented: Physical plan does not support logical expression OuterReferenceColumn\(UInt32, Column \{ relation: Some\(Bare \{ table: "t2" \}\), name: "t1_int", spans: Spans\(\[\]\) \}\)
select t1_id, t1_name, i from join_t1 t2 inner join lateral (select * from unnest(generate_series(1, t1_int))) as series(i) on(t1_id > i);

# Test RIGHT JOIN LATERAL syntax (unsupported)
2 changes: 1 addition & 1 deletion datafusion/sqllogictest/test_files/predicates.slt
Original file line number Diff line number Diff line change
@@ -584,7 +584,7 @@ DROP TABLE data_index_bloom_encoding_stats;
# String coercion
########

statement error DataFusion error: SQL error: ParserError\("Expected: a data type name, found: ,"\)
statement error DataFusion error: SQL error: ParserError\("Expected: a data type name, found: , at Line: 1, Column: 30"\)
CREATE TABLE t(vendor_id_utf8, vendor_id_dict)
AS VALUES
(arrow_cast('124', 'Utf8'), arrow_cast('124', 'Dictionary(Int16, Utf8)')),
4 changes: 2 additions & 2 deletions datafusion/sqllogictest/test_files/select.slt
Original file line number Diff line number Diff line change
@@ -339,10 +339,10 @@ NULL 1
statement error DataFusion error: SQL error: ParserError\("Expected: \(, found: EOF"\)
VALUES

statement error DataFusion error: SQL error: ParserError\("Expected: an expression, found: \)"\)
statement error DataFusion error: SQL error: ParserError\("Expected: an expression, found: \) at Line: 1, Column: 9"\)
VALUES ()

statement error DataFusion error: SQL error: ParserError\("Expected: an expression, found: \)"\)
statement error DataFusion error: SQL error: ParserError\("Expected: an expression, found: \) at Line: 1, Column: 13"\)
VALUES (1),()

statement error DataFusion error: Error during planning: Inconsistent data length across values list: got 2 values in row 1 but expected 1
4 changes: 2 additions & 2 deletions datafusion/sqllogictest/test_files/type_coercion.slt
Original file line number Diff line number Diff line change
@@ -103,11 +103,11 @@ CREATE TABLE orders(
);

# union_different_num_columns_error() / UNION
query error DataFusion error: Error during planning: UNION queries have different number of columns: left has 1 columns whereas right has 2 columns
query error DataFusion error: Error during planning: UNION queries have different number of columns
SELECT order_id FROM orders UNION SELECT customer_id, o_item_id FROM orders

# union_different_num_columns_error() / UNION ALL
query error DataFusion error: Error during planning: UNION queries have different number of columns: left has 1 columns whereas right has 2 columns
query error DataFusion error: Error during planning: UNION queries have different number of columns
SELECT order_id FROM orders UNION ALL SELECT customer_id, o_item_id FROM orders

# union_with_different_column_names()
10 changes: 5 additions & 5 deletions datafusion/sqllogictest/test_files/unnest.slt
Original file line number Diff line number Diff line change
@@ -295,7 +295,7 @@ query error DataFusion error: Error during planning: unnest\(\) requires exactly
select unnest();

## Unnest empty expression in from clause
query error DataFusion error: SQL error: ParserError\("Expected: an expression, found: \)"\)
query error DataFusion error: SQL error: ParserError\("Expected: an expression, found: \) at Line: 1, Column: 22"\)
select * from unnest();


@@ -863,11 +863,11 @@ select count(*) from (select unnest(range(0, 100000)) id) t inner join (select u
# Test implicit LATERAL support for UNNEST
# Issue: https://github.com/apache/datafusion/issues/13659
# TODO: https://github.com/apache/datafusion/issues/10048
query error DataFusion error: This feature is not implemented: Physical plan does not support logical expression OuterReferenceColumn\(List\(Field \{ name: "item", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: \{\} \}\), Column \{ relation: Some\(Bare \{ table: "u" \}\), name: "column1" \}\)
query error DataFusion error: This feature is not implemented: Physical plan does not support logical expression OuterReferenceColumn\(List\(Field \{ name: "item", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: \{\} \}\), Column \{ relation: Some\(Bare \{ table: "u" \}\), name: "column1", spans: Spans\(\[\]\) \}\)
select * from unnest_table u, unnest(u.column1);

# Test implicit LATERAL support for UNNEST (INNER JOIN)
query error DataFusion error: This feature is not implemented: Physical plan does not support logical expression OuterReferenceColumn\(List\(Field \{ name: "item", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: \{\} \}\), Column \{ relation: Some\(Bare \{ table: "u" \}\), name: "column1" \}\)
query error DataFusion error: This feature is not implemented: Physical plan does not support logical expression OuterReferenceColumn\(List\(Field \{ name: "item", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: \{\} \}\), Column \{ relation: Some\(Bare \{ table: "u" \}\), name: "column1", spans: Spans\(\[\]\) \}\)
select * from unnest_table u INNER JOIN unnest(u.column1) AS t(column1) ON u.column3 = t.column1;

# Test implicit LATERAL planning for UNNEST
@@ -883,7 +883,7 @@ logical_plan
06)------Unnest: lists[__unnest_placeholder(outer_ref(u.column1))|depth=1] structs[]
07)--------Projection: outer_ref(u.column1) AS __unnest_placeholder(outer_ref(u.column1))
08)----------EmptyRelation
physical_plan_error This feature is not implemented: Physical plan does not support logical expression OuterReferenceColumn(List(Field { name: "item", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }), Column { relation: Some(Bare { table: "u" }), name: "column1" })
physical_plan_error This feature is not implemented: Physical plan does not support logical expression OuterReferenceColumn(List(Field { name: "item", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }), Column { relation: Some(Bare { table: "u" }), name: "column1", spans: Spans([]) })

# Test implicit LATERAL planning for UNNEST (INNER JOIN)
query TT
@@ -899,7 +899,7 @@ logical_plan
07)--------Unnest: lists[__unnest_placeholder(outer_ref(u.column1))|depth=1] structs[]
08)----------Projection: outer_ref(u.column1) AS __unnest_placeholder(outer_ref(u.column1))
09)------------EmptyRelation
physical_plan_error This feature is not implemented: Physical plan does not support logical expression OuterReferenceColumn(List(Field { name: "item", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }), Column { relation: Some(Bare { table: "u" }), name: "column1" })
physical_plan_error This feature is not implemented: Physical plan does not support logical expression OuterReferenceColumn(List(Field { name: "item", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }), Column { relation: Some(Bare { table: "u" }), name: "column1", spans: Spans([]) })
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if Spans brings a purpose here? Should we exclude it from error message, otherwise it might be confusing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm yeah in general I think I agree that we don't need Column::Debug to print the spans. I'll change it.



## Unnest in subquery