Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add related source code locations to errors #13664

Open
wants to merge 37 commits into
base: main
Choose a base branch
from

Conversation

eliaperantoni
Copy link

After the PoC that I did in https://github.com/eliaperantoni/datafusion/tree/eper/inline-diagnostics, in this PR I'm attempting to build a more robust solution, with the goal of eventually merging it.

Still, so far I've only added diagnostics to the Cannot coerce arithmetic expression .. to valid types error to get feedback and approval on the general implementation. That way I can make tweaks quickly, before I go extend this to the rest of the repo.

Which issue does this PR close?

Closes #13662.

What changes are included in this PR?

I introduce the Diagnostic, DiagnosticEntry, and DiagnosticEntryKind types to enrich errors with better user-facing information, mainly by referring to the portions of the SQL code that caused it.

A DiagnosticEntry is used to refer to one particular portion of SQL code and describe its involvement in the error with a string message and a kind. A Diagnostic is simply a collection of DiagnosticEntry.

For example, in SELECT a_string + an_integer we'd have:

  • Entry kind=Error message=Incompatible types in binary expression Wrapping the whole a_string + an_integer expression.
  • Entry kind=Note message=Is of type Utf8 Wrapping just the left side.
  • Entry kind=Note message=Is of type Int64 Wrapping just the right side.

A new DataFusionError::Diagnostic variant is used to attach Diagnostic to errors. You can call with_diagnostic on any DataFusionError to do so. They can later be retrieved by calling DataFusionError::get_diagnostics.

It is possible to attach multiple Diagnostic while an error is being returned along the function call stack. The rationale here is that different functions might have different levels of detail, but they might all have useful diagnostics to add. e.g. the function that computes the output type of an expression has no idea about why it's being called, but calling functions have no idea about the intricacies of the error. They all know different things.

I also added a span: Span to Column. This is the first step of a (probably) long process of enriching the types used during logical planning with the Spans from the parsed AST nodes.

Added WithSpan<T>. The idea is that we want to get Span information down to functions called deep in the call stack, but without breaking existing code too much so that we can ensure a smooth and incremental transition to Diagnostic-ed errors. WithSpan<T> was my attempt to do so because any T implements Into<WithSpan<T>> by simply attaching Span::empty (but of course, if a Span is available it should be used instead). This means that any time a function wants to start spawning Diagnostic-ed errors, it can change some of its arguments by wrapping their types in WithSpan<T> and existing calls will keep on working.

WithSpan<T> should be used when the argument type is something that can loosely be traced back to a location in the source code, but is not really part of any AST node, or anything that is isomorphic to it. A good example is DataType which is taken by signature(DataType, Operator, DataType) -> DataType to get the output type of a binary expression. That is the function that spawns the error, and the caller doesn't have enough information to provide a good Diagnostic. But at the same time, it would be (in my opinion) cumbersome to add Span arguments to the function. So instead, the function can wrap the DataTypes for the two sides of the expression in WithSpan<T>. Some call points can be updated to pass a Span-enriched WithSpan<DataType>, and all the others will keep on working without changes required.

Are these changes tested?

Not yet, while this a draft to gather feedback.

Are there any user-facing changes?

Yes. The user can call DataFusionError::get_diagnostics to get an iterator over all the Diagnostic that have been attached to the error chain. Those contain source code locations that relate to the error, and each location comes with a message.

@github-actions github-actions bot added sql SQL Planner logical-expr Logical plan and expressions optimizer Optimizer rules core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) common Related to common crate labels Dec 5, 2024
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @eliaperantoni -- this is very cool. I left some comments, let me know what you think

@@ -131,6 +132,7 @@ pub enum DataFusionError {
/// Errors from either mapping LogicalPlans to/from Substrait plans
/// or serializing/deserializing protobytes to Substrait plans
Substrait(String),
Diagnostic(Diagnostic, Box<DataFusionError>),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting idea -- one way to think about this is that it adds additional structured information to DataFusionError::Context

If we went with the DataFusion::Diagnostic approach, do you think we would be able to deprecate / remove DataFusionError::Context in a future release?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Yes, I think DataFusion::Diagnostic can convey a superset of the information that DataFusion::Context can. Any wrapping such as:

let error = DataFusionError::Context(message, Box::new(error));

can be converted to:

let error = DataFusionError::Diagnostic(
	Diagnostic {
		entries: vec![
			DiagnosticEntry {
				span: Span::empty(),
				kind: DiagnosticEntryKind::Error,
				message,
			}
		]
	},
	Box::new(error)
);

And of course, we can provide a DataFusionError::with_simple_diagnostic function to avoid the boilerplate. At that point, DataFusion::Context could be removed.

This also enables progressively adding Span information to what was previously simply a string message.

pub struct Column {
/// relation/table reference.
pub relation: Option<TableReference>,
/// field/column name.
pub name: String,
#[derivative(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an interesting pattern -- it is different than the way sqlparser did it (which is to effectively wrap the span with a struct that is ignored when comparing eq, hash, etc

Did you consider a similar approach?

pub struct Column {
...
  // Wrapped span that does not contribute to PartialEq, Eq, Hash, etc
  pub span: DiagnosticOnly<Span>
}

I think this approach is less verbose and might be easier to understand for the casual reader

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also see WithSpan has a similar approach (but that is for wrapping things with Spans 🤔 )

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is indeed interesting. 😅 I was going to use derivative for this problem in sqlparser initially actually, but I wasn't sure if you would want to add a new dependency. If that is not a concern, I think using derivative for cases like this is pretty nice as it is also clear at the usage site that it is being ignored, unlike AttachedToken which kind of hides the behavior. If a wrapper type is preferred, I now think actually maybe the struct in sqlparser should have been called DiagnosticOnly as well 😂

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the AttachedToken approach that we took in sqlparser could be a functionally equivalent alternative. And I understand that not everybody might be familiar with derivative.

What I don't 100% love about it is that the PartialEq implementation on AttachedToken is a bit "impure". In that, it serves the one specific purpose of making it ignored when used in a struct field, but prevents you from actually comparing two instances of it because they would always are equal (the implementation returns always true), which is something you might want to do at some point. Especially because the name AttachedToken doesn't clearly convey that intent, and does more than just being a passthrough for PartialEq since its main purpose is tying together a Span and a Token.

When looking at a struct that uses it:

#[derive(Debug, Clone, PartialEq, PartialOrd, Eq, Ord, Hash)]
pub struct Select {
    /// Token for the `SELECT` keyword
    pub select_token: AttachedToken,
	// ...
}

it's not clear to me that AttachedToken is being ignored. I only realise that when I go look at the implementation of PartialEq.

Overall, the derivative approach to me more clearly lets me keep the sensible implementation of PartialEq for Span, while also conveying that "when used as a field in this particular struct, I want it to not influence the struct's comparison". But in other structs, I might want it to influence the comparison.

I personally find this approach more readable and flexible, but I'm open to converting to a wrapper type :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also see WithSpan has a similar approach (but that is for wrapping things with Spans 🤔 )

Ah yes good catch! I could definitely have used derivative for WithSpan too. And about WithSpan: do you like the general idea of it?

It's meant to be a "sidecar" to get a Span all the way to where it's needed to create a meaningful diagnostic, by attaching it to values that are sort-of-related, but wouldn't make a lot of sense to add as a struct field.

For example, the two DataTypes in a binary expression come from the two expressions in the SQL query, so they (in some sense) relate to a portion of the code. But putting span as a field of DataType sounds wrong.

Maybe a better name could be SpanRelated or something like that?.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my perspective some design goals should be:

  1. Make it easy for someone reading the DataFusion source code to understand what is going on
  2. If people are not interested in the span / diagnostic information they can ignore it

From my perspective, using new crates does reduce the amount of code in the DataFusion crate, but can often increase cognitive load (as now to understand the DataFusion code you need to understand the crates)

I think the AttachedToken approach that we took in sqlparser could be a functionally equivalent alternative. And I understand that not everybody might be familiar with derivative.

Yeah, that is why I think AttachedToken might be better in this case


use sqlparser::tokenizer::Span;

/// A wrapper type for entitites that somehow refer to a location in the source
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider a trait like was done in sqlparser that retrieves spans? For example

pub trait Spanned {
  /// return the span for this object, if known
  fn span(&self) -> Option<&Span>
}

And then we would implement that trait for various DataFusion / Arrow tyoes (like DataType, etc)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would likely require implementing specific wrapper types like DataTypeWithSpan or something to attach Span information to DataTypes

I also see the need to somehow plumb through the span information into things like coerce-types 🤔

Though maybe now would be a good time to make a proper structure for coercing types, (struct TypeCoercer or something) where we could attach the spans 🤔

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider a trait like was done in sqlparser that retrieves spans?

Yes, but to implement one such trait I'd have to put the data in DataType. And I think that's a bit pushing the limits of what DataType is meant to do. I quite like the idea of the WithSpan (can rename) functioning like a sidecar, to limit big changes in the code.

Maybe we could have a Spanned trait in datafusion as well, that is automatically implemented for all WithSpan<T>, and also by types that natively carry a Span?

Though maybe now would be a good time to make a proper structure for coercing types, (struct TypeCoercer or something) where we could attach the spans 🤔

Not too familiar about the history of that component. But I think that's just one example, and there's probably many more cases where changes like that would be needed to accommodate the presence of Spans.

@eliaperantoni
Copy link
Author

eliaperantoni commented Dec 9, 2024

@alamb here's a few examples of what the new diagnostics that I implemented so far can do:

image

image

What are you thoughts so far?

Note that the rendering here is not part of my PR, but it consumes the Diagnostic data that is now produced by datafusion.

We think that this could be quite nicely integrated with datafusion-dft and datafusion-clito provide a richer experience for end users.

Perhaps this kind of quality of life features can also be related to #13525, as it would make it easier for consumers of Datafusion to provide a nicer experience to their end users.

@github-actions github-actions bot added the proto Related to proto crate label Dec 11, 2024
@alamb
Copy link
Contributor

alamb commented Dec 12, 2024

Sorry for my radio silence here -- I am reviewing this now. Thank you for your patience

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry again for the delay

I spent some more time reviewing this PR, but did not get through it entirely

I think one of the largest design decisions is "where to store the spans"?

There is a tension between trying to keep full source provenance information and overburdening the existing code. I am musing here, but I am hoping to find some balance between

  1. Getting something useful in to DataFusion that we can incrementally update
  2. Ensuring we have a good long term roadmap.

The basic idea of attaching the spans to existing LogicalPlan and Exprs (which is what this PR seems to do) makes sense to me in theory, but in practice it feels like it may be hugely disruptive (many changes to existing structs) and the information propagation will likely be to hard to maintain and test -- all code that transforms/rewrites would have to propagate the spans

I also added a span: Span to Column. This is the first step of a (probably) long process of enriching the types used during logical planning with the Spans from the parsed AST nodes.

What do you think about initially focusing on locations from the errors out of the sql planner (SqlToRel) and postpone adding Spans to the plan nodes?

This would allow us to set up a good pattern for attaching Diagnostics to errors and reportin them. Once that infrastructure is solidified we can start figuring out a pattern to plumb the source information into other areas (like type coercions errors)?

pub struct Column {
/// relation/table reference.
pub relation: Option<TableReference>,
/// field/column name.
pub name: String,
#[derivative(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my perspective some design goals should be:

  1. Make it easy for someone reading the DataFusion source code to understand what is going on
  2. If people are not interested in the span / diagnostic information they can ignore it

From my perspective, using new crates does reduce the amount of code in the DataFusion crate, but can often increase cognitive load (as now to understand the DataFusion code you need to understand the crates)

I think the AttachedToken approach that we took in sqlparser could be a functionally equivalent alternative. And I understand that not everybody might be familiar with derivative.

Yeah, that is why I think AttachedToken might be better in this case

/// Attaches a [`Span`] to the [`Column`], i.e. its location in the source
/// SQL query.
pub fn with_span(mut self, span: Span) -> Self {
self.spans = vec![span];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this perhaps append a span to the Column rather than overwrite it?

@@ -113,6 +117,10 @@ pub struct DFSchema {
field_qualifiers: Vec<Option<TableReference>>,
/// Stores functional dependencies in the schema.
functional_dependencies: FunctionalDependencies,
/// The location in the source code where the fields are defined (e.g. in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems somewhat strange to me as the schema may not be defined in the query

Maybe we can avoid adding this as part of V1

@github-actions github-actions bot removed logical-expr Logical plan and expressions optimizer Optimizer rules core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) proto Related to proto crate labels Jan 17, 2025
@eliaperantoni
Copy link
Author

@alamb I'm very sorry for the delay. I think your points about not wanting to change the logical types are valid, and I confess it was a bit of a pain to make all the changes to make my previous iteration compile and route the spans through. I tried again, this time using your suggestion of:

What do you think about initially focusing on locations from the errors out of the sql planner (SqlToRel) and postpone adding Spans to the plan nodes?

I think this was a lot easier and pollutes the source code a lot less. So far I've implemented:

  • Table not found
  • Unqualified column not found
  • Qualified column not found
  • Mismatch in number of columns in set operation

I think the "column not found" is the most interesting, since when I get an unresolved datafusion::Column, I now have to walk down a tree of sqlparser::Expr and fine one that produces a matching datafusion::Column.

i.e. since I can't store data in the logical node, when I get a troublesome one I have to transform the AST tree again to figure out which AST node produces the problematic logical node.

That works okay. Performance might not be great but it only happens in case of errors, so I think it might be okay.

I'd like to hear your opinion.

But then I started implementing the "column missing from GROUP BY clause" error and that was a bit more difficult (or unreadable) because there's many more layers of functions in between that which has access the AST tree, and the one that checks a single logical expr in the SELECT part of the query and throws the error. I could of course pass down the AST tree, but need I need a DFSchema and a PlannerContext to normalise the identifiers and correctly go from AST to logical nodes. I find that a bit cumbersome and error prone. I feel like it would be so much easier if we could agree on a way to store Span in the logical Column.

@eliaperantoni eliaperantoni requested a review from alamb January 20, 2025 11:14
self.with_diagnostic(diagnostic)
}

pub fn get_diagnostics(&self) -> impl Iterator<Item = &Diagnostic> + '_ {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor -- I think a more consistent API would be to call this

Suggested change
pub fn get_diagnostics(&self) -> impl Iterator<Item = &Diagnostic> + '_ {
pub fn diagnostics(&self) -> impl Iterator<Item = &Diagnostic> + '_ {

@alamb
Copy link
Contributor

alamb commented Jan 20, 2025

I am starting to check this PR out

@alamb
Copy link
Contributor

alamb commented Jan 20, 2025

I am looking at this PR in detail now

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @eliaperantoni and I am really sorry for the delay in reviewing. I will try and keep more focus on this PR.

I think the changes to DataFusion error looks good.

I think I may be missing something with the planner changes, but it seems to me like it would be possible to plumb the "current span" information down using the PlannerContext (I left some more detailed suggestions)

In terms of next steps, what I imagine will happen is once we get the patterns set in this PR, we (aka I will do / help a lot) will write up a high level overview / next steps (e.g. plumb span information more, etc)

let mut expr = self.sql_expr_to_logical_expr(sql, schema, planner_context)?;
// The location of the original SQL expression in the source code
let mut expr =
self.sql_expr_to_logical_expr(sql.clone(), schema, planner_context)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having to copy the entire AST just to get the span information on error is non ideal (we are trying to keep planning reasonably faster)

I understand that Expr doesn't have the Span (and adding it is quite potentially intrusive). However, I wonder if you have considered using PlannerContext?

Specifically, since the PlannerContext is already threaded through most/all planning methods, if you could add the "current span" on the PlannerContext, you would have the necessary span information when you generated an error.

#[derive(Debug, Clone)]
pub struct PlannerContext {
...
  /// the current span of the expression or statement being planned
  /// Note not all statements have span information yet
  /// see <https://github.com/apache/datafusion-sqlparser-rs/issues/1548>
  current_span: Option<Span>,
...
}

Then rather than calling sql_expr_to_logical_expr twice, you could have the error generated in sql_expr_to_logical_expr include the span information.

The key would to manage setting/restoring the spans during the planing process. Maybe it could be something like

...
// set the `current_span` field in the planner context
// if sql has a span, otherwise use the existing span
let planner_context = planner_context.with_span(&sql);
self.sql_expr_to_logical_expr(sql.clone(), schema, planner_context)?;
...

Copy link
Author

@eliaperantoni eliaperantoni Jan 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a very interesting idea. I see a few issues with it though:

What if sql is a compound expression like a + b but only b is unresolved? Ideally, I would want sql_expr_to_logical_expr to be able to return diagnostics that highlight b only. But how does it do that if it only know the Span of the entire sql expression?

What if a diagnostic needs to highlight various parts of a query. e.g. if a non-aggregated column is missing from the GROUP BY clause, a good diagnostic would highlight both the column and the GROUP BY clause. But planner context has only one Span, and which is the "current span" in this case?

I thought that I could solve the above issues by putting a HashMap<String, Vec<Span>> in PlannerContext. The key would be to label different parts of the query, e.g. group_by_clause, set_operator, left_side_of_expr, etc.. So that I could do something like:

error.with_diagnostic(Diagnostic::new().with_error(
    "column missing from GROUP BY",
    planner_context.span("expr").last(), // Highlight the column
).with_note(
    "GROUP BY is here",
    planner_context.span("group_by").last(), // Highlight the GROUP BY
))

The Vec<_> is just to have stacks, so that in something like SELECT ... UNION (SELECT ... UNION SELECT ...) the set_operator key could point to the outer UNION, then the inner one, then restore the outer one when we're done planning the query in parentheses.

But then I encountered another issue:

Some functions where PlannerContext is threaded through (e.g. sql_expr_to_logical_expr and sql_expr_to_logical_expr_internal) take &mut PlannerContext. What if I need to add new spans?

For example, in the body of sql_expr_to_logical_expr I would like to do:

image

But, what if sql_expr_to_logical_expr_internal needs to mutate the given &mut PlannerContext? I can't simply clone PlannerContext and add a new span because I then break the ability to mutate downstream.

Then it seems like a pattern like:

pattern_context.with_span("expr", sql_expr.span(), move |planner_context| {
    let expr = self.sql_expr_to_logical_expr_internal(
        *sql_expr,
        schema,
        planner_context,
    )?;
});

might be necessary, to add a Span to the given PlannerContext but also remove it. Idk if that's less invasive than adding Span to Expr though.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it's validate_schema_satisfies_exprs that throws the error, not sql_expr_to_logical_expr. And in that place I have access to the SQL expression, but the problem still is that I don't know which subexpression was the problem

@github-actions github-actions bot added documentation Improvements or additions to documentation sqllogictest SQL Logic Tests (.slt) labels Jan 24, 2025
@eliaperantoni
Copy link
Author

Thanks! I tried fixing the failing CI tests

@eliaperantoni
Copy link
Author

Ah sorry @alamb those tests didn't fail before. Is there any way I can trigger CI myself?

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @eliaperantoni -- I reallly like how this PR is looking now.

in my opinion this PR could almost be merged as is because:

  1. The features are disabled by default
  2. The tests are wonderful

The things blocking in my mind are:

  1. A few comments on the new pub struct
  2. Removing the new clones in SqlToRel (I left comments)

I also think before/part of merging this PR would be to document what is needed before we would want to turn on collect_spans by default

I will also run the sql planning benchmarks just to make make sure there is no planning performance regression

cc @jonahgao and @comphead for your thoughts

datafusion/sql/tests/cases/diagnostic.rs Show resolved Hide resolved
@@ -33,23 +33,23 @@ statement error DataFusion error: SQL error: ParserError\("Missing LOCATION clau
CREATE EXTERNAL TABLE t STORED AS CSV

# Option value is missing
statement error DataFusion error: SQL error: ParserError\("Expected: string or numeric value, found: \)"\)
statement error DataFusion error: SQL error: ParserError\("Expected: string or numeric value, found: \) at Line: 1, Column: 66"\)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats really cool, does that work for multiline statements?

Copy link
Contributor

@comphead comphead Jan 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking should be that parser responsibility rather than the runtime engine's 🤔

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it does work for multiline statements, and it's the parser that outputs that error.

WITH cte1 AS (
    SELECT 1 AS id
), cte2 AS (
    SELECT 1 AS id
),
SELECT id FROM cte1, cte2, users
-- SQL error: ParserError("Expected: AS, found: id at Line: 6, Column: 8")

datafusion/common/src/column.rs Outdated Show resolved Hide resolved
datafusion/common/src/column.rs Show resolved Hide resolved
@@ -41,3 +41,4 @@ arrow = { workspace = true }
datafusion-common = { workspace = true }
itertools = { workspace = true }
paste = "^1.0"
sqlparser = { workspace = true }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this new dependecy is also unfortunate, but I see it is required as the BinaryTypeCoercer needs to know about spans

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not necessary anymore since, after your feedback, I made a semi-copy of Span in datafusion-common 🥳

datafusion/expr/src/expr.rs Show resolved Hide resolved
op: &'a Operator,
rhs: &'a DataType,

lhs_spans: Spans,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about simply having a single spans: Span that is computed on construction? Is there any reason to keep around the different set of spans?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need the spans of both sides individually though, to produce diagnostics like the following:

image

Maybe it's not super clear from the screenshot but each note: has type X is highlighting just one side of the expression.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are also merged for the overarching error though, that's true

datafusion/expr/src/expr_rewriter/mod.rs Outdated Show resolved Hide resolved
datafusion/sql/src/expr/mod.rs Outdated Show resolved Hide resolved
datafusion/sql/src/set_expr.rs Outdated Show resolved Hide resolved
@@ -899,7 +899,7 @@ logical_plan
07)--------Unnest: lists[__unnest_placeholder(outer_ref(u.column1))|depth=1] structs[]
08)----------Projection: outer_ref(u.column1) AS __unnest_placeholder(outer_ref(u.column1))
09)------------EmptyRelation
physical_plan_error This feature is not implemented: Physical plan does not support logical expression OuterReferenceColumn(List(Field { name: "item", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }), Column { relation: Some(Bare { table: "u" }), name: "column1" })
physical_plan_error This feature is not implemented: Physical plan does not support logical expression OuterReferenceColumn(List(Field { name: "item", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }), Column { relation: Some(Bare { table: "u" }), name: "column1", spans: Spans([]) })
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if Spans brings a purpose here? Should we exclude it from error message, otherwise it might be confusing

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm yeah in general I think I agree that we don't need Column::Debug to print the spans. I'll change it.

@matthewmturner
Copy link
Contributor

Thanks for the ping @xudong963. Indeed, this is really cool - I haven't had the chance to read the whole thread yet but from what I see I do think this could be a good addition to dft. I've created an issue for it

@eliaperantoni
Copy link
Author

@alamb fixed ci issues

@eliaperantoni
Copy link
Author

I also think before/part of merging this PR would be to document what is needed before we would want to turn on collect_spans by default

In my opinion:

  • Putting Span on more logical types in order to..
  • Attach Diagnostic to more errors, until most of the typical errors that we can expect the end user to make are covered

I know it's a little vague but, as a starting point, I would consider:

  • Reference to a function that doesn't exist
  • Wrong number of parameters in a function call
  • Wrong types of parameters in a function call
  • Syntax errors
  • Subquery that returns more than one column
  • Type coercion in unary expressions
  • Using = NULL instead of IS NULL, maybe that could be the first use case of Diagnostic::new_warning
  • Alias two columns/tables with the same name

@eliaperantoni eliaperantoni requested a review from alamb January 29, 2025 08:36
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @eliaperantoni and @comphead (and @mkarbo)!

I think this PR looks really nice and is a great addition to DataFusion

I suggest we wait for the DataFusion 45 release candidate (likely later this week early next week) and merge this PR in once that has been done.

Once we merge someone (I will do it if no one else beats me to it) should then file a ticket like [EPIC]: Improved error source tracking / messages listing the various subtasks you highlighted on #13664 (comment)

FYI @sadboy and @findepi -- this feature might be of interest to SDF as well.

I am very excited to see this happen. Thank you for pushing it through

Some::<TableReference>(subqry_alias.into()),
name,
)),
Expr::Column(col) => Ok(col.with_relation(subqry_alias.into())),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😍

@@ -90,12 +90,35 @@ pub(crate) fn rebase_expr(
.data()
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is quite nice

@alamb
Copy link
Contributor

alamb commented Jan 29, 2025

I merged up from main for the PR to get a clean CI test run

@alamb
Copy link
Contributor

alamb commented Feb 1, 2025

Update here is we are very close to cutting the 45 release branch. See

Once we do that I'll plan to merge this PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
common Related to common crate core Core DataFusion crate documentation Improvements or additions to documentation logical-expr Logical plan and expressions optimizer Optimizer rules physical-expr Physical Expressions sql SQL Planner sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add related source code locations to errors
6 participants