-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(data-warehouse): Override external data table fields #20997
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had a look, and seems good. I'm not sure about one part through (replied inline).
Also, we now have hidden
aliases and hidden
fields, where the meaning of "hidden" is slightly different. I wonder if we should disambiguate? 🤔 🤷
"object": StringDatabaseField(name="object"), | ||
"address": StringJSONDatabaseField(name="address"), | ||
"balance": IntegerDatabaseField(name="balance"), | ||
"__created": IntegerDatabaseField(name="created", hidden=True), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since I can imagine users having some of their own tables with __
-prefixed fields, and someone someday having some conflict, if we expand this further or make it automatic. Should we be even harsher with something like:
"__x_created": IntegerDatabaseField(name="created", hidden=True),
No strong feelings, but thought it worth mentioning.
posthog/hogql/visitor.py
Outdated
def visit_expression_field_type(self, node: ast.ExpressionFieldType): | ||
self.visit(node.expr) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this is correct. In the resolver, when we resolve expression fields, we clone this node.expr
and swap out the node with ExpressionFieldType with the cloned expr.
The change here probably has some side effects, and/or might mutate the "reference expression" instead of cloning and applying. Not sure, but it might be worth double checking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, yeah, some tests were failing due to TraversingVisitor
not having a visit_expression_field_type
func - we do pass
on some other _type
funcs, which may be better here than visiting the expression
fe985ab
to
1f56661
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! 👍
The only thing you should keep in mind is that the expression fields aren't perfect (finished). The code that swaps the field out with the expression does a very simple clone_expr
operation. This means that under certain conditions, the expression field will "resolve" to garbage.
For example this nonsense query:
select *
from stripe_invoices a
left join stripe_invoices b
on a.amount_paid = b.amount_paid
where a.period_start_at < b.period_start_at
will be swapped with:
select *
from stripe_invoices a
left join stripe_invoices b
on a.amount_paid = b.amount_paid
where __period_start < __period_start
I think to solve it, we should update the cloning code here, and push a fake ast.SelectQueryType
onto self.scopes
(with just that one table) before visiting and pop it after. That should give everything inside the expression field the right type, and they should get the a.__period_start
added automatically.
Problem
DateTime
typeChanges
*
*
queries (effectively reverting feat(query-engine): Remove expression fields from asterisk expansion #19239)stripe_customer
table as an examplecreated
field from stripe (a unix timestamp) intocreated_at
(aDateTime
clickhouse type)How did you test this code?