Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Join recursive cte to base #880

Merged
merged 18 commits into from
Jul 16, 2020
Merged

Join recursive cte to base #880

merged 18 commits into from
Jul 16, 2020

Conversation

bojanserafimov
Copy link
Collaborator

@bojanserafimov bojanserafimov commented Jul 14, 2020

Currently we generate queries that have the following from clause FROM anon_1, anon_2, where anon_1 is the query before a recurse, and anon_2 is the recursive CTE. Since the two ctes are not joined, if outputs are selected from each, the result will be a cartesian product, which is incorrect.

There's two possible solutions:
a) Never select outputs from anon_1, since the same values can be accessed from anon_2
b) Join the ctes in the FROM clause

I went with option (b) since it makes it possible to implement recursion inside optional scope.

Note: When no outputs are selected from anon_1, there's no need to join, since anon_1 can be excluded from the FROM clause completely. This optimization can be implemented later, and I'm not sure if it makes a difference in runtime.

Changes in this PR:

  • Join the recursive CTE to the preceding CTE or alias
  • Make sure this join does not violate @optional semantics if the @recurse is in optional scope
  • Test
  • Increase mypy strictness

This PR fixes #866

@codecov
Copy link

codecov bot commented Jul 15, 2020

Codecov Report

Merging #880 into main will increase coverage by 0.07%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #880      +/-   ##
==========================================
+ Coverage   94.75%   94.83%   +0.07%     
==========================================
  Files         108      108              
  Lines        8432     8580     +148     
==========================================
+ Hits         7990     8137     +147     
- Misses        442      443       +1     
Impacted Files Coverage Δ
graphql_compiler/compiler/emit_sql.py 98.48% <100.00%> (+0.01%) ⬆️
...ql_compiler/schema_transformation/rename_schema.py 99.43% <0.00%> (-0.57%) ⬇️
graphql_compiler/cost_estimation/analysis.py 99.05% <0.00%> (+0.01%) ⬆️
...mpiler/compiler/ir_lowering_gremlin/ir_lowering.py 96.64% <0.00%> (+0.02%) ⬆️
graphql_compiler/schema_transformation/utils.py 95.41% <0.00%> (+0.03%) ⬆️
graphql_compiler/global_utils.py 97.56% <0.00%> (+0.12%) ⬆️
graphql_compiler/cost_estimation/statistics.py 93.05% <0.00%> (+0.19%) ⬆️
graphql_compiler/schema/schema_info.py 89.58% <0.00%> (+0.22%) ⬆️
...l_compiler/query_pagination/pagination_planning.py 92.06% <0.00%> (+0.26%) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b11dbdc...fbe8878. Read the comment docs.

@@ -743,7 +743,7 @@ def __init__(self, sql_schema_info: SQLAlchemySchemaInfo, ir: IrAndMetadata):
self._relocate(ir.query_metadata_table.root_location)

# Mapping aliases to the column used to join into them.
self._came_from: Dict[Alias, Column] = {}
self._came_from: Dict[Union[Alias, ColumnRouter], Column] = {}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mention of aliases in the docstring is outdated. So is the entire _aliases field, since it doesn't only contain aliases anymore. I'm working on fixing this in the next PR, and I created an issue #882

@bojanserafimov bojanserafimov marked this pull request as ready for review July 16, 2020 15:36
Bojan Serafimov and others added 3 commits July 16, 2020 11:43
@obi1kenobi obi1kenobi merged commit 8d69f4e into main Jul 16, 2020
@obi1kenobi obi1kenobi deleted the cte_join branch July 16, 2020 18:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect sql output for recurse
2 participants