SQL Optimizer - not choosing the better join order #137701

sheaffej · 2024-12-18T19:19:25Z

Describe the problem

The optimizer seems to be missing a join reordering opportunity that would avoid a full scan.

This issue was originally reported in Zendesk https://cockroachdb.zendesk.com/agent/tickets/24877
The Zendesk ticket includes statement bundles from the customer's cluster, and the un-redacted SQL.

This query below has a full scan on join_requests returning 1,087,393 rows, then a lookup join on projects which is an outer join. This lookup join is where all the time is being spent. Finally, the 820,399 rows from the lookup join are hash joined to teams. However, the scan on teams results in 1 row, and the hash join results in 0 rows.

sql>       SELECT
  ->         sj0.id AS id
  ->       FROM
  ->         join_requests AS sj0
  ->         LEFT JOIN projects AS sp1 ON sp1.id = sj0.project_id
  ->         INNER JOIN teams AS st2 ON st2.id = sj0.team_id
  ->       WHERE
  ->         (st2.deleted_at IS NULL)
  ->         AND ((sp1.id IS NULL) OR (sp1.deleted_at IS NULL))
  ->         AND st2.account_id = '<redacted>':::UUID
  -> ;
  id
------
(0 rows)

Time: 5.721s total (execution 5.641s / network 0.080s)

If I simply switch the order that projects and teams are listed in the SQL, the subquery returns in 6ms. This scans teams and results in 1 row, which is lookup joined to join_requests resulting in 0 rows.

sql>       SELECT
  ->         sj0.id AS id
  ->       FROM
  ->         join_requests AS sj0
  ->         INNER JOIN teams AS st2 ON st2.id = sj0.team_id
  ->         LEFT JOIN projects AS sp1 ON sp1.id = sj0.project_id
  ->       WHERE
  ->         (st2.deleted_at IS NULL)
  ->         AND ((sp1.id IS NULL) OR (sp1.deleted_at IS NULL))
  ->         AND st2.account_id = '<redacted>':::UUID;
  id
------
(0 rows)

Time: 90ms total (execution 6ms / network 83ms)

To Reproduce

Statement bundles of both versions of the original query are included in the Zendesk ticket https://cockroachdb.zendesk.com/agent/tickets/24877

The query described above is just one subquery within a larger SQL. The statement bundles in the Zendesk ticket include the full original SQL.

I checked the reorder_joins_limit which was set to the default. When I set reorder_joins_limit = 10, there was no improvement in the query or plan using the original join order.

Expected behavior
I expected the optimizer to consider the join reorder without a SQL modiication.

Environment:

CockroachDB version: v24.2.5
Cockroach Cloud AWS Advanced cluster
- Single-region us-east-1 with 12 nodes

Jira issue: CRDB-45710

gz#24877

The text was updated successfully, but these errors were encountered:

blathers-crl · 2024-12-18T19:19:27Z

Hi @sheaffej, please add branch-* labels to identify which branch(es) this C-bug affects.

_{🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.}

sheaffej added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-sql-optimizer SQL logical planning and optimizations. T-sql-queries SQL Queries Team labels Dec 18, 2024

github-project-automation bot added this to SQL Queries Dec 18, 2024

github-project-automation bot moved this to Triage in SQL Queries Dec 18, 2024

sheaffej added the branch-release-24.2 Used to mark GA and release blockers, technical advisories, and bugs for 24.2 label Dec 18, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SQL Optimizer - not choosing the better join order #137701

SQL Optimizer - not choosing the better join order #137701

sheaffej commented Dec 18, 2024 •

edited by RoachietheSupportRoach

Loading

blathers-crl bot commented Dec 18, 2024

SQL Optimizer - not choosing the better join order #137701

SQL Optimizer - not choosing the better join order #137701

Comments

sheaffej commented Dec 18, 2024 • edited by RoachietheSupportRoach Loading

blathers-crl bot commented Dec 18, 2024

sheaffej commented Dec 18, 2024 •

edited by RoachietheSupportRoach

Loading