Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bad index plan for lookup on NULL #8885

Open
nicktobey opened this issue Feb 23, 2025 · 2 comments
Open

Bad index plan for lookup on NULL #8885

nicktobey opened this issue Feb 23, 2025 · 2 comments
Labels
analyzer sql Issue with SQL

Comments

@nicktobey
Copy link
Contributor

create table test(pk int primary key, c0 int, key idx1(c0, pk));
insert into test values (1, 0), (2, 1), (3, 2), (4, 1), (5, 2), (6, 0), (7, 2), (8, 0), (9, 1), (10, NULL);
describe plan select * from test where c0 is NULL and pk > 1;

Expected plan:

+------------------------------------------+
| plan                                     |
+------------------------------------------+
| IndexedTableAccess(test)                 |
|  ├─ index: [test.c0,test.pk]             |
|  ├─ filters: [{[NULL, 1], [NULL, ∞)}] |
|  └─ columns: [pk c0]                     |
+------------------------------------------+

Actual plan:

+------------------------------+
| plan                         |
+------------------------------+
| Filter                       |
|  ├─ test.c0 IS NULL          |
|  └─ IndexedTableAccess(test) |
|      ├─ index: [test.pk]     |
|      ├─ filters: [{(1, ∞)}]  |
|      └─ columns: [pk c0]     |
+------------------------------+

We should be able to use idx1 to implement the query with a single indexed table access, no filter required. And if the query uses a non-NULL value for c0, we do. But whenc0 is being compared to NULL, the engine won't pick the secondary index.

@nicktobey nicktobey added analyzer sql Issue with SQL labels Feb 23, 2025
@nicktobey
Copy link
Contributor Author

Digging into the index coster, I'm seeing some odd results:

  • Both the primary and the secondary index have a nil histogram, which results in them both have an estimated row count of 0.
  • Both the primary and the secondary index have a computed prefix length of 0, which seems inaccurate for both.
  • The stats for the secondary index says that there are 9 distinct rows and 0 null rows, which unless I'm misunderstanding how those are calculated for multi-column indexes where only some of the columns are NULL, also seems odd.

Tagging @max-hoffman since he may have additional context about what the coster is doing here.

@nicktobey
Copy link
Contributor Author

Okay, the issue seems to be that conjCollector::add has a case for null safe equality, but no case for "IS NULL", causing it to treat "IS NULL" expressions like an inequality and preventing the coster from considering them in a prefix.

As a workaround, replacing IS NULL in the query with <=> NULL allows the coster to pick the correct index.

This still doesn't explain why the coster is using an estimated row count of 0 for both indexes, since it seems to be causing them to be rated higher than an index with a non-zero estimated row count.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
analyzer sql Issue with SQL
Projects
None yet
Development

No branches or pull requests

1 participant