Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize getFilesAndIndicesForDimensions #361

Merged
merged 26 commits into from
Sep 3, 2024

Conversation

mgrunbauer
Copy link
Collaborator

@mgrunbauer mgrunbauer commented Apr 8, 2024

This MR makes the following changes:

  • When requesting data with multiple dimensions, instead of executing queries per dimension, execute a single query for all dimensions. This is happening in getFilesAndIndicesForDimensions and getTableNameForPathFilterAndDimensions.
  • Check if pathfiltertablelookup table exists only once during adaguc-server lifetime, instead of once per table access.
  • SQLite will now return dim values in the same format than PSQL (e.g. always 10.0 instead of 10)
  • You can now run all tests against the postgresql backend.
  • The maxquerylimit limit now limits the total results from getFilesAndIndicesForDimensions() instead of limiting on the first dimension queried. This fixes: Simplify query used by CDBAdapterPostgreSQL::getFilesAndIndicesForDimensions #341

All changes combined result in less queries. For example GetMap required 12 queries before, only 3 queries with this MR.

This slightly improves performance, running the complete benchmark locally takes 140 seconds on master, 128 seconds on this branch.

I've replaced some of the getTableNameForPathFilterAndDimension() calls with getTableNameForPathFilterAndDimensions() (plural dimensions). The former is now a wrapper function that executes the latter, so the code path is identical. Replacing the remaining getTableNameForPathFilterAndDimension calls requires some refactoring.


Things out of scope for this PR:

  • SQLite will not implement the changes made to the PostgreSQL class. SQLite might become deprecated.
  • Fix test dependencies when running tests using PostgreSQL. I worked around this with a "temporary band aid" (drop all tables).
  • "query hidden" with SHOW_QUERYINFO is not required for now.
  • Postgres is still using real to store floats, which will result in precision loss. We should probably use double precision. I've decided to skip it in this MR for now.

@mgrunbauer mgrunbauer force-pushed the optimize-getFilesAndIndicesForDimensions branch from b47cc1c to 942a08c Compare April 14, 2024 13:51
@mgrunbauer mgrunbauer force-pushed the optimize-getFilesAndIndicesForDimensions branch from f9b79e4 to 5c39b66 Compare June 8, 2024 11:33
@mgrunbauer mgrunbauer marked this pull request as ready for review July 31, 2024 20:55
Copy link
Member

@maartenplieger maartenplieger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!!!!! I am deploying a dev version for geoweb to spot any issues.

Dockerfile Outdated Show resolved Hide resolved
Dockerfile Show resolved Hide resolved
adagucserverEC/CDBAdapterPostgreSQL.cpp Outdated Show resolved Hide resolved
adagucserverEC/CDBAdapterPostgreSQL.cpp Outdated Show resolved Hide resolved
adagucserverEC/CDBAdapterPostgreSQL.cpp Outdated Show resolved Hide resolved
adagucserverEC/CDBAdapterSQLLite.cpp Show resolved Hide resolved
adagucserverEC/CRequest.cpp Show resolved Hide resolved
adagucserverEC/CRequest.cpp Show resolved Hide resolved
doc/developing/testing.md Show resolved Hide resolved
tests/starttests_psql.sh Outdated Show resolved Hide resolved
@maartenplieger maartenplieger merged commit e9a14a7 into master Sep 3, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Simplify query used by CDBAdapterPostgreSQL::getFilesAndIndicesForDimensions
2 participants