Replace func.agdc.common_timestamp with hardcoded conversion #595
base: develop
Worth a try.
What testing has happened on this? What's a before-and-after example query, with runtimes?
TLDR: Test query without change: Altered query: Results:
This seems like a good idea, once we can get SQLAlchemy to behave with the new query.
I was confused about why it made such a difference, so I dug deeper by comparing the Old Query Plan and the New Query Plan. The first obvious difference is that the updated query is getting partitioned and run in parallel.
It turns out that PostgreSQL is conservative with user-defined functions: they must be explicitly marked as safe before they can be run in parallel.
    alter function agdc.common_timestamp parallel safe;
After updating the function, the query now runs in parallel, and it's down to 4s vs 6s.
I haven't yet worked out why it's doing a Parallel Index Scan instead of a Parallel Bitmap Heap Scan. I suspect it might be the timestamp getting converted to text and parsed back to a timestamp when it's not necessary, possibly because of the Argument data types on the vs
Refs:
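The parallel-safety marking discussed in this comment can be checked in the PostgreSQL catalog. The following is only a sketch: it assumes the function lives in the `agdc` schema as written in the thread, the names `PARALLEL_STATUS_SQL` and `describe_parallel` are made up for illustration, and the query string would still need to be run through whatever database client is in use. The `pg_proc.proparallel` column holds a one-letter code, and functions default to unsafe unless declared otherwise.

```python
# Hypothetical helper names; only pg_proc/pg_namespace and the
# proparallel codes come from PostgreSQL itself.

# Catalog query to look up how a function is marked for parallel execution.
# Assumes the schema is "agdc" and the function is "common_timestamp".
PARALLEL_STATUS_SQL = """
SELECT p.proname, p.proparallel
FROM pg_proc AS p
JOIN pg_namespace AS n ON n.oid = p.pronamespace
WHERE n.nspname = 'agdc' AND p.proname = 'common_timestamp';
"""

# One-letter codes stored in pg_proc.proparallel.
PARALLEL_CODES = {"s": "safe", "r": "restricted", "u": "unsafe"}


def describe_parallel(code: str) -> str:
    """Translate a proparallel code into a readable label."""
    return PARALLEL_CODES.get(code, "unknown")
```

If the query returns `u` for the function, the planner will not choose a parallel plan for any query that calls it, which matches the behaviour described above.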
Looks like this might be a SQLAlchemy 1.x issue - at any rate outputting So easy to fix in (or with) datacube-core 1.9 but more problematic in 1.8.
for more information, see https://pre-commit.ci
37d7488 to bc0287e
The current use of the PostgreSQL function agdc.common_timestamp adds significant time (4x) to query execution, causing PostgreSQL to queue queries and become unresponsive while waiting for the database to return.
The function currently returns
select ($1)::timestamp at time zone 'utc';
Rather than calling a function to do this, I've hardcoded this timestamp cast directly in the query expression.
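To make the semantics of that cast concrete, here is a rough Python analogue of `($1)::timestamp at time zone 'utc'`. It is an illustration only, not code from this PR: the helper name `common_timestamp_py` is hypothetical, and it assumes ISO 8601 input text. The cast to `timestamp` drops any offset in the input, and `at time zone 'utc'` then labels the resulting wall-clock value as UTC.

```python
from datetime import datetime, timezone


def common_timestamp_py(text: str) -> datetime:
    """Approximate ($1)::timestamp at time zone 'utc' for ISO 8601 text."""
    parsed = datetime.fromisoformat(text)
    # ::timestamp yields a timestamp without time zone, so any offset
    # present in the input is discarded rather than applied.
    naive = parsed.replace(tzinfo=None)
    # at time zone 'utc' interprets the naive value as UTC.
    return naive.replace(tzinfo=timezone.utc)
```

Because the expression is this simple, writing it inline in the query should not change results; it only removes the function call that the planner treats conservatively.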
📚 Documentation preview 📚: https://datacube-explorer--595.org.readthedocs.build/en/595/