Add MotherDuck-enabled pg_duckdb results. #272
…otherduck enabled for simplicity sake.
@@ -0,0 +1,57 @@
{
I ran benchmark.sh on my local EC2 c6a.4xlarge machine and got these numbers:
[0.310832,0.135982,0.137973],
[0.312424,0.145676,0.14565],
[0.367049,0.16804,0.168023],
[1.74935,0.171891,0.171829],
[2.29114,0.523813,0.520013],
[2.46478,0.777383,0.77676],
[2.00442,1.99079,2.01413],
[0.303728,0.147151,0.147396],
[2.38651,0.605923,0.606097],
[2.90069,0.794022,0.801774],
[1.75951,0.256617,0.260653],
[2.21079,0.285738,0.287225],
[2.74292,0.699291,0.696846],
[5.23709,1.02618,1.03795],
[2.6506,0.745238,0.746417],
[1.82356,0.585969,0.590522],
[5.35414,1.38093,1.42019],
[5.23847,1.32079,1.32186],
[9.49408,6.62594,6.65811],
[1.27376,0.17786,0.174985],
[20.2344,1.9744,1.94168],
[23.1244,1.78669,1.7944],
[44.4286,3.60808,3.55152],
[112.15,9.57462,9.56068],
[5.88666,1.09089,1.09729],
[2.51667,0.43618,0.439071],
[6.04578,1.09779,1.08893],
[19.925,1.7998,1.83581],
[17.4306,11.204,11.2168],
[0.548479,0.509858,0.518488],
[5.32779,0.772682,0.77066],
[12.1051,0.878097,0.871254],
[11.2436,6.14124,6.19371],
[21.214,5.52945,5.42852],
[21.1378,5.38582,5.53328],
[1.19671,0.657055,0.665458],
[0.407409,0.323923,0.322465],
[0.459891,0.279551,0.275829],
[0.458786,0.219163,0.218257],
[0.752248,0.472317,0.482665],
[0.357792,0.161885,0.161701],
[0.33603,0.146883,0.146818],
[0.566736,0.384091,0.384248],
Some of the hot runs (2nd and 3rd measurements) on my machine took more than 1 sec (the maximum was over 11 sec for Q29), whereas in your measurements (almost) all queries finish well under one sec. The affected queries are all scan/IO-heavy, i.e. they don't have selective filters (WHERE) that could be handled using indexes.
I am totally fine with merging this PR, I would just like to understand what caused the difference. Is the workload somehow split between the local machine and MotherDuck (i.e. some kind of hybrid execution)? In that case, I guess a different local machine (e.g. more cores, faster IO, etc.) could cause this; line 4 doesn't specify which machine you used for your measurements. And this is really just speculation, but does the cloud component (MotherDuck) perhaps provide more or fewer resources (threads, IO, etc.) based on the time of day? (After all, the free tier is used, for which such behavior would make sense.)
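For anyone who wants to repeat the "hot runs over 1 sec" check on their own numbers, here is a minimal Python sketch. It assumes each row has the `[cold, hot1, hot2]` layout used above and that queries are numbered from 1 in row order (which is how Q29 maps to the 11-second row). The function name `slow_hot_runs` and the one-second default threshold are illustrative choices, and only the first seven rows are included to keep the example short.

```python
import json

# First seven rows of the results above, in [cold, hot1, hot2] order.
# Paste the full list here to check every query.
results_text = """
[
[0.310832,0.135982,0.137973],
[0.312424,0.145676,0.14565],
[0.367049,0.16804,0.168023],
[1.74935,0.171891,0.171829],
[2.29114,0.523813,0.520013],
[2.46478,0.777383,0.77676],
[2.00442,1.99079,2.01413]
]
"""


def slow_hot_runs(rows, threshold=1.0):
    """Return (query_number, best_hot_time) for queries whose best hot run exceeds threshold."""
    slow = []
    for number, (cold, *hot) in enumerate(rows, start=1):
        best_hot = min(hot)  # the faster of the 2nd and 3rd measurement
        if best_hot > threshold:
            slow.append((number, best_hot))
    return slow


rows = json.loads(results_text)
slow = slow_hot_runs(rows)
print(slow)  # [(7, 1.99079)]
```

Using the best hot run rather than the average keeps a single noisy measurement from flagging a query.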
Let me investigate. This looks like a bug (that I've seen a couple of times and it went away) where performance got worse and worse over time, especially for the cold runs. I want to get a handle on it before submitting, since I don't think it is a good idea to submit numbers so far off from what you can reproduce on your own.
@rschu1ze Thanks for taking a look. Since it's Thanksgiving in the US, @jtigani asked me to take a look so you won't be slowed down.
Some of the difference can be explained by the location of your EC2 machine: our backends are in us-east-1. Can you share where you run your EC2 instance?
More importantly, I tried to find your run in our backends and, other than you creating the pgclick through our UI, I couldn't find anything. This may sound stupid, but I want to double-check that you've set the environment variable MOTHERDUCK_TOKEN before running the script. It really feels like you have accidentally stored the data in Postgres and not in MotherDuck; that would explain a significant slowdown.
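A quick way to rule out a missing token before a run is to check the environment explicitly. The sketch below is only an illustration: the helper name `motherduck_token_is_set` is made up for this example and is not part of benchmark.sh or pg_duckdb. Whether an unset token really causes data to land in Postgres is the hypothesis raised above, not something this check confirms.

```python
import os


def motherduck_token_is_set(env=None):
    """Return True if MOTHERDUCK_TOKEN is present and non-empty in the given environment."""
    env = os.environ if env is None else env
    # Treat a whitespace-only value the same as an unset variable.
    return bool(env.get("MOTHERDUCK_TOKEN", "").strip())


if not motherduck_token_is_set():
    print("MOTHERDUCK_TOKEN is not set; export it before running benchmark.sh")
```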
To double-check, I set up a c6a.4xlarge in us-east-1 myself and ran the benchmark again:
[0.137005,0.018994,0.012908],
[0.140741,0.018078,0.018248],
[0.134784,0.021106,0.017415],
[0.137873,0.024561,0.021223],
[0.274166,0.150139,0.149283],
[0.288274,0.166713,0.16712],
[0.131532,0.012273,0.013218],
[0.15448,0.016664,0.01604],
[0.311416,0.176908,0.17688],
[0.359286,0.237682,0.234269],
[0.183045,0.059172,0.058513],
[0.189613,0.067289,0.067073],
[0.302605,0.154747,0.155818],
[0.430244,0.301195,0.301791],
[0.282851,0.164777,0.166042],
[0.307493,0.181078,0.181029],
[0.465965,0.35104,0.350157],
[0.439346,0.310988,0.312933],
[0.712783,0.586545,0.588642],
[0.136413,0.014163,0.014281],
[0.404964,0.278645,0.281366],
[0.309573,0.197004,0.1906],
[0.415774,0.293224,0.308011],
[1.25393,1.17285,1.11026],
[0.195441,0.062959,0.060395],
[0.238035,0.060369,0.065979],
[0.192903,0.077515,0.077995],
[0.488857,0.360594,0.339036],
[1.50738,1.25838,1.39167],
[0.839093,0.729768,0.728274],
[0.262534,0.184269,0.185243],
[0.316918,0.19793,0.20941],
[0.895483,0.765861,0.728146],
[0.889801,0.7775,0.766686],
[0.904547,0.787961,0.768901],
[0.360276,0.24354,0.246335],
[0.193414,0.042963,0.043233],
[0.132383,0.021634,0.021998],
[0.15432,0.027969,0.028874],
[0.215762,0.087406,0.08749],
[0.136966,0.01455,0.014039],
[0.133496,0.013182,0.013417],
[0.136268,0.022607,0.016307],
Results are much closer to the submitted results, so the difference seems related to the region.
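To put a number on the regional difference, the two runs can be compared query by query. The following Python sketch does this for the first three queries only, using rows copied from the two result lists above. The helper `hot_run_ratios` is a hypothetical name, and comparing the best hot run of each query is an assumption about what is most meaningful here.

```python
# [cold, hot1, hot2] rows for the first three queries of each run,
# copied from the two result lists above.
first_run = [
    [0.310832, 0.135982, 0.137973],
    [0.312424, 0.145676, 0.14565],
    [0.367049, 0.16804, 0.168023],
]
us_east_1_run = [
    [0.137005, 0.018994, 0.012908],
    [0.140741, 0.018078, 0.018248],
    [0.134784, 0.021106, 0.017415],
]


def hot_run_ratios(slow_rows, fast_rows):
    """Per-query ratio of the best hot-run time in slow_rows to that in fast_rows."""
    return [min(s[1:]) / min(f[1:]) for s, f in zip(slow_rows, fast_rows)]


ratios = hot_run_ratios(first_run, us_east_1_run)
for number, ratio in enumerate(ratios, start=1):
    print(f"Q{number}: first run is {ratio:.1f}x slower on hot runs")
```

Extending both lists to all 43 rows gives the full picture, including the scan-heavy queries where the gap is largest in absolute terms.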
Regarding MOTHERDUCK_TOKEN: benchmark.sh (line 22ff) checks that the variable is set. I already deleted the token used in the previous run (I think it was called "test"). For the new run, I used token "clickbench" in my MotherDuck account (mail address: [email protected]).
Anyway, I think we are good ... I'll merge. Thanks for the help.
Thank You for Your Contribution!
We appreciate your effort and contribution to the project. To ensure that your Pull Request (PR) adheres to our guidelines, please ensure to review the rules mentioned in our contribution guidelines:
ClickHouse/ClickBench Contribution Rules
Thank you for your attention to these details and for helping us maintain the quality and integrity of the project.