DuckDB: Add in-memory results #274

szarnyasg · 2024-11-27T16:15:25Z

This PR adds DuckDB v1.1.3 in-memory results for a run on a c6a.metal instance.

rschu1ze · 2024-11-27T20:21:51Z

duckdb-memory/query.py

+end = timeit.default_timer()
+print(end - start)
+
+with open('queries.sql', 'r') as file:


In the ClickBench repository,

- in duckdb-memory/, the load and query steps are combined (this file), whereas
- in duckdb/, the load and query steps are split (load.py and query.py).

I guess it would be easier to figure out the difference between both variants if this is made consistent?

@rschu1ze Thanks for the feedback!

The duckdb-memory implementation combines the load and the queries in a single script because DuckDB runs in-process: without on-disk persistence, the load and the queries have to run in the same process to keep the data in memory.
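A minimal sketch of that constraint, using Python's built-in sqlite3 module as a stand-in for DuckDB so that it runs without extra dependencies (this is not the PR's query.py). The table name, columns, and helper functions are hypothetical; only the `timeit.default_timer()` timing pattern is taken from the diff above.

```python
import sqlite3
import timeit


def run_in_memory_benchmark(rows, queries):
    """Load rows and time each query on one in-memory connection."""
    # An in-memory database lives only as long as this connection,
    # so the load and the queries must share it.
    con = sqlite3.connect(":memory:")

    start = timeit.default_timer()
    con.execute("CREATE TABLE hits (user_id INTEGER, url TEXT)")  # illustrative schema
    con.executemany("INSERT INTO hits VALUES (?, ?)", rows)
    load_seconds = timeit.default_timer() - start

    results = []
    for query in queries:
        start = timeit.default_timer()
        answer = con.execute(query).fetchall()
        results.append((answer, timeit.default_timer() - start))

    con.close()
    return load_seconds, results


def table_survives_reconnect():
    """Check whether a table created in memory is visible to a new connection."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE hits (user_id INTEGER)")
    con.close()

    fresh = sqlite3.connect(":memory:")
    found = fresh.execute(
        "SELECT count(*) FROM sqlite_master WHERE name = 'hits'"
    ).fetchone()[0]
    fresh.close()
    return found == 1
```

`table_survives_reconnect()` returns `False`: once the connection is closed the data is gone, which is why a separate load script followed by a separate query script cannot work for a purely in-memory variant.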

There are many DuckDB implementations for ClickBench now, so unifying them would make sense. The likely approach is a simple Python script that uses system calls (such as https://github.com/ClickHouse/ClickBench/pull/274/files#diff-1d8a1d4c9a1a7c7e98e5ac68a2ad1c33b9b7125f482922a040026bfbc2976cffR27) to enforce cache eviction. Would this approach work for the duckdb/ implementations?
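A rough sketch of what enforcing cache eviction from a Python script could look like on Linux. The exact call behind the link above is not reproduced here; the command below is an assumption based on the standard `/proc/sys/vm/drop_caches` interface, and the function name and `dry_run` flag are hypothetical.

```python
import subprocess

# Assumed eviction command: flush dirty pages, then ask the Linux kernel to
# drop the page cache, dentries, and inodes. Writing to drop_caches needs root.
DROP_CACHES_CMD = ["sudo", "sh", "-c", "sync && echo 3 > /proc/sys/vm/drop_caches"]


def drop_page_cache(dry_run=True):
    """Return the eviction command; execute it only when dry_run is False."""
    if not dry_run:
        # check=True raises CalledProcessError if the command fails.
        subprocess.run(DROP_CACHES_CMD, check=True)
    return DROP_CACHES_CMD
```

A benchmark driver would call `drop_page_cache(dry_run=False)` before each cold run, so that every implementation starts from the same cache state.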

rschu1ze · 2024-11-28T13:24:32Z

Yes, that approach would work and a unification would generally make sense. But let's do this separately (--> new PR).

rschu1ze · 2024-11-28T14:47:04Z

I was able to reproduce the measurements on a c6a.metal instance. Thanks.

DuckDB: Add in-memory results

9a2e8f7

rschu1ze reviewed Nov 27, 2024

rschu1ze approved these changes Nov 28, 2024

rschu1ze merged commit 25d51fa into ClickHouse:main Nov 28, 2024

szarnyasg deleted the duckdb-v1.1.3-in-memory branch November 29, 2024 09:07
