[Website] Aggregating Millions of Groups Fast in Apache Arrow DataFusion 28.0.0 #386
Closes apache/datafusion#6988
Note: This describes work @tustvold, @Dandandan, and I did in DataFusion 28.0.0. This content was originally published on the InfluxData Blog, but since it is generally applicable to Apache Arrow DataFusion I would like to syndicate it here because:
This is the same model we followed with https://arrow.apache.org/blog/2022/12/26/querying-parquet-with-millisecond-latency/, which was also republished on the Arrow blog after first appearing on the InfluxData blog
It also gives me a chance to use my original ASCII art diagrams :)