Re-enable SparseCompressor #2246

Open
a10y opened this issue Feb 5, 2025 · 0 comments

Comments

@a10y
Contributor

a10y commented Feb 5, 2025

Currently, SparseCompressor is disabled for most input types. It will ONLY accept an input that is already a SparseArray, in which case it re-compresses the patches.

Our SparseArray is more or less identical to the Frequency encoding from btrblocks (which is a simplified version of the Frequency encoding from DB2). As an example of where compressing with SparseArray would be useful, consider the following compression pathway we currently use in ClickBench for an i32 array with 999,777 instances of -1 and 223 instances drawn from some 103 unique values:

[Image: diagram of the current compression pathway]

While I don't have a solid benchmark, I can imagine this is causing a lot of extraneous overhead on the read pathway.
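
To make the idea concrete, here is a rough sketch of what a frequency/sparse encoding does with an array shaped like the one above. This is illustrative only: `sparse_encode`, `sparse_decode`, and `example_array` are hypothetical names, not the actual SparseArray or SparseCompressor API, and the positions and values of the 223 exceptions are synthetic.

```python
from collections import Counter


def sparse_encode(values):
    """Split a list into its most frequent value (the fill) plus patches.

    Hypothetical helper, not the real SparseArray layout: returns the
    original length, the fill value, and the positions and values of every
    element that differs from the fill.
    """
    if not values:
        raise ValueError("cannot encode an empty array")
    fill, _ = Counter(values).most_common(1)[0]
    indices = [i for i, v in enumerate(values) if v != fill]
    patches = [values[i] for i in indices]
    return {"len": len(values), "fill": fill, "indices": indices, "patches": patches}


def sparse_decode(enc):
    """Rebuild the original list: start from the fill, then apply patches."""
    out = [enc["fill"]] * enc["len"]
    for i, v in zip(enc["indices"], enc["patches"]):
        out[i] = v
    return out


def example_array():
    """Synthetic stand-in for the ClickBench column described above.

    1,000,000 values: 999,777 copies of -1 and 223 exceptions drawn from
    103 unique values. Exception positions and values are made up.
    """
    data = [-1] * 1_000_000
    for k in range(223):
        data[k * 4483] = (k % 103) + 1  # values 1..103, never -1
    return data
```

With this shape, the encoded form holds one fill value plus 223 indices and 223 patch values instead of a million elements, and the patches themselves (103 unique values) remain a good candidate for further compression.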
