-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Fractional-Power-of-Two Window Sizes in Compression #4275
Draft
felixhandte
wants to merge
9
commits into
facebook:dev
Choose a base branch
from
felixhandte:win-frac
base: dev
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Even though the `ZSTD_compressionParameters` struct is part of the unstable API, and therefore we are allowed to modify it, it seems likely that to actually do so would cause widespread chaos and bloodshed. So we are strongly incentivized to avoid changing it. We would, however, like to modify the CParams that are actually passed around internally. In particular, we want to be able to configure non-power-of-2 window sizes. So we need something more expressive than an integral window log. And we want this new field to be next to the window log, rather than hanging out in the CCtxParams, which are not passed to the block compressors. So, in order to support that, this commit: 1. Introduces a new struct, the `ZSTD_CParams`, that (for the moment) mirrors the definition of the public `ZSTD_compressionParameters` struct. 2. Codemods all internal use and storage of `cparams` internally to use the new struct definition. (The exception to this is the `ZSTD_parameters` struct, which is a public definition but is also passed around internally sometimes.) 3. Adds translation functions to convert the public and private struct defs to each other. 4. Uses those translation functions at the user API boundary (and when handling `ZSTD_parameters`).
This commit extends the refactor done in the previous and applies it to the `ZSTD_parameters` public struct, introducing an internal variant called `ZSTD_Params` whose only difference is containing a `ZSTD_CParams` instead of a `ZSTD_compressionParameters`. This commit similarly introduces conversion functions and rewrites internal functions to use the new, private definition. This allows us to clean up some internal conversions of the private `CParams` back into the public `CParams` in order to put them in a `ZSTD_parameters` struct to pass into something.
Store it in the CParams. Don't use it yet.
Using a blocksize of 100KB with a 65KB dictionary just *happens* to work with the current window-sizing logic, because it has to round all the way up to the next power of two. But it doesn't *have* to work. So this pre-emptively fixes up this test to use a size that will work after switching to tighter window sizing, but still uses a block size that is bigger than the dict size in case that matters for some reason?
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR adds support for configuring and performing compression with non-power-of-two window sizes.
The frame format and decoder have always supported more window sizes than just powers of two. The frame format defines a 3 bit window mantissa field in addition to the 4 bit window log descriptor. However, the compressor has only ever supported power-of-two windows. This PR changes that. In order to do that, it makes a number of changes:
windowFrac
field to the private CParams (but not to the public one, because we are scared of changing that struct definition.ZSTD_c_windowFrac
which allows users to set this param.To-Do: