Explore metadata shorthand for non-zero bit patterns #222

Open
tasket opened this issue Nov 14, 2024 · 0 comments

Comments

@tasket
Owner

tasket commented Nov 14, 2024

Wyng currently records all-zero chunks in the manifest without saving a corresponding data chunk file. This could be extended to chunks that consist entirely of one repeated 8-, 16-, or 32-bit pattern.

The send-time test for such chunks could be simple: check whether the chunk's first byte(s) repeat for the remainder of the chunk. For the vast majority of data chunks, this would finish after comparing only the first few bytes.
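
A rough sketch of what that test might look like in Python. The function name `repeated_pattern` and the `widths` parameter are made up for illustration and are not part of Wyng. It relies on the fact that a buffer has period `w` exactly when it equals itself shifted by `w` bytes, and it assumes the chunk length is a multiple of the pattern width.

```python
def repeated_pattern(chunk: bytes, widths=(1, 2, 4)):
    """Return the shortest 1-, 2-, or 4-byte unit that fills `chunk`, else None.

    Hypothetical helper for illustration only; not an existing Wyng function.
    """
    view = memoryview(chunk)  # slicing a memoryview does not copy the data
    for w in widths:
        # Skip widths that cannot tile the chunk evenly.
        if len(chunk) < w or len(chunk) % w:
            continue
        unit = chunk[:w]
        # Cheap early exit: most ordinary data differs within the first 2*w bytes.
        if len(chunk) >= 2 * w and chunk[w:2 * w] != unit:
            continue
        # Full check: the chunk has period w iff it equals itself shifted by w.
        if view[w:] == view[:-w]:
            return unit
    return None
```

An all-zero chunk comes back as `b"\x00"`, so the existing zero case would simply be one instance of the 8-bit pattern.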

Before considering implementation, scan some volumes to generate histograms of different patterns and pattern sizes to see if the space savings could be substantial.
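
A possible starting point for that scan, as a standalone sketch. The name `pattern_histogram` is hypothetical, and the default `chunk_size` of 128 KiB is only an assumed value that should be set to match the archive's actual chunk size. It reads any binary stream chunk by chunk and counts which repeated patterns occur, keyed by pattern width in bits and the pattern's hex value.

```python
from collections import Counter


def pattern_histogram(stream, chunk_size=128 * 1024):
    """Count chunks that consist of one repeated 8-, 16-, or 32-bit pattern.

    Hypothetical helper for illustration; `chunk_size` is an assumed default.
    Returns (counts, total_chunks), where counts maps (bits, hex_pattern) to
    the number of chunks filled with that pattern.
    """
    counts = Counter()
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += 1
        view = memoryview(chunk)
        for w in (1, 2, 4):
            # A chunk has period w iff it equals itself shifted by w bytes.
            if len(chunk) > w and len(chunk) % w == 0 and view[w:] == view[:-w]:
                counts[(w * 8, chunk[:w].hex())] += 1
                break  # record only the shortest matching width
    return counts, total
```

Run against a volume opened in binary mode, `counts.most_common()` lists the dominant patterns, and comparing the non-zero entries with `total` gives a first estimate of how many extra chunk files could be skipped.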

@tasket tasket added this to the v1.0 milestone Nov 17, 2024