-
Do you have any high-level documentation on how S2 compresses data? I can't find any papers published by Google on the topic and your implementation appears to be the only one available to the public. I built a flame graph of
S2 is said to detect and skip over data that isn't compressible, and the way this is described makes me believe Snappy never had this sort of capability. But when I examined Google's Snappy C code, I could see heuristic match skipping support in their encoder. I suspect this is further enhanced in S2 to the point that it's worth mentioning as its own feature. It would be great to learn which techniques S2 improves upon compared to Snappy.
Replies: 1 comment 2 replies
-
Compression formats are defined by their decompression algorithm. S2 is defined by the Snappy format (blocks/frames) with 3 minor changes. You can find an implementation in the Go decoder.
The assembler is generated from pseudo-assembler using avo.
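To make "defined by their decompression algorithm" concrete, here is a minimal sketch of a decoder for a plain Snappy block, written from the published Snappy format description. It does not handle framing or any of the S2 changes, and `decodeSnappyBlock` and `errCorrupt` are names made up for this example, not part of any package.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// errCorrupt is a hypothetical error value used only by this sketch.
var errCorrupt = errors.New("corrupt block")

// decodeSnappyBlock decodes one raw Snappy block (no framing, no S2 extensions).
func decodeSnappyBlock(src []byte) ([]byte, error) {
	// Preamble: the uncompressed length as a uvarint.
	want, n := binary.Uvarint(src)
	if n <= 0 {
		return nil, errCorrupt
	}
	src = src[n:]
	var dst []byte
	for len(src) > 0 {
		tag := src[0]
		src = src[1:]
		var length, offset int
		switch tag & 3 {
		case 0: // literal: copy bytes straight from the input
			length = int(tag>>2) + 1
			if extra := length - 60; extra > 0 {
				// Upper-bit values 60..63 mean the real length-1 follows in 1..4 bytes.
				if len(src) < extra {
					return nil, errCorrupt
				}
				length = 0
				for i := extra - 1; i >= 0; i-- {
					length = length<<8 | int(src[i])
				}
				length++
				src = src[extra:]
			}
			if length > len(src) {
				return nil, errCorrupt
			}
			dst = append(dst, src[:length]...)
			src = src[length:]
			continue
		case 1: // copy, 1-byte offset: length 4..11, 11-bit offset
			if len(src) < 1 {
				return nil, errCorrupt
			}
			length = 4 + (int(tag>>2) & 7)
			offset = int(tag>>5)<<8 | int(src[0])
			src = src[1:]
		case 2: // copy, 2-byte little-endian offset
			if len(src) < 2 {
				return nil, errCorrupt
			}
			length = 1 + int(tag>>2)
			offset = int(binary.LittleEndian.Uint16(src))
			src = src[2:]
		case 3: // copy, 4-byte little-endian offset
			if len(src) < 4 {
				return nil, errCorrupt
			}
			length = 1 + int(tag>>2)
			offset = int(binary.LittleEndian.Uint32(src))
			src = src[4:]
		}
		if offset <= 0 || offset > len(dst) {
			return nil, errCorrupt
		}
		// Copy byte by byte: source and destination may overlap.
		for i := 0; i < length; i++ {
			dst = append(dst, dst[len(dst)-offset])
		}
	}
	if uint64(len(dst)) != want {
		return nil, errCorrupt
	}
	return dst, nil
}

func main() {
	// Hand-built block: length 9, literal "abc", then copy 6 bytes from offset 3.
	block := []byte{0x09, 0x08, 'a', 'b', 'c', 0x09, 0x03}
	out, err := decodeSnappyBlock(block)
	fmt.Printf("%s %v\n", out, err) // prints "abcabcabc <nil>"
}
```

Any encoder that produces input this loop accepts is a valid encoder, which is why the encoder can change freely without changing the format.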
I am not entirely sure what you are referring to. When I reference snappy, I mean the Go package. I believe the skipping is in there, after I convinced the devs a long time ago. I am pretty sure I was also the one who convinced them to add it to the C version, but I can't find that information anywhere. Please point me to where I state that the Go package doesn't have it, since that is outdated. EDIT: Oh, they mentioned it in the commit.
The skipping matters because many compression users are hesitant to use compression on already compressed material, since many older compressors become very slow on incompressible input. Here it is no big deal, since the compressor will quickly skip it, so it is important developer information. To see the current encoders, you can reference the Go versions: default, better, best. The snappy compatible versions are in the same files.