Performance
gtoubassi edited this page Jul 26, 2021 · 22 revisions
Note: as of 7/2021, if you are optimizing for performance you may want to consider Zstd, which is optimized for speed and supports dictionary compression. It does not appear to be as effective as FemtoZip at dictionary compression of small payloads (see Tutorial).
As of 4/5/2011, performance of FemtoZip vs gzip/deflate and gzip/deflate+dictionary:
| Algorithm | Compression time (ms) | Decompression time (ms) | Compression ratio |
|---|---|---|---|
| FemtoZip | 439 | 57 | 31.58% |
| FemtoZip Level 3 (faster) | 240 | 75 | 32.92% |
| FemtoZip No Dict* | 189 | 116 | 94.65% |
| GZip | 340 | 98 | 92.92% |
| GZip+Dict | 2998 | 382 | 53.08% |
| Pure Java FemtoZip | 812 | 263 | 31.38% |
| JNI FemtoZip | 659 | 232 | 31.58% |
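To make the comparisons easier to read, the timings can be expressed as factors relative to a baseline row. The following is a minimal Python sketch, not part of the FemtoZip scripts: the numbers are copied from the table, and the `results` and `relative_to` names are made up for illustration.

```python
# Timings (milliseconds) and compression ratios (%) copied from the table.
# Each value is (compression ms, decompression ms, compression ratio %).
results = {
    "FemtoZip": (439, 57, 31.58),
    "FemtoZip Level 3 (faster)": (240, 75, 32.92),
    "FemtoZip No Dict": (189, 116, 94.65),
    "GZip": (340, 98, 92.92),
    "GZip+Dict": (2998, 382, 53.08),
    "Pure Java FemtoZip": (812, 263, 31.38),
    "JNI FemtoZip": (659, 232, 31.58),
}


def relative_to(baseline):
    """Return {name: (compress factor, decompress factor)} relative to `baseline`.

    A factor below 1.0 means the algorithm took less time than the baseline.
    """
    base_c, base_d, _ = results[baseline]
    return {
        name: (round(c / base_c, 2), round(d / base_d, 2))
        for name, (c, d, _) in results.items()
    }


if __name__ == "__main__":
    for name, (c, d) in relative_to("GZip").items():
        print(f"{name:28s} compress x{c:<5} decompress x{d}")
```

With GZip as the baseline, default FemtoZip comes out at roughly 1.29 times the compression time and 0.58 times the decompression time.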
- FemtoZip No Dict (marked * in the table) is simply FemtoZip with no dictionary. Although this should not occur in practice, it gives an idea of how the core algorithm performs compared with vanilla gzip (comparing FZ with GZip+Dict isn't a great apples-to-apples comparison because GZip+Dict performs so poorly on the compression side).
- FemtoZip is faster on decompression. This is attributed to the fact that windowing complexity is eliminated and, more importantly, that a Huffman tree does not have to be computed on the fly (in fact gzip computes Huffman trees two ways, custom and default, in order to compare storage tradeoffs, since the cost of a custom tree impacts the compression ratio).
- Default FemtoZip is faster than GZip+Dict on compression (439 ms vs 2998 ms), but slower than GZip (340 ms). The existence of a dictionary slows down compression because more matches need to be pursued. This is to be expected, but it gives an idea of practical compression performance vs vanilla GZip. FemtoZip No Dict shows what the core compression algorithm does without a dictionary, for a more apples-to-apples comparison. In this case FZ is faster (189 ms vs 340 ms).
- FemtoZip with a compressionLevel of 3 gives up very little compression ratio (32.92% vs 31.58%) and outperforms GZip on both compression (240 ms vs 340 ms) and decompression (75 ms vs 98 ms).
- The Java JNI interface needs a serious hosedown. Right now lots of data copies are happening between Java and native code. The performance should be much closer to raw FemtoZip than to pure Java.
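The effect of a preset dictionary on small payloads can be reproduced in miniature with Python's standard `zlib` module, which exposes deflate's dictionary support through the `zdict` argument. This is only a sketch under stated assumptions: the payloads and the dictionary below are invented for illustration, it is not the benchmark that produced the table, and it does not use FemtoZip itself.

```python
import time
import zlib

# Hypothetical dictionary: substrings the payloads are expected to share.
DICTIONARY = b'{"user_id": , "status": "active", "country": "US", "plan": "free"}'

# Hypothetical small payloads with a common structure (not benchmark data).
PAYLOADS = [
    b'{"user_id": %d, "status": "active", "country": "US", "plan": "free"}' % i
    for i in range(1000)
]


def deflate(data, zdict=None):
    """Compress one payload with zlib, optionally primed with a dictionary."""
    if zdict is None:
        comp = zlib.compressobj(level=9)
    else:
        comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(data) + comp.flush()


def inflate(blob, zdict=None):
    """Decompress one payload; the same dictionary must be supplied if one was used."""
    if zdict is None:
        decomp = zlib.decompressobj()
    else:
        decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()


def measure(zdict=None):
    """Return (compressed size / original size, compression time in ms)."""
    start = time.perf_counter()
    blobs = [deflate(p, zdict) for p in PAYLOADS]
    elapsed_ms = (time.perf_counter() - start) * 1000
    # Round-trip check so the ratio is only reported for valid output.
    assert all(inflate(b, zdict) == p for b, p in zip(blobs, PAYLOADS))
    ratio = sum(map(len, blobs)) / sum(map(len, PAYLOADS))
    return ratio, elapsed_ms


if __name__ == "__main__":
    for label, zdict in (("deflate", None), ("deflate+dict", DICTIONARY)):
        ratio, ms = measure(zdict)
        print(f"{label:14s} ratio {ratio:.2%}  compress {ms:.1f} ms")
```

Each payload is compressed independently, which is the small-payload case this page is concerned with. Without a dictionary there is almost nothing for deflate to match against, so the output stays close to the input size. Timings will vary by machine and are not comparable to the numbers in the table.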
To run the performance tests yourself, assuming your femtozip source clone is in ~/femtozip:

    % cd ~/femtozip/scripts
    % rm -rf data
    % ./perfGenData data
    % ./perfBuildModels data
    % ./perfRun data
    % rm -rf data