Commit

Editorial changes
gwlucastrig committed Nov 22, 2024
1 parent 797e85d commit 635c055
Showing 1 changed file with 8 additions and 6 deletions.
14 changes: 8 additions & 6 deletions docs/notes/EntropyMetricForDataCompressionCaseStudies.html
@@ -55,7 +55,8 @@ <h1>Introduction</h1>
<a href="https://en.wikipedia.org/wiki/Huffman_coding">Huffman coding</a> and
<a href="https://en.wikipedia.org/wiki/Deflate">Deflate</a>. It also supports the use
of experimental or application-specified data compression techniques, but those are
-outside the scope of this article.</p>
+outside the scope of this article. For more information on Gridfour's data compression
+implementations, see our article on <a href="GridfourDataCompressionAlgorithms.html">algorithms for raster data compression</a>.</p>

<h2>The entropy calculation</h2>
<p>The entropy values reported in the discussion below is based on Claude Shannon's first-order
@@ -194,15 +195,16 @@ <h2>Statistical variation over the domain of a data set</h2>
Deflate: 64 % of all tiles
Huffman: 36 % "" ""
Average entropy of source data as selected for method:
-Deflate: 4.89 bits/elevation
-Huffman: 3.35 "" ""
+Deflate: 6.16 bits/elevation
+Huffman: 5.83 "" ""
</pre>

<p>The fact that the Huffman method is selected so often may seem counterintuitive
because the Deflate technique usually out performs Huffman by a substantial margin.
-In this case, it appears that Huffman can produce smaller output than Deflate
-for subsets of the data product where the source entropy is low and the tendency
-of the data to exhibit repetitive patterns is reduced.</p>
+The reason for this unexpected result is not clear at this time and will
+be the subject of further investigation.
+Since Deflate relies on patterns in the data, it is likely that Huffman is
+preferred in tiles where repetitive patterns were absent or difficult to detect.</p>

<p>For more information on the data compression logic used to produce these results, see
<a href="https://gwlucastrig.github.io/GridfourDocs/notes/GridfourDataCompressionAlgorithms.html">
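
Note on the entropy figures in this diff: the "bits/elevation" values are described in the article as first-order Shannon entropy. A minimal sketch of that metric follows, assuming it means H = -sum(p_i * log2(p_i)) over the observed frequency of each distinct sample value in a tile. The function name `first_order_entropy` is hypothetical and is not part of the Gridfour API; this is an illustration, not the project's implementation.

```python
import math
from collections import Counter


def first_order_entropy(symbols):
    """Return the first-order Shannon entropy of a sequence, in bits per symbol.

    Assumption: each symbol is treated as independent of its neighbors,
    so only the frequency of each distinct value matters.
    """
    counts = Counter(symbols)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # an empty input carries no information
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Because this metric ignores the order of the samples, two tiles with the same value histogram get the same entropy even if one contains long repeated runs that a pattern-based coder such as Deflate could exploit and the other does not.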
