Upd

purplesyringa · Aug 24, 2024 · 8c2fa74
1 parent b3aa955
commit 8c2fa74
Showing 2 changed files with 2 additions and 2 deletions.
diff --git a/blog/division-is-hard-but-it-does-not-have-to-be/index.html b/blog/division-is-hard-but-it-does-not-have-to-be/index.html
@@ -135,7 +135,7 @@
     }
     quotient
 }
-</code></pre><p>I’m also going to compare to the <a href=https://lib.rs/strength_reduce/><code>strength_reduce</code> crate</a> to simulate the same optimizations that compilers perform with <code>u64 % u32</code>. I’m compiling with <code>-C target-cpu=native</code>.<div class=table-wrapper><table><thead><tr><th>Test<th>Time/iteration (ns)<th>Speedup<tbody><tr><td><code>modulo_naive</code><td>25.440<td>(base)<tr><td><code>modulo_strength_reduce</code><td>4.9672<td>5.1x<tr><td><code>modulo_optimized</code><td>2.5847<td>9.8x<tr><td><code>reduce</code><td>2.1746<td>11.7x<tr><td><code>divide_naive</code><td>25.460<td>(base)<tr><td><code>divide_strength_reduce</code><td>5.4451<td>4.7x<tr><td><code>divide_optimized</code><td>2.7730<td>9.2x</table></div><p class=next-group><span class=side-header><span>So what?</span></span>In all honesty, this is not immediately useful when applied to rolling hashes. <code>reduce</code> is still a little slower than two <code>u64 % u32</code> computations, so if calculating the hash modulo two 32-bit primes rather than one 64-bit prime suffices for you, do that. Still, if you need the best guaranteed collision rate as fast as possible, this is the way.<p>It’s a free optimization for compilers to perform too. It’s quite possible that I’m not just unfamiliar with practical applications. Also, hey, it’s one more trick you might be able to apply elsewhere now that you’ve seen it.</div></section><footer><div class=viewport-container><h2>Made with my own bare hands (why.)</h2></div></footer><script>window.addEventListener("keydown", e => {
+</code></pre><p>I’m also going to compare to the <a href=https://lib.rs/strength_reduce/><code>strength_reduce</code> crate</a> to simulate the same optimizations that compilers perform with <code>u64 % u32</code>. I’m compiling with <code>-C target-cpu=native</code>.<div class=table-wrapper><table><thead><tr><th>Test<th>Time/iteration (ns)<th>Speedup<tbody><tr><td><code>modulo_naive</code><td>25.440<td>(base)<tr><td><code>modulo_strength_reduce</code><td>4.9672<td>5.1x<tr><td><code>modulo_optimized</code><td>2.5847<td>9.8x<tr><td><code>reduce</code><td>2.1746<td>11.7x<tr><td><code>divide_naive</code><td>25.460<td>(base)<tr><td><code>divide_strength_reduce</code><td>5.4451<td>4.7x<tr><td><code>divide_optimized</code><td>2.7730<td>9.2x</table></div><p class=next-group><span class=side-header><span>So what?</span></span>In all honesty, this is not immediately useful when applied to rolling hashes. <code>reduce</code> is still a little slower than two <code>u64 % u32</code> computations, so if calculating the hash modulo two 32-bit primes rather than one 64-bit prime suffices for you, do that. Still, if you need the best guaranteed collision rate as fast as possible, this is the way.<p>It’s a free (if conditional) optimization for compilers to perform too. It’s quite possible that I’m not just unfamiliar with practical applications. Also, hey, it’s one more trick you might be able to apply elsewhere now that you’ve seen it.</div></section><footer><div class=viewport-container><h2>Made with my own bare hands (why.)</h2></div></footer><script>window.addEventListener("keydown", e => {
 				if (e.ctrlKey && e.key === "Enter") {
 					window.open("https://github.com/purplesyringa/site/edit/master/blog/division-is-hard-but-it-does-not-have-to-be/index.md", "_blank");
 				}

diff --git a/blog/division-is-hard-but-it-does-not-have-to-be/index.md b/blog/division-is-hard-but-it-does-not-have-to-be/index.md
@@ -253,4 +253,4 @@ I'm also going to compare to the [`strength_reduce` crate](https://lib.rs/streng
 
 In all honesty, this is not immediately useful when applied to rolling hashes. `reduce` is still a little slower than two `u64 % u32` computations, so if calculating the hash modulo two 32-bit primes rather than one 64-bit prime suffices for you, do that. Still, if you need the best guaranteed collision rate as fast as possible, this is the way.
 
-It's a free optimization for compilers to perform too. It's quite possible that I'm not just unfamiliar with practical applications. Also, hey, it's one more trick you might be able to apply elsewhere now that you've seen it.
+It's a free (if conditional) optimization for compilers to perform too. It's quite possible that I'm not just unfamiliar with practical applications. Also, hey, it's one more trick you might be able to apply elsewhere now that you've seen it.