From b5a8468a8eee768ffd1d6f7df67edd5489980975 Mon Sep 17 00:00:00 2001
From: Alisa Sireneva
Date: Sat, 24 Aug 2024 08:52:52 +0300
Subject: [PATCH] A note on smaller divisors

---
 blog/division-is-hard-but-it-does-not-have-to-be/index.html | 2 +-
 blog/division-is-hard-but-it-does-not-have-to-be/index.md | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/blog/division-is-hard-but-it-does-not-have-to-be/index.html b/blog/division-is-hard-but-it-does-not-have-to-be/index.html
index 3e541f3..9b2cbca 100644
--- a/blog/division-is-hard-but-it-does-not-have-to-be/index.html
+++ b/blog/division-is-hard-but-it-does-not-have-to-be/index.html
@@ -38,7 +38,7 @@ lea rcx, [rax + 59] cmovb rax, rcx ret
-

Oh, and it’s not like hard-coding $2^{64} - 59$ was necessary. Two iterations suffice for any divisor $\ge 2^{64} - 2^{32} + 1$. Need more primes? Choose away, there’s a lot of them in the $2^{32}$-long region.

And this method works for division too, not just modulo:

fn divide(mut n: u128) -> u128 {
+

Oh, and it’s not like hard-coding $2^{64} - 59$ was necessary. Two iterations suffice for any divisor $\ge 2^{64} - 2^{32} + 1$. Need more primes? Choose away, there’s a lot of them in the $2^{32}$-long region.

Need a smaller divisor? Three iterations work for $n \ge 2^{64} - 6981461082631$ (42.667 bits compared to 32 for two iterations), four for $n \ge 2^{64} - 281472113362716$ (48 bits). Sounds like a lot? That’s still better than __umodti3.

And this method works for division too, not just modulo:

fn divide(mut n: u128) -> u128 {
     let mut quotient = n >> 64;
     n = n % (1 << 64) + (n >> 64) * 59;
     quotient += n >> 64;
diff --git a/blog/division-is-hard-but-it-does-not-have-to-be/index.md b/blog/division-is-hard-but-it-does-not-have-to-be/index.md
index 32acfee..f199aef 100644
--- a/blog/division-is-hard-but-it-does-not-have-to-be/index.md
+++ b/blog/division-is-hard-but-it-does-not-have-to-be/index.md
@@ -83,6 +83,8 @@ modulo:
 
 Oh, and it's not like hard-coding $2^{64} - 59$ was necessary. Two iterations suffice for any divisor $\ge 2^{64} - 2^{32} + 1$. Need more primes? Choose away, there's a lot of them in the $2^{32}$-long region.
 
+Need a smaller divisor? Three iterations work for $n \ge 2^{64} - 6981461082631$ (42.667 bits compared to 32 for two iterations), four for $n \ge 2^{64} - 281472113362716$ (48 bits). Sounds like a lot? That's still better than `__umodti3`.
+
 And this method works for division too, not just modulo:
 
 ```rust