-
Notifications
You must be signed in to change notification settings - Fork 38
Basic Theory and Facts
Arbitrary-precision arithmetic requires some basic theory to understand. This page aims to give you enough to get started with working on Ramp itself. It is not required to use the library, as these details are hidden from the user.
There are also some facts that are generally useful for working on Ramp.
Before we start, I will explain some notation and conventions that will be used here and other pages. This is simply to make talking about these topics easier. When talking about numbers, a subscript will indicate the base the number is in, for example 195010 is in base-10 (decimal), while A0B16 is in base-16 (hexadecimal). If you don’t understand what I mean by "base", don’t worry, that will be explained shortly. If no base is shown, then it can be assumed to be 10.
It is useful to talk about numbers as sequences of digits, they will be written as: (xn, xn-1, …, x1)b, indicating an n-digit number in base b.
When we write down numbers, we are actually using a specific notation to represent that number. The numeral system we are most used to is called a "positional numeral system" due to it representing the magnitude of a digit by it’s position in the number. Positional numeral systems require a "base", also known as a radix, to work from though. The most commonly used base is 10, also known decimal, however base-2 (binary), base-8 (octal) and base-16 (hexadecimal) are commonly used in programming.
The number 195010 represents the result of 1 * 103 + 9 * 102 + 5 * 101 + 0 x 100. Each digit in the number is multiplied by the base taken to the power of it’s position in the digit. Note that for a given base b, each digit can take on b values.
Ramp uses a base dependent on the target platform. Specifically, it uses a base equal to the number of unique values you can store in a machine word. In practice this means that Ramp uses a base of 232 on a 32-bit platform and a base of 264 on a 64-bit platform, this base will be normally be noted as "B". To distinguish the incredibly large base Ramp uses from regular bases, Ramp uses the term "Limb" to refer to a single base-B digit.
Usefully, you can use the number of digits in a number in base-b to approximate the logb of the number. Importantly, the number of digits, n in a number N is strictly greater than logb(N). This can be shown be observing that logb(bn) = n, by definition, and that the largest number representable in an n-digits is bn - 1. As bn > bn - 1, it follows that logb(bn) > logb(bn - 1) >= logb(N) and thus n > logb(N). The approximation is also bounded by n ⇐ logb(N) + 1, this can be shown by observing that the smallest number that requires n digits is bn-1. Thus, logb(bn-1) = n - 1 and therefore logb(bn-1) ⇐ logb(N) → n - 1 ⇐ logb(N) and finally, n ⇐ logb(N) + 1.
For the primary arithmetic operations, the number of digits in the result can be approximated based on the number of digits in the inputs. These sizes are upper bounds, however, and the real result may be much smaller.
Addition: Adding an m-digit number to an n-digit number will produce at most max(m, n) + 1 digits in the result. This is due to how numbers are represented in positional numeral systems: for any base, b, the largest value a single digit can represent is b - 1, thus when adding two digits together, the largest possible result is 2b - 2. This no longer fits in a single digit, instead becoming (1, b - 2)b (converting this back gives us 1 * b1 + (b - 2) * b0 = b + (b - 2) = 2b - 2, or the answer we started with). Extending this out to multiple digits, we can see that adding two numbers by adding the matching digits plus the extra digit from the previous pair will produce a maximum value of b - 1 in that place and no more than 1 in the next place.
Subtraction: Assuming two natural numbers, then the result of subtracting an m-digit, M, number from an n-digit number, N, where M < N, will produce a result with at most n digits. This is clear from the definition, as the result is positive, but must be smaller than N and therefore representable in n or fewer digits.
Multiplication: Multiplying an m-digit number by an n-digit number will produce at most m + n digits in the result. This simplest way to show this is to use the number of digits in the number as an approximation to the logb of the number. Using the identity that logb(x * y) = logb(x) + logb(y) we can see that the maximum number of digits in the result is m + n.
Division: Dividing an m-digit number, M, by an n-digit number, N, where M >= N produces a result with at most (m - n) + 1 digits. To see why, we can use the approximation of the logb of the number and the identity logb(x/y) = logb(x) - logb(y). As the number of digits is strictly greater than the logb of the number, the approximation may under-estimate the number of digits. As the largest error comes from the error of logb(N) being at it’s maximum and the error of logb(M) being at it’s minimum, we can see that the maximum error of our approximation is less than or equal to 1. From above, the maximum of approx(logb(N)) is logb(N) + 1, while the minimum of approx(logb(M)) is logb(M). Thus, the maximum error is (logb(M) - logb(N)) - (logb(M) - logb(N) + 1) which simplifies to -1, so we need to add an extra 1 to our approximation to ensure that we never under-estimate the number of digits.