From 935b83ee16db8b9c090791817045351f0715ab51 Mon Sep 17 00:00:00 2001
From: Sam Thorogood
Date: Tue, 28 Apr 2020 15:31:09 +1000
Subject: [PATCH] clarify ratio

---
 bench/README.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/bench/README.md b/bench/README.md
index b337c71..ba3537d 100644
--- a/bench/README.md
+++ b/bench/README.md
@@ -10,11 +10,12 @@ Usage:
 
 If you don't provide a source file, or specify a length instead, this will generate actual random text in JavaScript.
 For a better test, use suggested UTF-8 encoded source text from [Project Gutenberg](https://www.gutenberg.org/files/23841/23841-0.txt).
-This has a ratio of "bytes-to-length" of 0.35.
+The linked file has a ratio of "bytes-to-length" of 0.35.
 
-This is an odd number, but we're comparing the on-disk UTF-8 bytes (which optimize for ASCII and other low Unicode values) to the length of JavaScript's UCS-2 / UTF-16 internal representation.
+This ratio is an odd number.
+It compares the on-disk UTF-8 bytes (which optimize for ASCII and other low Unicode values) to the length of JavaScript's UCS-2 / UTF-16 internal representation.
 All Unicode code points can be represented as either one or two "lengths" of a JavaScript string, but each code point can be between 1-4 bytes in UTF-8.
-The possible ratios therefore range from 0.25 (e.g., all emoji) through 1.0 (e.g., ASCII).
+The valid ratios therefore range from ⅓ through 1.0 (e.g., ASCII).
 
 # Options
 
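
Review note: the corrected lower bound can be checked with a small sketch. The README labels the ratio "bytes-to-length", but the quoted values (0.35, ⅓ through 1.0) match the JavaScript string length divided by the UTF-8 byte count, so that is what is computed here. `lengthToBytesRatio` is a hypothetical helper for illustration, not part of the benchmark, and it assumes a runtime with a global `TextEncoder` (modern browsers and recent Node.js).

```javascript
// Hypothetical helper, not part of the bench code:
// UTF-16 code units (String.prototype.length) divided by UTF-8 bytes.
// An empty string yields NaN (0 / 0); callers should guard against that.
function lengthToBytesRatio(text) {
  const bytes = new TextEncoder().encode(text).length;
  return text.length / bytes;
}

// ASCII: 1 code unit and 1 byte per code point, so the ratio is 1.
console.log(lengthToBytesRatio('hello'));

// Most CJK characters: 1 code unit and 3 bytes each, so the ratio is 1/3.
console.log(lengthToBytesRatio('日本語'));

// Emoji outside the BMP: 2 code units and 4 bytes each, so the ratio is 0.5.
console.log(lengthToBytesRatio('😀😀'));
```

Code points that take 4 bytes in UTF-8 also take two UTF-16 code units, so they give 0.5 rather than 0.25. The minimum comes from 3-byte code points in the Basic Multilingual Plane, which is consistent with the ⅓ bound in the patch.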