New functionality that supports a more straight-forward and technically justifiable entropy bound argument #93
Open: joshuaehill wants to merge 3 commits into smuellerDD:master from joshuaehill:MemOnlyPR
1. Separation of the memory timing and overall timing noise sources. These have always been presented as conceptually distinct but were combined within any analysis because the raw data reflected both.
2. Addition of some new functionality that enables a more conservative mode that provides a higher assurance of a specific security level. These changes essentially support a more straight-forward and technically justifiable entropy bound argument.
3. General clean up and "cruft" removal. Changes here include the removal of optional functionality that seems to be present for historical reasons, but either can't be used any longer or are non-default functionality that probably ought not be used.

1. Noise source:
   - Make the timing of the `jent_memaccess()` function the primary noise source.
     - This source's distribution is easier to analyze than the overall timing.
     - Overall timing (including the hashing operation) is treated as an additional noise source.
     - All health testing is on this data, and the raw data output for testing is now this timing.
   - Allow a raw noise source value range to be selected so that a specified sub-distribution is uniformly used.
     - Only data within this range is counted as being output from the primary noise source.
     - All sampled data is sent into the conditioning function.
     - Decimated data and data from different sub-distributions is treated as supplemental data.
   - Probabilistically decimate the data to account for statistical memory depth (including a pseudo-random component of delay to increase independence).
   - Use the "volatile" keyword in a few places to prevent the compiler from reordering of the timing and memory updates within this source.
   - Use compiler intrinsic for IA64 TSC access:
     - Change to rdtscp and _mm_lfence()
   - Pseudo-random values used to establish the level of decimation are taken from a PRNG, not an ad hoc method.
     - Move to xoshiro256** so that this mechanism is usable by jent_loop_shuffle.
   - Simplify `mem` allocation:
     - Track the size of `mem` through its exponent (`2^memsize_exp` is allocated).
     - Make it possible to compile in a specific memory size request using the `JENT_MEMORY_SIZE_EXP` macro.
       - This compiled in default can be overridden by a flag.
     - Treat `JENT_MAX_MEMSIZE_*` macros as maximums and `JENT_MEMSIZE_*` as memory size requests.
       - The `JENT_MAX_MEMSIZE_*` flags are now handled as a fixed maximum size.
         - Previously, this value could be increased automatically in `jent_update_memsize_exp()`.
         - This increase wouldn't consistently cause an increased allocation.
         - Such increases now change the requested memory through use of a different `JENT_MEMSIZE_*` flag.
       - Remove 32kB flag (it seems unlikely that any modern CPU isn't going to cache such a region!) and add a 1GB flag.
     - Cause `jent_cache_size()` to return the cache size, rather than `jent_cache_size_roundup`, which returned the nearest power of 2 above the cache size.
     - By default (in the absence of guidance in flags or a compiled-in macro) allocate approximately 8*cache_size for the jent_memaccess() noise source.
       - This size forces most memory updates to resolve to RAM I/O.
       - This is consistent with the Worked Example presented.
2. Conditioning:
   - Make it more clear what data is primary noise source output vs additional noise source output vs supplemental data.
   - Split `jent_hash_time()` into two functions
     - `jent_hash_time()`: Processes the raw data from the primary noise source.
       - Data that is decimated and data from other sub-distributions is treated as supplemental data.
     - `jent_hash_additional()`: Processes the data from an additional noise source and supplemental data.
   - Remove ability to disable PRNG use.
     - The primary raw noise source data is not dominated by the result of PRNG output, so fixing this output is not necessary for analysis.
   - Create data using the PRNG rather than internal state for hashing in `jent_hash_additional()`.
     - The timing of this hash contributes to the additional noise source.
     - The result of the hash (the `intermediary` buffer) is fed into the conditioning function (as before) but now essentially provides a nonce.
3. Health tests:
   - Make a new health test to check if the specified sub-distribution has become too infrequent.
4. Testing scripts and tools:
   - jitterentropy-hashtime:
     - Remove extra tests (ec_min).
     - Store and output smaller binary data types
       - This reduces the memory impact in testing.
     - Print configuration and performance information.
   - jitterentropy-rng:
     - Print configuration and performance information.
   - Added a few scripts for testing.
5. Documentation:
   - Update testing/raw-entropy/README.md to provide a detailed rationale and assessment Worked Example.
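The default sizing rule in the change list above (allocate approximately 8*cache_size, tracked as an exponent and capped by a maximum) could look roughly like the following. This is only a sketch under stated assumptions: `sketch_memsize_exp` and its parameters are hypothetical names, not functions from the library, and rounding up to the next power of two is one possible reading of "approximately".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the library): return the smallest
 * exponent e such that 2^e >= 8 * cache_size, capped at max_exp.
 * Rounding up to a power of two is an assumption of this sketch.
 */
unsigned int sketch_memsize_exp(size_t cache_size, unsigned int max_exp)
{
	assert(cache_size > 0);
	assert(cache_size <= SIZE_MAX / 8);      /* 8 * cache_size must not overflow */
	assert(max_exp < sizeof(size_t) * 8);    /* shift below must stay defined */

	size_t target = cache_size * 8;
	unsigned int exp = 0;

	/* Grow the exponent until 2^exp covers the target or the cap is hit. */
	while (exp < max_exp && ((size_t)1 << exp) < target)
		exp++;

	return exp;
}
```

With a 1 MiB cache and a cap of 30, this yields an exponent of 23 (8 MiB); with a cap of 20 the request is limited to 2^20 bytes.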
On Thursday, 15 September 2022, 03:04:33 CEST, Joshua E. Hill wrote:
Hi Joshua,
I am delighted to see this excellent update! Thanks a lot.
I have to go through the implementation in detail before pushing it.
Yet, one change is a bit difficult: the use of the intrinsic header file. This
header file is not a default in POSIX environments. The
jitterentropy-base-user.h is intended for common POSIX environments.
We can surely add this with an ifdef though.
Ciao
Stephan
On the intrinsic function use, it is surely possible to do the same with inline assembly, I'm just not conversant in that area. I was just trying to get my compiler to produce the instructions that I wanted for testing with minimal fuss. 😅
…mmented on an assessment approach for the prior JEnt behavior. Applied changes in response to comments from @swkeypair and @hlarsen-os.
…stated per 10000 (the Distribution health test window size) rather than per 1000. Corrected calculation to better attain the stated false positive rate.
Pull Request Summary
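We propose a set of changes to the JEnt library that fall into three broad types: separating the memory timing and overall timing noise sources, adding a more conservative mode that provides a higher assurance of a specific security level, and general clean up and "cruft" removal.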
Pull Request Rationale
This set of changes increases both the level of security and assurance over the base JEnt behavior.
The overall timing of the entropy source behavior no longer influences the raw output of the primary noise source (Note: this timing is still included as additional noise source output, but is not credited as contributing entropy). This means that the (quite complicated) OS and other system behavior no longer leads to a family of overlapping sub-distributions in the noise source data. It is difficult to meaningfully bound the min entropy of data from systems displaying such radically diverse behavior, and it seems likely that an active attacker could influence the entropy source so that a selected sub-distribution became dominant. With this new behavior there is no longer any need to fully characterize the (likely very complicated) complete emergent system behavior.
With this new JEnt behavior the underlying variation that is being assessed is essentially governed by a small amount of hardware. As such, unrelated software changes are unlikely to significantly alter this behavior, so an assessment on a system using this new behavior is expected to be more durable than an assessment of the base JEnt behavior. In comparison, with the base JEnt behavior, the observed emergent behavior could be significantly impacted by apparently unrelated software or loading changes.
Even the relatively simple memory I/O timing used by the primary noise source in the new JEnt behavior can resolve in a number of ways (predominantly L1 cache, L2 cache, L3 cache, and RAM I/O), each with its own sub-distribution. In the updated JEnt behavior we are able to identify the specific sub-distribution that we are interested in and configure the noise source so that it only outputs values from this sub-distribution. Once we have this relatively simpler timing distribution, health testing becomes dramatically more powerful and statistical assessment becomes dramatically more meaningful. This approach may also reduce the need for data translation: in the Worked Example, no translation was required to reduce the symbol size. With the base JEnt behavior the tester is directed to analyze the lower byte of the data, essentially superimposing possibly dozens of distinct sub-distributions (each of which resulted from the timing differences experienced during different sets of events, each with its own associated timing distribution!). Such translation approaches are unlikely to yield meaningful assessments when using the SP 800-90B estimators, as consecutive raw data symbols may have been drawn from completely different sub-distributions, each of which may have quite different behavior.
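The range selection described above, together with the idea of checking that the selected sub-distribution has not become too infrequent, can be sketched as follows. All names here (`sketch_range`, `sketch_is_primary`, `sketch_distribution_ok`) and the cutoff parameter are hypothetical and chosen for illustration; real bounds and cutoffs would have to come from an analysis of the platform being assessed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the selected sub-distribution (inclusive bounds). */
struct sketch_range {
	uint64_t lo; /* smallest timing delta treated as primary output */
	uint64_t hi; /* largest timing delta treated as primary output */
};

/*
 * Returns 1 if the timing delta falls inside the selected range (it would be
 * counted as primary noise source output), or 0 if it would only be passed
 * to the conditioning function as supplemental data.
 */
int sketch_is_primary(const struct sketch_range *r, uint64_t delta)
{
	assert(r != NULL && r->lo <= r->hi);
	return delta >= r->lo && delta <= r->hi;
}

/*
 * Counts the in-range samples in a window of n deltas and returns 1 (pass)
 * if at least `cutoff` of them are in range, otherwise 0 (the selected
 * sub-distribution has become too infrequent).
 */
int sketch_distribution_ok(const struct sketch_range *r,
			   const uint64_t *deltas, size_t n, size_t cutoff)
{
	size_t in_range = 0;

	assert(deltas != NULL);
	for (size_t i = 0; i < n; i++)
		in_range += (size_t)sketch_is_primary(r, deltas[i]);

	return in_range >= cutoff;
}
```

In this sketch every sample would still be fed to the conditioning function; the classification only decides whether a sample is credited as primary noise source output.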
The new JEnt behavior is much more conservative than can be configured in the base JEnt package. In the Worked Example, the setting `JENT_MEMORY_DEPTH_EXP = 11` leads to outputting one raw noise sample from the primary noise source per 3071.5 raw symbols sampled (on average), and for the Worked Example the entropy analysis supports the notion that there are 3.5 bits of min entropy per decimated output. This suggests that an undecimated data stream could instead be used so long as

$$osr \geq \left\lceil \frac{3071.5}{3.5} \right\rceil = 878$$

(as the Worked Example analysis supports an average claim of 1 bit of min entropy per 878 undecimated raw symbols). The base JEnt library does not support `osr` settings greater than 20.

The new JEnt behavior provides a library that is at least as good as the base JEnt behavior. The overall timing is still integrated into the conditioning function, it just isn't credited as providing entropy (it is considered output from an additional noise source). All data (including decimated data and data that is not from the identified sub-distribution) is similarly integrated as supplemental data provided to the conditioning function, but not credited as providing any entropy.
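The decimation arithmetic from the Worked Example figures above can be reproduced with a small sketch. It rests on an assumption that is not stated in the text: that the gap between primary outputs is drawn uniformly from [2^d, 2^(d+1) - 1] for d = `JENT_MEMORY_DEPTH_EXP`, which for d = 11 gives a mean of 3071.5. The PRNG step is the published xoshiro256** algorithm; every `sketch_*` name is hypothetical and not taken from the library.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_prng {
	uint64_t s[4]; /* state; must not be all zero */
};

static uint64_t sketch_rotl64(uint64_t x, int k)
{
	return (x << k) | (x >> (64 - k));
}

/* One step of xoshiro256** (Blackman and Vigna's published algorithm). */
uint64_t sketch_prng_next(struct sketch_prng *g)
{
	const uint64_t result = sketch_rotl64(g->s[1] * 5, 7) * 9;
	const uint64_t t = g->s[1] << 17;

	g->s[2] ^= g->s[0];
	g->s[3] ^= g->s[1];
	g->s[1] ^= g->s[2];
	g->s[0] ^= g->s[3];
	g->s[2] ^= t;
	g->s[3] = sketch_rotl64(g->s[3], 45);

	return result;
}

/* Assumed rule: gap = 2^depth_exp plus a pseudo-random value below 2^depth_exp. */
uint64_t sketch_next_gap(struct sketch_prng *g, unsigned int depth_exp)
{
	assert(depth_exp < 63);
	const uint64_t base = (uint64_t)1 << depth_exp;
	return base + (sketch_prng_next(g) & (base - 1));
}

/* Mean of a uniform gap on [2^d, 2^(d+1) - 1]; 3071.5 for d = 11. */
double sketch_mean_gap(unsigned int depth_exp)
{
	assert(depth_exp < 52);
	const uint64_t base = (uint64_t)1 << depth_exp;
	return (double)base + (double)(base - 1) / 2.0;
}

/* Smallest integer osr with osr >= symbols_per_output / bits_per_output. */
unsigned long sketch_min_osr(double symbols_per_output, double bits_per_output)
{
	assert(symbols_per_output > 0.0 && bits_per_output > 0.0);
	const double x = symbols_per_output / bits_per_output;
	unsigned long q = (unsigned long)x; /* truncates toward zero */

	if ((double)q < x)
		q++; /* round up when there is a fractional part */
	return q;
}
```

Under that assumption, `sketch_min_osr(sketch_mean_gap(11), 3.5)` evaluates to 878, matching the bound above.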
The underlying assumption of attacker non-predictability required to formally justify a particular security level with the new JEnt behavior applies in more circumstances than the corresponding assumption required for the base JEnt behavior. The assumption underlying any specific entropy claim for a software noise source of similar design is this: the variation that was characterized does not have any unexpected patterns and is unpredictable to an attacker. More specifically:
In the new JEnt behavior we have tried to limit the behavior that we are characterizing so that it is as much as possible established only by RAM I/O timings, a characteristic that has been observed as being unpredictable in many systems. Conversely, the raw data output from the base JEnt library is impacted by a wide variety of higher-level emergent timing behaviors, any of which could become dominant, and any of which may ultimately be more predictable than anticipated for a suitably well-informed active attacker.
We encourage anyone interested in this approach to review the Worked Example.