Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

provide new generators: urandom and getentropy #250

Merged
merged 21 commits into from
Jan 8, 2025

Conversation

hannesm
Copy link
Member

@hannesm hannesm commented Nov 27, 2024

to fix #249

@hannesm
Copy link
Member Author

hannesm commented Nov 27, 2024

There's the question how to annotate Fortuna to be racy. Should there be an [@@alert]?

Provide guidance to use these by default, document that Fortuna is not
thread-safe. As suggested in mirage#249
hannesm and others added 2 commits November 29, 2024 21:09
@hannesm
Copy link
Member Author

hannesm commented Nov 29, 2024

The failure semantics is now a bit sad: if for some reason the fd that has /dev/urandom open isn't able to read, we'll end up in an exception. previously with fortuna we were sure that once it is seeded, we'll never fail...

but i guess if /dev/urandom doesn't deliver anymore we have other problems, and it's fine to raise an exception.

since the pfortuna benchmark has been moved to bench/speed, remove bench/miou

Mirage_crypto_rng_unix: remove getrandom_into
@hannesm
Copy link
Member Author

hannesm commented Dec 6, 2024

I think this is fine to merge now.

@hannesm
Copy link
Member Author

hannesm commented Jan 5, 2025

@edwintorok happy new year! It'd be great if you could have a look at this PR -- together with a release and announcement on discuss.ocaml.org, I think this would solve #249. What do you think?

@edwintorok
Copy link
Contributor

@edwintorok happy new year! It'd be great if you could have a look at this PR -- together with a release and announcement on discuss.ocaml.org, I think this would solve #249. What do you think?

Happy New Year!

I started looking at the PR, in general it looks good, but is a complex topic with many (historic) OS specific bugs.

Perhaps it might be useful to add a sanity test that fills the buffer with a poison value, calls the RNG, and then checks that (most) bytes have been changed.
Although it is difficult to tell what a good threshold would be if we do this 256 times with each possible poison byte, then during at least one run we should detect that a cell has been changed.

I can try to submit such a test independently from this PR.

@hannesm
Copy link
Member Author

hannesm commented Jan 6, 2025

Dear @edwintorok thanks a lot for your review comments. I'll address them this week and likely cut a release thereafter.

@edwintorok
Copy link
Contributor

edwintorok commented Jan 6, 2025

BTW I started writing some RNG tests last year: https://github.com/perf101/rage/tree/private/edvint/stats6/test/rngtest
It was for a different purpose, I was looking for good quality uniform randomness, but that is a minimum requirement for cryptographic RNGs (of course non-cryptographic RNGs, like the one used by OCaml in 5.x are much faster, and actually have very good randomness properties, so that is what I picked for that particular project).

Although running the full suite takes several hours (and there are multiple testsuites where experts disagree which one is better, so I used them all: dieharder, TestU01, and practrand. And due to way tests work a "true" RNG is expected to fail these tests some of the time (in fact exactly as many times as its p value says, which can't be zero because then it won't reject bad RNGs either), and you can have a meta test that then looks at the distribution of these failures, but that needs to have a p-value to fail, and so on, so using this in a CI might be tricky, but a suitable p-value and a suitable number of repetitions for the "meta" test might result in something usable in a CI.
It should be possible to have a cut down version of those tests that finish in a couple of minutes that could be used as a sanity test in the CI, I'll see whether I can put together a PR for that (I currently lack the 'meta' testing for some of the tests).

One discovery I made during those tests was that calling RDRAND in parallel from lots of cores caused some of them to fail repeatedly, and IIRC RDSEED didn't have that problem. Not relevant for this PR, but perhaps something to keep in mind if you boot lots of mirage unikernels all at the same time and they generate a lot of random data.

@hannesm
Copy link
Member Author

hannesm commented Jan 7, 2025

CI failures:

OCaml-CI:

  • openbsd: some core_unix/base_unix issues, won't fix
  • debian 5.3 beta 1: async_rpc_kernel compilation issue, won't fix
  • windows 5.3:
File "tests/dune", line 72, characters 26-54:
72 |  (names     test_miou_rng test_miou_entropy_collection)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(cd _build/default/tests && .\test_miou_entropy_collection.exe)
Fatal error: exception Unix.Unix_error(Unix.ENOTSOCK, "caml_unix_set_nonblock", "")
File "tests/dune", line 72, characters 12-25:
72 |  (names     test_miou_rng test_miou_entropy_collection)
^^^^^^^^^^^^^
(cd _build/default/tests && .\test_miou_rng.exe)
Fatal error: exception Unix.Unix_error(Unix.ENOTSOCK, "caml_unix_set_nonblock", "")

So, some issue with miou and windows.

The github windows action: issues with gmp.

TL;DR: CI errors are unrelated to the proposed change.

The uint32_t -> size_t change awaits feedback from Edwin. I think otherwise the PR is fine to merge.

@edwintorok
Copy link
Contributor

The uint32_t -> size_t change awaits feedback from Edwin

There is one missing piece there: Int_val also needs to be changed to Long_val, see the latest suggested change on github.

If you make that change I think this PR is ready then, thanks a lot for the improvements.

@hannesm
Copy link
Member Author

hannesm commented Jan 8, 2025

Ok, I like the diff. One remaining question is whether we should immediately deprecate the Mirage_crypto_rng_unix.initialize, Mirage_crypto_rng_lwt.initialize, Mirage_crypto_rng_async.initialize, Mirage_crypto_rng_miou.initialize (also Mirage_crypto_rng_miou_unix.iniitialize) and point to Mirage_crypto_rng_unix.use_default ()?

Maybe @dinosaure and @reynir have ideas about that. It also doesn't block merging this PR, only blocks a subsequent release. It'd ease the opam package cone as well (since in the end we could remove these packages, and only retain mirage-crypto-rng and mirage-crypto-rng-mirage).

@hannesm hannesm merged commit c52a56a into mirage:main Jan 8, 2025
7 of 9 checks passed
@hannesm hannesm deleted the use-getrandom-into branch January 8, 2025 13:01
hannesm added a commit to hannesm/opam-repository that referenced this pull request Jan 30, 2025
CHANGES:

Provide thread safety (Unix.fork and multi-domain safe) RNG generators by using
getrandom/getentropy on UNIX (or /dev/urandom). In your UNIX applications,
please use the "mirage-crypto-rng.unix" dependency and call
"Mirage_crypto_rng_unix.use_default ()" (instead of depending on
mirage-crypto-rng-{lwt,eio,async} and calling
"Mirage_crypto_rng_{eio,lwt,async}.initialize".

* mirage-crypto-rng: handle CPU_RNG failures (mirage/mirage-crypto#255 @hannesm, addresses mirage/mirage-crypto#251 mirage/mirage-crypto#252
  mirage/mirage-crypto#253)
* mirage-crypto-rng.unix: provide two generators: Urandom and Getentropy
  (mirage/mirage-crypto#250 @hannesm @reynir @edwintorok, addresses mirage/mirage-crypto#249)
* mirage-crypto-rng: deprecate the initialize for lwt, async, eio (and
  advertise `Mirage_crypto_rng_unix.use_default ()` (mirage/mirage-crypto#254 @hannesm)
* mirage-crypto-rng-eio: declare the cstruct dependency (mirage/mirage-crypto#247 @hannesm)
* include "windows.h" (all lowercase) (mirage/mirage-crypto#248 @mefyl)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

mirage-crypto-rng.unix thread safety?
4 participants