-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Splitmix64 #89
Add Splitmix64 #89
Conversation
@NAThompson I assume you will have some interest in this since you have asked for other PRNGs before. |
@mborland : std::random_device is a hardware TRNG, whereas splitmix64 is a PRNG right? Shouldn't the comparison be to (say) PCG32? |
You are correct. Here is a more accurate comparison of 64-bit PRNGs:
|
@mborland : For some reason these numbers seem unreasonably slow. I expect Edit: Sorry for the noise; I see that you call it 10x which is about right. |
One unit test which might be useful: Generally for PRNGs, we have requirements like this one: NotesThe Nth consecutive invocation of a default-constructed engine is required to produce the following value:
Is there an analog for splitmix64? Of course, you could introduce your own definition and simply document it. |
On Nov 27, 2022, at 23:00, Nick ***@***.***> wrote:
@NAThompson commented on this pull request.
In include/boost/random/splitmix64.hpp <#89 (comment)>:
> @@ -0,0 +1,98 @@
+/*
+ * Copyright Sebastiano Vigna 2015.
Did Sebastiano Vigna write this code?
Alert:
1) you cannot safely use up more than a cubic root of the period of any generator. This generator's period is apparently less than 2^64 but its not stated how much exatly.
A cubic root of that is ~2^21. Thus you must not use this generator in any program that will run longer than a millisecond.
In particular, you would not use it to create large numbers of streams on a multiprocessor system.
2) This generator is not from Vigna, the authors are not known to be experts on rngs, and the principle of construction
is obscure. It is neither an LCG, nor MRG, nor LFSR, nor counter based crypto. What is it exactly?
3) For industrial strength RNG on boost, use MIXMAX. It is parallelizable and splittable:
boost::random::mixmax gen{seed1,seed2,seed3,seed4};
It is enough that at least one of the 32bit seeds is different by one bit. This gives you 2^128 guaranteed non-colliding streams that pass all tests.
Cheers,
Kostas
=========================================
Institute of Nuclear and Particle Physics
NCSR Demokritos
http://inspirehep.net/author/profile/K.G.Savvidy.1
https://mixmax.hepforge.org
|
Kostas, The primary use for this is to seed the xoshiro/xoroshiro family of generators as recommended by the authors here. The original C code is copyrighted by Vigna and David Blackman for the generators. I think it would be disingenuous to say that are not experts given the large amount of literature they have published on the subject. A large number of languages use these as their standard PRNG so I think it would be useful if we offered them as well. |
On 11/28/22 14:56, Matt Borland wrote:
@mborland commented on this pull request.
+ template <typename FIter>
+ void generate(FIter first, FIter last)
It is not a standard member in the STL, but the Boost versions have this functionality so for the sake of consistency I added it.
The reason it exists is to make every generator usable as a SeedSeq,
for compatibility with ancient (pre-C++11) Boost.Random. It's not
very relevant anymore.
|
On 11/27/22 20:53, Nick wrote:
Could you test that your implementation is compatible with various distributions? i.e., pipe it through std::normal_distribution and `std::uniform_distribution` and `std::gamma_distribution`?
It's very easy for an integer RNG to not meet the conceptual requirements to map into a continuous distribution.
Please do use the generic test harness for random engines. It has some
obsolete functionality, like the iterator seed routines and the generate
function that should probably not be supported for new engines and
should be #ifdef'ed out in test_generator.ipp, but it will catch
things like the missing seed_seq constructor and the missing
seed function.
Also, the implementation of generate is wrong. generate is supposed
fill a range with 32-bit values, regardless of the both the engine's
value_type and the array's element type.
|
+1, it's Java's default PRNG which is good enough reason to offer it as an option at least. It's drawbacks should be well documented though. |
The generic test harness is used and now passes on CI. I removed two of the obsolete constructor tests you mentioned. I ended up leaving the generate function in otherwise I would have to remove all of the concepts tests. They seem valuable to retain. |
I separated this from the implementation of the other generators for ease of review. This is over 300 lines, and to include it as an implementation detail of the generators seemed unwieldy to me. |
OK, I have a better idea now of what this thing is. Steele et al wrote a paper in 2014 suggesting to split random number streams by means of "pedigree". The pedigree idea is sound, but the realization was and is flawed. Same mistake was made by the authors of PCG, namely that you might create a new generator by changing the additive constant on the fly. Java even now ships with this flawed implementation of splitting, as far as i can tell. Since the state transition function is just uint64 state += c, so the period is indeed 2^64 for odd c. Turns out not all constants are created equal and some lead to detectable flaws. Vigna wrote code based on a paper by Steele, and threw away the flawed idea of producing split streams by choosing a specific additive constant. Thats the "0x9E3779B97F4A7C15". He puts his copyright on -- justly so because he did not copy any code -- then you copy-paste and wrap it in boost, and put your own copyright onto it. Is this even kosher? The file has a Copyright at the top, even if lower down it has a Creative commons license. The license I suppose gives you the legal right to do so, but ....
Much of Steele's subsequent work on random numbers is patented and assigned to Oracle, btw. I am not thrilled about that either. But is the algorithm really by Steele et al? It is state += c, plus an output=hash(state), where the hash-like function was invented by yet another person, Austin Appleby for MurmurHash3, and David Stafford, who modified the constants in MurmurHash3, in a blog post, based on dubious criteria. On the contrary, Austin Appleby appears to be alive and well, developes his own original software which reads:
|
@kotika There is already a link to the paper on ACM from Steele et. al. in the comments at the top of the file. @jzmaddock, @swatanabe, and @NAThompson I think this is good for review. This passes CI using the existing generic test harness and concept tests with tailoring for C++11 as recommended by Steve. |
@swatanabe : @mborland is now in the employ of boost. Would you mind giving him maintainer access in order that he could merge a few of his MRs? |
Actually @swatanabe agreed to this off list, sorry been busy with other things, just give Matt write access now. |
@mborland : LGTM. Merge it. |
Adds the Splitmix64 generator from Sebastiano Vigna as described here. Will be used for the future implementation (generates the initial seed) of the xoroshiro family of PRNGs. The performance comparison of the STL and current Boost.Random random to Splitmix64 is shown below: