Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve: normal random skew towards means #1672

Merged
merged 7 commits into from
Nov 13, 2023

Conversation

luanluciano93
Copy link
Contributor

Description

The current code for normal distribution is skewed and tends to generate much more times the central element than the rest of the distribution, more than is expected of a normal distribution.

Comparison of results generating 2^27 random values between 0 and 20

Normal random:
0 * (800928)
1 **** (2126010)
2 ***** (2982546)
3 ******* (4027064)
4 ********* (5222481)
5 ************ (6495756)
6 ************** (7770920)
7 ***************** (8940450)
8 ****************** (9872740)
9 ******************* (10476202)
10 ******************************** (16807803)
11 ******************* (10474301)
12 ****************** (9872670)
13 ***************** (8924956)
14 ************** (7778402)
15 ************ (6496375)
16 ********* (5210797)
17 ******* (4026227)
18 ***** (2983148)
19 **** (2126458)
20 * (801494)

Fixed normal random:
0 * (838123)
1 **** (2227927)
2 ***** (3127189)
3 ******** (4214630)
4 ********** (5465489)
5 ************ (6800608)
6 *************** (8147614)
7 ***************** (9362439)
8 ******************* (10342027)
9 ******************** (10982638)
10 ********************* (11204545)
11 ******************** (10982142)
12 ******************* (10342200)
13 ***************** (9359764)
14 *************** (8140484)
15 ************ (6804640)
16 ********** (5465333)
17 ******** (4215876)
18 ***** (3128459)
19 **** (2226383)
20 * (839218)

This happens because of the following path:
image

This causes values below or above two standard deviations (stdev is 0.25) from the mean (0.5) to simply be replaced by the central element. In normal distributions, about 5% of the results will be farther than two standard deviations from the mean, and those would all generate the central element instead.

image

The proper solution is to just discard the result and generate a new value. The while loop will only run once in 95% of the calls. There is an infinitely small chance of an infinite loop.

The new code is also roughly 30% faster (on a Ryzen 7 4800H chip) due to it performing less branching and having only one return.

credits: @ranisalt

@luan luan changed the title improve: fix normal random skew towards mean improve: normal random skew towards mean Nov 13, 2023
@dudantas dudantas merged commit 61464f9 into opentibiabr:main Nov 13, 2023
25 checks passed
@dudantas dudantas changed the title improve: normal random skew towards mean improve: normal random skew towards means Nov 13, 2023
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

0.0% 0.0% Coverage
0.0% 0.0% Duplication

@luanluciano93 luanluciano93 deleted the fix-normal-random branch November 15, 2023 03:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants