Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in merge function #14

Open
feldroop opened this issue Jan 9, 2021 · 1 comment
Open

Bug in merge function #14

feldroop opened this issue Jan 9, 2021 · 1 comment

Comments

@feldroop
Copy link

feldroop commented Jan 9, 2021

I think in the function hll::HyperLogLog::merge it should not be the bitwise or between the two registers:

if (M_[r] < other.M_[r]) {
    M_[r] |= other.M_[r];
}

Instead it should be a simple assignment:

if (M_[r] < other.M_[r]) {
    M_[r] = other.M_[r];
}

In my application I use the HyperLogLog sketches mainly to estimate merged unions. I noticed that sometimes the estimate is too high and the relative error skyrockets. I found that when I exclusively use hll::HyperLogLog::add the estimate is good.

When I changed the above snippet, the value from the merging routine was exactly the same as the one from just adding. With my understanding of the HyperLogLog algorithm I also think this is theoretically the correct thing to do.

I don't know about the HyperLogLogHIP, because I have not yet read the article about that estimator. But it also doesn't give good estimates when I do a lot of merging.

@viktorika
Copy link

I strongly agree,it must be fix

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants