Skip to content
This repository has been archived by the owner on Jul 7, 2020. It is now read-only.

CountMinSketch overflow check causes a lot of allocations #141

Open
alexandrnikitin opened this issue Aug 22, 2017 · 4 comments
Open

CountMinSketch overflow check causes a lot of allocations #141

alexandrnikitin opened this issue Aug 22, 2017 · 4 comments

Comments

@alexandrnikitin
Copy link

It starts from String.valueOf(item) here

checkSizeAfterAdd(String.valueOf(item), count);
String concatenation here
checkSizeAfterOperation(previousSize, "add(" + item + "," + count + ")", size);
and here
checkSizeAfterOperation(previousSize, "merge(" + estimator + ")", size);

Could be related to #140

@abramsm
Copy link
Contributor

abramsm commented Aug 23, 2017

I'm not sure. Stream-lib's CountMinSketch has fallen behind what @tdunning has in his repo. Hopefully we can work on pulling in a more recent version. Otherwise we may need to deprecate this class.

@tdunning
Copy link
Contributor

I don't think I have a CMS implementation.

I have t-digest and FloatHistogram. Just had a release last week and some big stuff coming soon beyond that.

@abramsm
Copy link
Contributor

abramsm commented Aug 24, 2017

Sorry @tdunning. My memory seems to be failing me... Great news on new releases of T-digest and FloatHistogram.

@alexandrnikitin I'll look into this and #140 as soon as I can.

@alexandrnikitin
Copy link
Author

@abramsm Thank you. I'm ready to help with it. Basically I would just remove the overflow check. I can also introduce microbenchmarks to prevent regression. I just need an answer for #140 whether there's something I miss or not? I see it only as an optimization for fast comparison.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants