First draft of filterArray #144
@andrewthad, here's a draft. As you can see from my comment on
Data/Primitive/Internal/Bit.hs
Outdated
let s = ((n + wordSize - 1) `unsafeShiftR` 3)
mary <- newByteArray s
fillByteArray mary 0 s 0
return (MBA mary)
Alternatively, we could refrain from filling the array here, and just use a writeBitArray
function that takes the value to write. That's probably better, actually.
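A minimal sketch of that alternative, assuming a byte-granular layout: the `MutableBitArray` newtype mirrors the `MBA` constructor in the diff, but `newBitArray`, the `writeBitArray` signature, and the byte-level addressing are assumptions made for illustration, not the code in this PR.

```haskell
import Control.Monad.Primitive (PrimMonad, PrimState)
import Data.Bits (clearBit, setBit, unsafeShiftR, (.&.))
import Data.Primitive.ByteArray
  (MutableByteArray, newByteArray, readByteArray, writeByteArray)
import Data.Word (Word8)

-- Hypothetical wrapper, mirroring the MBA constructor seen in the diff.
newtype MutableBitArray s = MBA (MutableByteArray s)

-- Allocate room for n bits WITHOUT zeroing them (assumed byte granularity).
newBitArray :: PrimMonad m => Int -> m (MutableBitArray (PrimState m))
newBitArray n = MBA <$> newByteArray ((n + 7) `unsafeShiftR` 3)

-- Write bit i as either set or cleared. Because the caller passes the value,
-- the array never has to be pre-filled, provided every bit is written once.
writeBitArray
  :: PrimMonad m => MutableBitArray (PrimState m) -> Int -> Bool -> m ()
writeBitArray (MBA mary) i b = do
  let byteIx = i `unsafeShiftR` 3 -- which byte holds bit i
      bitIx = i .&. 7             -- position of bit i inside that byte
  w <- readByteArray mary byteIx
  let w' :: Word8
      w' = if b then setBit w bitIx else clearBit w bitIx
  writeByteArray mary byteIx w'
```

One caveat with this approach: any padding bits past the last written index stay uninitialised, so code that later inspects whole words (for example with popCount) would have to mask them off.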
-- consider going word by word through the bit array and
-- using countTrailingZeros. We could even choose
-- a different strategy for each word depending on its
-- popCount.
Also: I wouldn't be surprised if unordered-containers had some code or ideas we could steal here.
I agree that CTZ (or equivalently CLZ) is going to do best here. I suspect that changing strategies with popCount would not be helpful, because I cannot think of a strategy that performs better when most of the elements are preserved. However, when all of the elements corresponding to a word of bits are preserved (meaning the word is equal to maxBound :: Word), we could use the functions that copy a slice of the array instead. I guess we could also do this when popCount is really high rather than only when it's 64 (or 32, depending on platform), but then we'd be doing several copies instead of one. I wonder where the breakpoint is for this being effective.
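To make the word-by-word idea concrete, here is a small sketch using only `Data.Bits` from base. The names `setBitOffsets`, `WordPlan`, and `planWord` are hypothetical and do not appear in the PR; the sketch only shows how CTZ can enumerate the kept positions in a word and where the `maxBound` fast path would branch off.

```haskell
import Data.Bits (countTrailingZeros, (.&.))

-- Offsets of the set bits in a word, lowest first. Each step finds the
-- lowest set bit with countTrailingZeros, then clears it with w .&. (w - 1).
setBitOffsets :: Word -> [Int]
setBitOffsets 0 = []
setBitOffsets w = countTrailingZeros w : setBitOffsets (w .&. (w - 1))

-- Hypothetical per-word decision: copy the whole slice when every bit is
-- set, otherwise pick out the individual elements that are kept.
data WordPlan
  = CopyWholeWord
  | PickBits [Int]
  deriving (Eq, Show)

planWord :: Word -> WordPlan
planWord w
  | w == maxBound = CopyWholeWord
  | otherwise = PickBits (setBitOffsets w)
```

For example, `setBitOffsets 41` yields `[0,3,5]`, since 41 is 101001 in binary. A threshold on popCount, as floated above, would add a third case to `planWord`; where that threshold should sit is exactly the open benchmarking question.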
We don't really need this general at the moment, so let's specialize to make things easier for GHC.
I still need to look at this. Sorry, been focused on other things the last few days. I'll try to look this over better (and add some benchmarks) later this week.
else check (i + 1) count ba
| otherwise
= do
mary <- newArray count (die "filterArray" "invalid")
If count equals the size of the original array, we have a much better option available to us: reuse the original array. This prevents us from using runArray, though.
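A rough sketch of the reuse idea, under simplifying assumptions: `filterArraySketch` is a hypothetical name, and unlike the draft (which records predicate results in a bit array) it simply calls the predicate twice to stay short. It counts the kept elements outside the ST block and returns the input array untouched when nothing is dropped.

```haskell
import Control.Monad.ST (runST)
import Data.Primitive.Array
  (Array, indexArray, newArray, sizeofArray, unsafeFreezeArray, writeArray)

-- Hypothetical two-pass filter: count first, then either share the input
-- array (nothing dropped) or allocate exactly `count` slots and copy.
filterArraySketch :: (a -> Bool) -> Array a -> Array a
filterArraySketch p arr
  | count == n = arr -- every element is kept: reuse the original array
  | otherwise = runST $ do
      mary <- newArray count (error "filterArraySketch: uninitialised element")
      let go i j
            | i >= n = pure ()
            | p x = writeArray mary j x >> go (i + 1) (j + 1)
            | otherwise = go (i + 1) j
            where
              x = indexArray arr i
      go 0 0
      unsafeFreezeArray mary
  where
    n = sizeofArray arr
    -- Assumption: the predicate is cheap enough to evaluate twice.
    count = length (filter id [p (indexArray arr i) | i <- [0 .. n - 1]])
```

Sharing the input is safe here because `Array` is immutable; the result just aliases the argument in the all-kept case.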
This is why we want runArrays and the like.