First draft of filterArray #144

Open
wants to merge 2 commits into
base: master

Conversation

treeowl
Collaborator

@treeowl treeowl commented Apr 22, 2018

No description provided.

@treeowl
Collaborator Author

treeowl commented Apr 22, 2018

@andrewthad, here's a draft. As you can see from my comment on fill, I think there's some room for improvement.

let s = ((n + wordSize - 1) `unsafeShiftR` 3)
mary <- newByteArray s
fillByteArray mary 0 s 0
return (MBA mary)
Collaborator Author


Alternatively, we could refrain from filling the array here, and just use a writeBitArray function that takes the value to write. That's probably better, actually.
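A minimal sketch of the write-the-value idea, using a single `Word` from base in place of the PR's mutable byte array. The names `writeBit` and `writeAll` are hypothetical and are not functions from the PR or from primitive. Because every covered bit is written explicitly as either 0 or 1, the starting contents do not matter, which is what would make the up-front fill unnecessary.

```haskell
import Data.Bits (clearBit, setBit)
import Data.List (foldl')
import Data.Word (Word)

-- | Hypothetical helper: store an explicit Bool at bit position i,
-- overwriting whatever was there before.
writeBit :: Word -> Int -> Bool -> Word
writeBit w i True  = setBit w i
writeBit w i False = clearBit w i

-- | Write one flag per bit position, lowest bit first, starting from
-- arbitrary (possibly uninitialised-looking) contents.
writeAll :: Word -> [Bool] -> Word
writeAll start flags = foldl' step start (zip [0 ..] flags)
  where
    step w (i, b) = writeBit w i b
```

When the flags cover every bit of the word, `writeAll 0 flags` and `writeAll maxBound flags` agree, so no initial fill is needed in this model.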

-- consider going word by word through the bit array and
-- using countTrailingZeroes. We could even choose
-- a different strategy for each word depending on its
-- popCount.
Collaborator Author


Also: I wouldn't be surprised if unordered-containers had some code or ideas we could steal here.

Collaborator

@andrewthad andrewthad Apr 23, 2018


I agree that CTZ (or, equivalently, CLZ) is going to do best here. I suspect that changing strategies based on popCount would not help, because I cannot think of a strategy that performs better when most of the elements are preserved. However, when all of the elements corresponding to a word of bits are preserved (meaning the word is equal to maxBound :: Word), we could use the functions that copy a slice of the array instead. I guess we could also do this when popCount is really high rather than only when it's 64 (or 32, depending on platform), but then we'd be doing several copies instead of one. I wonder where the breakpoint is for this being effective.
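The word-by-word approach discussed here can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: plain lists stand in for the array and the bit array, and `setBits`, `pick`, and `filterByMask` are hypothetical names. It walks each mask word with `countTrailingZeros`, clearing the lowest set bit after each step, and takes a bulk path when the word equals `maxBound` (the analogue of copying a whole slice).

```haskell
import Data.Bits (countTrailingZeros, finiteBitSize, (.&.))
import Data.Word (Word)

-- | Number of bits in a machine word (64 or 32 depending on platform).
wordSize :: Int
wordSize = finiteBitSize (0 :: Word)

-- | Indices of the set bits in a word, lowest first.
-- @w .&. (w - 1)@ clears the lowest set bit.
setBits :: Word -> [Int]
setBits 0 = []
setBits w = countTrailingZeros w : setBits (w .&. (w - 1))

-- | Select the elements at the given ascending indices in one pass.
pick :: [Int] -> [a] -> [a]
pick = go 0
  where
    go _ [] _ = []
    go _ _ [] = []
    go k (i : is) (y : ys)
      | i == k    = y : go (k + 1) is ys
      | otherwise = go (k + 1) (i : is) ys

-- | Hypothetical list-based stand-in for filterArray: each mask word
-- covers the next 'wordSize' elements; a set bit means "keep".
filterByMask :: [Word] -> [a] -> [a]
filterByMask [] _ = []
filterByMask _ [] = []
filterByMask (w : ws) xs
  | w == maxBound = chunk ++ filterByMask ws rest               -- bulk path
  | otherwise     = pick (setBits w) chunk ++ filterByMask ws rest
  where
    (chunk, rest) = splitAt wordSize xs
```

Extending the bulk path to words whose popCount is merely high would mean splitting the chunk into several runs; where that stops paying off is the open question raised above and would need benchmarks.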

We don't really need this general at the moment, so let's specialize
to make things easier for GHC.
@andrewthad
Collaborator

I still need to look at this. Sorry, been focused on other things the last few days. I'll try to look this over better (and add some benchmarks) later this week.

else check (i + 1) count ba
| otherwise
= do
mary <- newArray count (die "filterArray" "invalid")
Collaborator


If count equals the size of the original array, we have a much better option available to us: reuse the original array. This prevents the use of runArray, though.

Collaborator Author


This is why we want runArrays and the like.
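The reuse idea from this thread can be illustrated with a small two-pass sketch. It assumes a list standing in for the array, and `filterReusing` is a hypothetical name that does not appear in the PR. The first pass counts the survivors; if nothing was dropped, the original value is returned as-is rather than building a copy.

```haskell
-- | Hypothetical two-pass filter: count first, then either reuse the
-- input unchanged or build the filtered result.
filterReusing :: (a -> Bool) -> [a] -> [a]
filterReusing p xs
  | count == length xs = xs                              -- nothing dropped: reuse
  | otherwise          = [x | (x, True) <- zip xs keep]  -- keep flagged elements
  where
    keep  = map p xs                 -- one predicate result per element
    count = length (filter id keep)  -- number of survivors
```

With real arrays the first branch would return the original array without allocating, which is the case that does not fit a single run-and-freeze helper and motivates the more flexible one mentioned above.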
