Refactor binning #136
Comments
Example of heap-based binning:
Agree completely, we need to prioritize these changes so that we can fix sga() functionality ASAP.
I've constructed an example where your greedy algorithm won't work. ;) Given where the optimal binning is [30, 30].
Nice catch! However, the existing algorithm would fail here too. Another reason to rethink it.
A nice paper on categorical binning: http://www.aaai.org/ocs/index.php/IJCAI/IJCAI-09/paper/viewFile/625/705
- Do we really need classes for binning? Do we ever use inheritance from the base `Binning` class? I have a feeling that a handful of functions would be enough for all our purposes.
- Labelling: instead of passing `format_str` to `Binning.labels`, the user should pass a function that returns the appropriate label for a given category, something like `labels = bins.labels(x, lambda i: 'label ' + str(i))`. This would a) get rid of the not very flexible labelling code and b) let the user choose any labelling scheme.
- Better greedy sorting (for instance, min-heap based).
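
To make the last two points concrete, here is a minimal sketch of what a function-based version might look like. It assumes that bins are balanced by total category count, and the names `make_labels` and `greedy_bins` are hypothetical, not part of the current API. The heap-based routine is the usual "largest first, into the lightest bin" greedy heuristic, so it is not guaranteed to find the optimal binning.

```python
import heapq


def make_labels(bin_ids, label_fn):
    """Build one label per bin id using a caller-supplied function (hypothetical helper)."""
    return [label_fn(i) for i in bin_ids]


def greedy_bins(counts, n_bins):
    """Greedily split categories into n_bins, keeping bin totals balanced.

    counts maps category -> count. Categories are visited from largest to
    smallest count and each one goes into the currently lightest bin.
    This is a heuristic; the result may not be optimal.
    """
    # Heap entries are (bin_total, bin_index); the index breaks ties.
    heap = [(0, i) for i in range(n_bins)]
    heapq.heapify(heap)
    bins = [[] for _ in range(n_bins)]

    by_count_desc = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    for category, count in by_count_desc:
        total, i = heapq.heappop(heap)  # lightest bin so far
        bins[i].append(category)
        heapq.heappush(heap, (total + count, i))

    # Recover the final total of each bin from the heap entries.
    totals = [0] * n_bins
    for total, i in heap:
        totals[i] = total
    return bins, totals


# Illustrative data only.
bins, totals = greedy_bins({'a': 5, 'b': 4, 'c': 3, 'd': 2}, n_bins=2)
labels = make_labels(range(len(bins)), lambda i: 'label ' + str(i))
```

Passing the labelling function in as an argument keeps the binning code free of formatting logic, which is the flexibility the second point asks for.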