-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement kmeans with kmeans++ #24
Conversation
a935134
to
5dba760
Compare
8683b22
to
8c2cfc8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add tests and remove CI/CD stuff (that is subject to a separate PR). Also, move the k-means algorithm into a separate file.
@automainint What do you think about moving files like |
Update read_mutual_scores
5cd688c
to
b21e138
Compare
I usually avoid adding abstractions without a specific reason. E.g. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each function that returns a score should also return the cluster value for that score. So we have to change function signatures, and it's probably better to apply (or read cached) clusters in functions fetch_score
and fetch_all_scores
.
Currently read_mutual_scores
doesn't apply clustering correctly. All scores from a user node should be clustered, not all scores to a user node.
I will change function signatures and write tests in the new branch feature/score_clustering.
Also we will have to implement caching of clusters as "group bounds", as mentioned in #10 (comment) by @ichorid.
No description provided.