The general practice seems to be to use GMM as a refinement of k-means. The GMM initializer should therefore run k-means to obtain the initial parameters, and then use GMM itself for fine-tuning.
Yes, this is indeed standard. I had thought about implementing this before, but I'm not sure exactly how it should look. Certainly I'd like GMM to be accessible without the K-Means initialization. I see two approaches:
1. Add a new initialization trait/enum to GMM, similar to what K-Means has.
2. Allow a GMM model to be created via a from_init_clusters constructor (sketched below).
I think it might even be worth implementing both of these, with the second suggestion coming first: it would let us use K-Means to initialize GMM.
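A minimal sketch of what that constructor could look like, using plain Vec-based types so the example stands alone. The GaussianMixture struct, its fields, and the from_init_clusters signature here are illustrative assumptions, not the existing API.

```rust
/// A GMM whose parameters can be seeded directly (hypothetical types).
struct GaussianMixture {
    means: Vec<Vec<f64>>,            // one mean per component
    covariances: Vec<Vec<Vec<f64>>>, // one covariance matrix per component
    weights: Vec<f64>,               // mixture weights, summing to 1
}

impl GaussianMixture {
    /// Build a model from pre-computed clusters (e.g. K-Means output),
    /// bypassing the default initialization. EM fine-tuning then starts
    /// from these parameters.
    fn from_init_clusters(
        means: Vec<Vec<f64>>,
        covariances: Vec<Vec<Vec<f64>>>,
        weights: Vec<f64>,
    ) -> Self {
        GaussianMixture { means, covariances, weights }
    }
}
```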
To give a little more information on the K-Means initialization: the idea (to my knowledge) is to use K-Means to determine the location of the clusters (their means) and the data points that belong to them. We then compute the covariance matrix within each cluster (similarly to what we do now for the whole data set). This gives us a sensible mixture of Gaussians over the whole data set which can then be fine-tuned.
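A minimal sketch of that computation, again with plain Vec types so it stands alone; the function name init_from_assignments and its signature are assumptions for illustration, and a real implementation would work with rusty-machine's matrix types instead.

```rust
/// Given the data points and the cluster index K-Means assigned to each point,
/// compute the mean and (biased) covariance matrix of every cluster, plus the
/// mixture weight (fraction of points in the cluster).
fn init_from_assignments(
    data: &[Vec<f64>],
    assignments: &[usize],
    k: usize,
) -> (Vec<Vec<f64>>, Vec<Vec<Vec<f64>>>, Vec<f64>) {
    let dim = data[0].len();
    let mut means = vec![vec![0.0; dim]; k];
    let mut covs = vec![vec![vec![0.0; dim]; dim]; k];
    let mut counts = vec![0usize; k];

    // Cluster means: average of the points assigned to each cluster.
    // (Empty clusters would need handling in a real implementation.)
    for (x, &c) in data.iter().zip(assignments) {
        counts[c] += 1;
        for d in 0..dim {
            means[c][d] += x[d];
        }
    }
    for c in 0..k {
        for d in 0..dim {
            means[c][d] /= counts[c] as f64;
        }
    }

    // Per-cluster covariances: the same computation currently done over the
    // whole data set, restricted to the points in each cluster.
    for (x, &c) in data.iter().zip(assignments) {
        for i in 0..dim {
            for j in 0..dim {
                covs[c][i][j] += (x[i] - means[c][i]) * (x[j] - means[c][j]);
            }
        }
    }
    for c in 0..k {
        for i in 0..dim {
            for j in 0..dim {
                covs[c][i][j] /= counts[c] as f64;
            }
        }
    }

    // Mixture weights: the fraction of points that landed in each cluster.
    let n = data.len() as f64;
    let weights = counts.iter().map(|&c| c as f64 / n).collect();

    (means, covs, weights)
}
```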
My opinion on the API here is basically the same as in #153: we should create a ClusterInitializer trait with whatever methods we need, and then RandomInitializer and KMeansInitializer structs that do the corresponding calculations.
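One possible shape for that trait, sketched with plain Vec types. Only the trait and struct names come from the suggestion above; the method name init_params and its signature are assumptions, and the real design would use rusty-machine's matrix types and its K-Means model.

```rust
/// Produces the initial parameters (means, covariances, weights) that the
/// GMM's EM loop starts from.
trait ClusterInitializer {
    fn init_params(
        &self,
        data: &[Vec<f64>],
        k: usize,
    ) -> (Vec<Vec<f64>>, Vec<Vec<Vec<f64>>>, Vec<f64>);
}

/// Current behaviour: pick parameters without using cluster structure
/// (e.g. a shared covariance computed over the whole data set).
struct RandomInitializer;

/// Run K-Means first, then derive per-cluster means/covariances/weights
/// from its assignments (as in `init_from_assignments` above).
struct KMeansInitializer {
    max_iters: usize, // K-Means settings would live here
}

// impl ClusterInitializer for RandomInitializer { /* current behaviour */ }
// impl ClusterInitializer for KMeansInitializer { /* K-Means + per-cluster stats */ }
```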