Initialize GMM parameters with k-means #150

andrewcsmith · 2016-10-08T06:04:18Z

The general practice seems to be to use GMM as a refinement of k-means. The GMM initializer should therefore use k-means for the initial parameters, then GMM for fine-tuning.

AtheMathmo · 2016-10-08T07:20:02Z

Yes, this is indeed standard. I had thought about implementing this before, but I'm not sure exactly how it should look. Certainly I'd like GMM to remain accessible without the K-Means initialization. I see two approaches:

1. Add some new initialization trait/enum to GMM, similar to K-Means.
2. Allow a GMM model to be created from_init_clusters.

I think that it might even be worth implementing both of these - with the second suggestion coming first. The second would let us use K-Means to initialize GMM.

To give a little more information on the K-Means initialization: the idea (to my knowledge) is to use K-Means to determine the locations of the clusters (their means) and the data that belongs to each of them. We then compute the covariance matrix within each cluster (similarly to what we do now for the whole data set). This lets us assume a sensible mixture of Gaussians over the whole data set, which can then be fine-tuned.
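To make the description above concrete, here is a minimal sketch in plain Rust of turning hard cluster assignments into initial mixture parameters. It is not rusty-machine code: the `Component` struct and the `components_from_clusters` function are hypothetical names, the data is a slice of `Vec<f64>` rows rather than a matrix type, and the labels are assumed to come from a K-Means run that has already finished. Using the cluster's share of the points as its mixture weight is also an assumption of this sketch.

```rust
/// Initial parameters of one mixture component.
/// Hypothetical type for illustration, not part of any existing API.
#[derive(Debug, Clone)]
pub struct Component {
    pub weight: f64,
    pub mean: Vec<f64>,
    pub cov: Vec<Vec<f64>>,
}

/// Build one component per cluster from hard assignments.
/// `labels[i]` is the cluster index (0..k) of `data[i]`.
pub fn components_from_clusters(
    data: &[Vec<f64>],
    labels: &[usize],
    k: usize,
) -> Vec<Component> {
    assert_eq!(data.len(), labels.len());
    assert!(!data.is_empty());
    let dim = data[0].len();
    let n = data.len() as f64;
    let mut comps = Vec::with_capacity(k);

    for c in 0..k {
        // Collect the rows assigned to cluster `c`.
        let members: Vec<&Vec<f64>> = data
            .iter()
            .zip(labels.iter())
            .filter(|pair| *pair.1 == c)
            .map(|pair| pair.0)
            .collect();
        let count = members.len() as f64;
        assert!(count > 0.0, "cluster {} is empty", c);

        // Cluster mean.
        let mut mean = vec![0.0; dim];
        for x in &members {
            for j in 0..dim {
                mean[j] += x[j] / count;
            }
        }

        // Within-cluster covariance (divides by the cluster size).
        // A cluster with too few points gives a singular matrix, so a
        // real implementation would likely add some regularization.
        let mut cov = vec![vec![0.0; dim]; dim];
        for x in &members {
            for i in 0..dim {
                for j in 0..dim {
                    cov[i][j] += (x[i] - mean[i]) * (x[j] - mean[j]) / count;
                }
            }
        }

        // Mixture weight: fraction of all points in this cluster.
        comps.push(Component { weight: count / n, mean, cov });
    }
    comps
}
```

The returned components would then be the starting point for the usual EM iterations, in place of the single whole-data-set covariance.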

References

Simple Methods for Initializing the EM Algorithm of GMMs
Small (but enlightening) SO question thread

andrewcsmith · 2016-10-15T01:25:14Z

My opinion for the API here is basically the same as it is in #153. We should create a ClusterInitializer trait that has whatever methods we need, and then RandomInitializer and KMeansInitializer structs to do the proper calculations accordingly.
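As a rough sketch of what that could look like: the three type names below are the ones proposed in this comment, but the `init_means` method, its signature, the struct fields, and the `nearest` helper are all assumptions made up for illustration. The sketch uses plain `Vec<f64>` rows and a tiny built-in pseudo-random generator so that it compiles without any crates; it is not how #153 or #155 implement it.

```rust
/// Proposed trait name from this thread; the method is hypothetical.
pub trait ClusterInitializer {
    /// Return `k` initial component means for `data` (one row per sample).
    fn init_means(&self, data: &[Vec<f64>], k: usize) -> Vec<Vec<f64>>;
}

/// Picks `k` distinct rows of the data as initial means.
pub struct RandomInitializer {
    pub seed: u64,
}

impl ClusterInitializer for RandomInitializer {
    fn init_means(&self, data: &[Vec<f64>], k: usize) -> Vec<Vec<f64>> {
        assert!(k <= data.len());
        let mut state = self.seed;
        let mut idx: Vec<usize> = (0..data.len()).collect();
        // Partial Fisher-Yates shuffle driven by a small LCG, so the
        // first `k` entries of `idx` are distinct row indices.
        for i in 0..k {
            state = state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            let j = i + ((state >> 33) as usize) % (data.len() - i);
            idx.swap(i, j);
        }
        idx[..k].iter().map(|&i| data[i].clone()).collect()
    }
}

/// Refines randomly chosen means with a fixed number of Lloyd iterations.
pub struct KMeansInitializer {
    pub iters: usize,
    pub seed: u64,
}

impl ClusterInitializer for KMeansInitializer {
    fn init_means(&self, data: &[Vec<f64>], k: usize) -> Vec<Vec<f64>> {
        let mut means = RandomInitializer { seed: self.seed }.init_means(data, k);
        let dim = data[0].len();
        for _ in 0..self.iters {
            // Assignment step: accumulate each row into its nearest mean.
            let mut sums = vec![vec![0.0; dim]; k];
            let mut counts = vec![0usize; k];
            for x in data {
                let c = nearest(x, &means);
                counts[c] += 1;
                for j in 0..dim {
                    sums[c][j] += x[j];
                }
            }
            // Update step: empty clusters keep their previous mean.
            for c in 0..k {
                if counts[c] > 0 {
                    for j in 0..dim {
                        means[c][j] = sums[c][j] / counts[c] as f64;
                    }
                }
            }
        }
        means
    }
}

/// Index of the mean closest to `x` in squared Euclidean distance.
fn nearest(x: &[f64], means: &[Vec<f64>]) -> usize {
    let mut best = 0;
    let mut best_d = f64::INFINITY;
    for (c, m) in means.iter().enumerate() {
        let d: f64 = x.iter().zip(m).map(|(a, b)| (a - b) * (a - b)).sum();
        if d < best_d {
            best = c;
            best_d = d;
        }
    }
    best
}
```

With this shape, a GMM could accept any `ClusterInitializer` and stay usable without K-Means by defaulting to `RandomInitializer`. Whether the trait should also return covariances and mixture weights is left open here.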

andrewcsmith · 2016-10-18T06:51:07Z

Same here—this issue is also fixed by #155.
