MachineLearningRelatedCode

This repository collects the machine learning code I have written that is worth keeping.

K-Means Clustering Analysis

The K_mean_and_related_evaluation.ipynb notebook implements the K-Means clustering algorithm, a popular method for partitioning a dataset into K distinct, non-overlapping clusters. In addition to the core algorithm, it includes utilities for evaluating clustering quality with two metrics: the Simplified Silhouette Coefficient and the Within-Cluster Sum of Squares (WSS).

Features

K-Means Clustering: Partition your data into a specified number of clusters.
Simplified Silhouette Coefficient: Evaluate how well each data point fits its own cluster compared with the nearest other cluster.
Within-Cluster Sum of Squares (WSS): Assess the compactness of the clusters by quantifying how close the data points are to the centroid of their own cluster.
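
As a rough illustration of the partitioning step listed above, here is a minimal sketch of K-Means (Lloyd's algorithm) using only the Python standard library. It is not the notebook's code: the function names `k_means` and `squared_distance`, the `max_iter` and `seed` parameters, and the random-sample initialisation are all assumptions made for this example.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def k_means(points, k, max_iter=100, seed=0):
    """Illustrative Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumed initialisation: pick k data points at random as starting centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we treat this as converged
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, labels

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = k_means(points, k=2)
print(labels)
```

On this toy data the two well-separated groups should end up in different clusters; which group receives label 0 depends on the random initialisation.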

Evaluation Metrics

Simplified Silhouette Coefficient

For each sample, this metric computes a, the mean distance between the sample and the other points in the same cluster, and compares it with b, the mean distance between the sample and the points of the nearest cluster it is not assigned to. The per-sample coefficient, (b - a) / max(a, b), takes values between -1 and 1, where a high value indicates that the sample is well matched to its own cluster and poorly matched to neighboring clusters. Because the simplified silhouette used here sums the coefficient over every data point, the total takes values between -N and N, where N is the number of data points.
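
The description above can be sketched in a few lines of standard-library Python. This is an illustration of the text, not the notebook's implementation: the name `summed_silhouette` is hypothetical, and treating singleton clusters as contributing 0 is an assumed convention. Note also that the "simplified silhouette" is commonly defined with distances to centroids rather than to individual points, so the notebook may compute it differently.

```python
import math

def summed_silhouette(points, labels):
    """Sum over all points of s = (b - a) / max(a, b), as described above."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1 or len(clusters) < 2:
            continue  # assumed convention: such points contribute 0
        # a: mean distance to the other points in the same cluster
        # (the distance from p to itself is 0, hence the len(own) - 1)
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the points of any other cluster
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0:  # guard against all points being identical
            total += (b - a) / denom
    return total

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(summed_silhouette(points, [0, 0, 0, 1, 1, 1]))  # close to the upper bound of 6
print(summed_silhouette(points, [0, 1, 0, 1, 0, 1]))  # a poor labeling scores lower
```

Dividing the result by the number of points gives a value on the familiar -1 to 1 scale, which is easier to compare across datasets of different sizes.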

Within-Cluster Sum of Squares (WSS)

WSS measures the compactness of the clustering: it is the sum of the squared distances between each member of a cluster and that cluster's centroid, added up over all clusters. For a given K, we want it to be as small as possible.
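
As a small worked example of the definition above, the following sketch computes WSS for a hand-labeled toy dataset. The function name `wss` and its argument layout (points, labels, centroids indexed by label) are assumptions made for illustration and may not match the notebook.

```python
def wss(points, labels, centroids):
    """Within-Cluster Sum of Squares: squared distance of each point to its own centroid."""
    return sum(
        sum((x - c) ** 2 for x, c in zip(p, centroids[lab]))
        for p, lab in zip(points, labels)
    )

points = [(0, 0), (2, 0), (10, 0), (12, 0)]
labels = [0, 0, 1, 1]
centroids = [(1, 0), (11, 0)]  # the mean of each cluster
print(wss(points, labels, centroids))  # each point is 1 unit from its centroid: 1 + 1 + 1 + 1 = 4
```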

heib6xinyu/MachineLearningRelatedCode
