hitflame/agglomerative-hierarchical-clustering

 
 

Agglomerative Hierarchical Clustering

Implements the Agglomerative Hierarchical Clustering algorithm.

Usage

To run the clustering program, you need to supply the following parameters on the command line:

  • Input file that contains the items to be clustered.

  • Number of disjoint clusters that we wish to extract.

  • Linkage criterion to use when computing the distance between clusters:

    • s - Single linkage (default)
    • c - Complete linkage
    • a - Average linkage
    • t - Centroid linkage

For instance, the following is an example run:

$ ./agglomerate example.txt 3 s

In this example, we run hierarchical agglomerative clustering on the items in the input file example.txt, asking the program to generate 3 disjoint clusters using single linkage.
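To make the parameter order concrete, the following is a sketch of how such a command line might be validated. It is not the program's actual argument handling: `options_t` and `parse_args` are hypothetical, and the sketch assumes that the linkage argument may be omitted, since single linkage is described as the default.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the three command-line parameters. */
typedef struct {
    const char *input_path;  /* file containing the items to cluster */
    long num_clusters;       /* number of disjoint clusters to extract */
    char linkage;            /* 's', 'c', 'a' or 't' */
} options_t;

/* Fills *opt from argv; returns 0 on success, -1 on invalid usage. */
int parse_args(int argc, char *argv[], options_t *opt)
{
    assert(opt != NULL);

    if (argc < 3 || argc > 4)
        return -1;

    /* The cluster count must be a positive decimal integer. */
    char *end = NULL;
    long k = strtol(argv[2], &end, 10);
    if (end == argv[2] || *end != '\0' || k < 1)
        return -1;

    /* Assumption: an omitted linkage argument means single linkage. */
    char linkage = 's';
    if (argc == 4) {
        if (strlen(argv[3]) != 1 || strchr("scat", argv[3][0]) == NULL)
            return -1;
        linkage = argv[3][0];
    }

    opt->input_path = argv[1];
    opt->num_clusters = k;
    opt->linkage = linkage;
    return 0;
}
```

With the example run above, this sketch would yield `input_path` set to `example.txt`, `num_clusters` set to 3 and `linkage` set to `'s'`.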

The input file

The input file lists the items to be clustered in the following format:

<number of items to cluster>
<label string>| <x-axis value> <y-axis value>
...

For instance, the following is a valid input. It contains 12 data points, where each data point is referred to by its label and has coordinates in the two-dimensional Euclidean plane.

12
A| 1.0 1.0
B| 2.0 1.0
C| 2.0 2.0
D| 4.0 5.0
E| 5.0 4.0
F| 5.0 5.0
G| 5.0 6.0
H| 6.0 5.0
I| 9.0 9.0
J| 10.0 9.0
K| 10.0 10.0
L| 11.0 9.0
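A file in this format can be read with a short `fscanf` loop. The sketch below is one possible reader, not the program's own parser; `item_t`, `read_items` and the 15-character label limit are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_LABEL 16  /* assumed limit: 15 characters plus the terminator */

/* Hypothetical record for one labelled data point. */
typedef struct {
    char label[MAX_LABEL];
    double x, y;
} item_t;

/* Reads the item count followed by that many "<label>| <x> <y>" records.
 * On success stores a malloc'd array in *out and returns the count;
 * returns -1 on malformed input or allocation failure. */
long read_items(FILE *fp, item_t **out)
{
    assert(fp != NULL && out != NULL);

    long n = 0;
    if (fscanf(fp, "%ld", &n) != 1 || n < 1)
        return -1;

    item_t *items = calloc((size_t)n, sizeof *items);
    if (items == NULL)
        return -1;

    for (long i = 0; i < n; i++) {
        /* " %15[^|]|" skips whitespace, reads the label up to '|',
         * then consumes the '|' itself before the two coordinates. */
        if (fscanf(fp, " %15[^|]| %lf %lf",
                   items[i].label, &items[i].x, &items[i].y) != 3) {
            free(items);
            return -1;
        }
    }

    *out = items;
    return n;
}
```

The caller owns the returned array and should release it with `free`. A label longer than the assumed limit makes the read fail rather than being silently truncated.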

After running the clustering algorithm, we get the following hierarchy:

[Figure: Example agglomerative hierarchical clustering]

The cluster hierarchy may be represented by the binary tree:

[Figure: Example clustering as a binary tree]

For further details, please visit my [homepage](http://yaikhom.com/2014/08/21/agglomerative-hierarchical-clustering.html).
