Clustering is widely used in data analysis to group data into subsets such that points within the same cluster are more "alike" (in some sense) than points in different clusters. Let's formalise this.
:::{#def-}
A *clustering* of a set $X$ is a collection of subsets $C_1, \ldots, C_n \subseteq X$ such that $\bigcup_{i} C_i = X$ and $C_i \cap C_j = \emptyset$ for all $i \neq j$. In other words, it is a disjoint covering of $X$. Each $C_i$ is called a *cluster* of $X$.
A *clustering method* is a function $f: X \to \mathbb{N}$ that assigns to each point $x \in X$ the index $f(x)$ of its cluster.
:::
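To make the definition concrete, here is a minimal Python sketch. The helper names `is_clustering` and `clusters_from_labels`, the toy set `X`, and the parity labelling `f` are all illustrative assumptions, not part of the text above. The sketch checks the two conditions of a clustering (covering and pairwise disjointness) and shows how a labelling function $f: X \to \mathbb{N}$ induces clusters by grouping points with equal labels.

```python
from itertools import combinations


def is_clustering(X, clusters):
    """Return True if `clusters` is a disjoint covering of the set X."""
    clusters = [set(c) for c in clusters]
    # Covering: the union of all clusters must be exactly X.
    covers = set().union(*clusters) == set(X)
    # Disjointness: distinct clusters (i != j) share no points.
    disjoint = all(a.isdisjoint(b) for a, b in combinations(clusters, 2))
    return covers and disjoint


def clusters_from_labels(X, f):
    """Group the points of X by their label f(x)."""
    groups = {}
    for x in X:
        groups.setdefault(f(x), set()).add(x)
    return list(groups.values())


# Hypothetical toy data: label each integer by its parity.
X = {0, 1, 2, 3, 4, 5}
f = lambda x: x % 2
clusters = clusters_from_labels(X, f)
```

Grouping by label always yields a clustering in the sense above, since every point receives exactly one label. Conversely, the collection `[{0, 1, 2}, {2, 3, 4, 5}]` covers `X` but is not a clustering, because the point `2` lies in two sets.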
There are several well-known clustering methods, such as [cite]. We will focus on the ToMATo algorithm, which uses tools from topology.