TightClusterer
The TightClusterer class uses tight clustering to eliminate points that do not fit well in any cluster.
Value parameters
- k0: the number of clusters to make
- kmin: the minimum number of clusters to make
- s: the random number stream (to vary the clusters made)
- x: the vectors/points to be clustered, stored as rows of a matrix
Attributes
- Supertypes: class Object, trait Matchable, class Any
Members list
Value members
Concrete methods
Given a set of points/vectors, put them in clusters, returning the cluster assignment vector. A basic goal is to minimize the sum of the distances between points within each cluster.
Compute the mean comembership matrix by averaging results from several subsamples.
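The averaging step can be illustrated with a minimal sketch. It assumes that each subsample run has already produced a cluster assignment for every point, and that two points are "comembers" in a run when they share a cluster label. The name `meanComembership` and its signature are hypothetical and are not taken from the class itself.

```scala
// Hypothetical sketch: average 0/1 comembership indicators over several runs.
// assignments(r)(i) is the cluster label of point i in run r (an assumption).
def meanComembership(assignments: Seq[Array[Int]]): Array[Array[Double]] =
  require(assignments.nonEmpty, "need at least one assignment vector")
  val n = assignments.head.length
  val sum = Array.ofDim[Double](n, n)
  // Count, for every pair (i, j), the runs in which both points share a cluster.
  for
    a <- assignments
    i <- 0 until n
    j <- 0 until n
    if a(i) == a(j)
  do sum(i)(j) += 1.0
  // Divide the counts by the number of runs to obtain the mean.
  sum.map(row => row.map(_ / assignments.size))
```

Each entry of the result lies in [0, 1]; the diagonal is always 1.0 because a point is trivially a comember of itself.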
Create a new random subsample.
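One way to draw such a subsample is sketched below. It stands in `scala.util.Random` for the class's random number stream `s`, and the `ratio` parameter (the fraction of points to keep) is an assumption; the helper name `subsampleIndices` is hypothetical.

```scala
import scala.util.Random

// Hypothetical sketch: pick a random subset of row indices without replacement.
// `ratio` (fraction of the n points to keep) is an assumed parameter.
def subsampleIndices(n: Int, ratio: Double, rng: Random): Vector[Int] =
  val size = math.max(1, (n * ratio).toInt)
  // Shuffle all indices, keep the first `size`, and sort for readability.
  rng.shuffle((0 until n).toVector).take(size).sorted
```

Passing a differently seeded `Random` varies the subsample, which in turn varies the clusters found on it.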
Find the first tight and stable cluster from the top candidate clubs. To be stable, a club must have a similar club at the next level (next k value).
Value parameters
- topClubs: the top clubs for each level, to be searched for stable clusters
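The stability test can be sketched as follows, assuming clubs are sets of point indices, levels are ordered by increasing k, and "similar" means an intersection-over-union ratio of at least some threshold `beta`. The names `jaccard`, `firstStable` and `beta` are hypothetical.

```scala
// Hypothetical similarity: size of the intersection over size of the union.
def jaccard(a: Set[Int], b: Set[Int]): Double =
  val u = (a union b).size
  if u == 0 then 0.0 else (a intersect b).size.toDouble / u

// Hypothetical sketch: return the first club that has a similar club
// (similarity >= beta) among the top clubs of the next level.
def firstStable(topClubs: IndexedSeq[IndexedSeq[Set[Int]]], beta: Double): Option[Set[Int]] =
  val candidates =
    for
      lev  <- (0 until topClubs.length - 1).iterator
      club <- topClubs(lev).iterator
      if topClubs(lev + 1).exists(next => jaccard(club, next) >= beta)
    yield club
  // The iterator is lazy, so the search stops at the first match.
  candidates.nextOption()
```

Returning an `Option` makes the "no stable club found" case explicit; how the actual class signals that case is not stated in this page.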
Form candidate clusters by collecting points with high average comembership scores together in clusters (clubs).
Value parameters
- md: the mean comembership matrix
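A simple greedy way to group points by comembership is sketched below. The threshold `thres` and the seed-based grouping rule are assumptions made for illustration; the class may use a different rule. The name `formClubs` is hypothetical.

```scala
import scala.collection.mutable

// Hypothetical greedy sketch: each unassigned point seeds a club that collects
// every other unassigned point whose mean comembership with the seed is >= thres.
def formClubs(md: Array[Array[Double]], thres: Double): Vector[Set[Int]] =
  val n = md.length
  val taken = mutable.Set.empty[Int]
  val clubs = Vector.newBuilder[Set[Int]]
  for i <- 0 until n if !taken(i) do
    val club = (i until n).filter(j => !taken(j) && (j == i || md(i)(j) >= thres)).toSet
    taken ++= club
    clubs += club
  clubs.result()
```

With a high threshold, loosely attached points end up in singleton clubs, which later steps can discard because of their small size.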
Order the clubs (candidate clusters) by size, returning the rank order (largest first).
Value parameters
- clubs: the candidate clusters
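As an illustration, the rank order can be computed by sorting club indices by descending club size. The sketch assumes clubs are sets of point indices; the name `rankBySize` is hypothetical.

```scala
// Hypothetical sketch: indices of the clubs, largest club first.
// sortBy is stable, so clubs of equal size keep their original relative order.
def rankBySize(clubs: IndexedSeq[Set[Int]]): IndexedSeq[Int] =
  clubs.indices.sortBy(i => -clubs(i).size)
```

Returning indices rather than the reordered clubs leaves the original collection untouched, which matches the description of a separate rank order.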
Pick the top q clubs based on club size.
Value parameters
- clubs: all the clubs (candidate clusters)
- order: the rank order (by club size) of all the clubs
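Given the rank order, selecting the top q clubs reduces to taking the first q indices. In this sketch `q` is passed explicitly, which is an assumption: in the class it is not among the documented parameters and is presumably defined elsewhere. The name `topQ` is hypothetical.

```scala
// Hypothetical sketch: the q largest clubs, following a precomputed rank order.
def topQ(clubs: IndexedSeq[Set[Int]], order: IndexedSeq[Int], q: Int): IndexedSeq[Set[Int]] =
  order.take(q).map(i => clubs(i))
```

If fewer than q clubs exist, `take` simply returns all of them.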
Select candidates for tight clusters in the K-means algorithm for a given number of clusters 'k'. This corresponds to Algorithm A in the paper/URL.
Value parameters
- k: the number of clusters
Compute the similarity of two clubs as the ratio of the size of their intersection to their union.
Value parameters
- c1: the first club
- c2: the second club
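This ratio is the Jaccard index. A minimal sketch, assuming clubs are represented as sets of point indices (the name `clubSimilarity` and the handling of two empty clubs are assumptions):

```scala
// Hypothetical sketch: |c1 ∩ c2| / |c1 ∪ c2| for clubs given as index sets.
def clubSimilarity(c1: Set[Int], c2: Set[Int]): Double =
  val unionSize = (c1 union c2).size
  // The ratio is undefined for two empty clubs; this sketch returns 0.0 by convention.
  if unionSize == 0 then 0.0
  else (c1 intersect c2).size.toDouble / unionSize
```

For example, the clubs {1, 2, 3} and {2, 3, 4} share two points out of four distinct points, giving a similarity of 0.5.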