KMeansClustererPP
The KMeansClustererPP class clusters several vectors/points using the Hartigan-Wong algorithm, with the centroids initialized according to the k-means++ technique.
Value parameters
- flags: the flags used to adjust the algorithm
- k: the number of clusters to make
- x: the vectors/points to be clustered stored as rows of a matrix
Attributes
- Supertypes
- class KMeansClustererHW, class KMeansClusterer, trait Clusterer, class Object, trait Matchable, class Any
Members list
Value members
Concrete methods
Initialize the centroids according to the k-means++ technique.
Attributes
- Definition Classes
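The k-means++ initialization described above can be sketched in plain Scala as follows. This is a minimal illustration under stated assumptions, not the class's actual implementation: the names `SeedingSketch`, `sqDist` and `initCentroids` are hypothetical, points are held in plain arrays rather than the library's matrix type, and squared Euclidean distance is assumed as the weighting.

```scala
import scala.util.Random

object SeedingSketch {
  /** Squared Euclidean distance between two points of equal dimension. */
  def sqDist(a: Array[Double], b: Array[Double]): Double =
    a.indices.map(j => (a(j) - b(j)) * (a(j) - b(j))).sum

  /** Pick k initial centroids from the rows of x in the k-means++ manner (hypothetical helper). */
  def initCentroids(x: Array[Array[Double]], k: Int, rng: Random): Array[Array[Double]] = {
    require(k >= 1 && k <= x.length, "need 1 <= k <= number of points")
    val cent = new Array[Array[Double]](k)
    cent(0) = x(rng.nextInt(x.length)).clone           // first centroid: uniform random pick
    for (c <- 1 until k) {
      // weight of each point = squared distance to its nearest already-chosen centroid
      val w = x.map(xi => (0 until c).map(j => sqDist(xi, cent(j))).min)
      // roulette-wheel selection proportional to the weights
      var r = rng.nextDouble() * w.sum
      var i = 0
      while (i < x.length - 1 && r >= w(i)) { r -= w(i); i += 1 }
      cent(c) = x(i).clone
    }
    cent
  }
}
```

Points that coincide with an already-chosen centroid have zero weight, so in this sketch they are effectively never picked again.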
Update the probability mass function (pmf) used for picking the next centroid. The farther 'x_i' is from any existing centroid, the higher its probability. Return the corresponding distance-derived random variate generator.
Value parameters
- c: the current centroid index
Attributes
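To make the pmf update concrete, the sketch below computes a distance-derived pmf and draws an index from it by inverse-CDF lookup. It rests on assumptions that the text above does not confirm: the first 'c' rows of the centroid matrix are taken to be the centroids chosen so far, the weight of 'x_i' is its squared Euclidean distance to the nearest of them, and the names `PmfSketch`, `pmf` and `pick` are hypothetical.

```scala
object PmfSketch {
  /** Squared Euclidean distance between two points of equal dimension. */
  def sqDist(a: Array[Double], b: Array[Double]): Double =
    a.indices.map(j => (a(j) - b(j)) * (a(j) - b(j))).sum

  /** pmf over the rows of x, given the first c centroids in cent (assumed weighting). */
  def pmf(x: Array[Array[Double]], cent: Array[Array[Double]], c: Int): Array[Double] = {
    val d = x.map(xi => (0 until c).map(j => sqDist(xi, cent(j))).min)
    val total = d.sum
    if (total == 0.0) Array.fill(x.length)(1.0 / x.length)   // degenerate case: fall back to uniform
    else d.map(_ / total)
  }

  /** Inverse-CDF pick: smallest index whose cumulative probability exceeds u in [0, 1). */
  def pick(p: Array[Double], u: Double): Int = {
    val cdf = p.scanLeft(0.0)(_ + _).tail
    val i = cdf.indexWhere(_ > u)
    if (i >= 0) i else p.length - 1
  }
}
```

For points 0, 1 and 3 on a line with a single centroid at 0, the squared distances are 0, 1 and 9, so the pmf is (0, 0.1, 0.9): the farthest point is the most likely next centroid.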
Inherited methods
Calculate the centroids based on current assignment of points to clusters and update the 'cent' matrix that stores the centroids in its rows.
Value parameters
- cent: the matrix holding the centroids in its rows
- sz: the sizes of the clusters (number of points)
- to_c: the cluster assignment array
- x: the data matrix holding the points {x_i = x(i)} in its rows
Attributes
- Inherited from:
- Clusterer
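The centroid calculation above amounts to averaging the points assigned to each cluster. The following is a minimal sketch assuming plain arrays in place of the library's matrix type; `CentroidSketch` and `calcCentroids` are illustrative names, and leaving the centroid of an empty cluster at zero is an assumption of this sketch.

```scala
object CentroidSketch {
  /** Overwrite the rows of cent with the mean of the points assigned to each cluster. */
  def calcCentroids(x: Array[Array[Double]], toC: Array[Int], sz: Array[Int],
                    cent: Array[Array[Double]]): Unit = {
    for (row <- cent) java.util.Arrays.fill(row, 0.0)                  // reset the accumulators
    for (i <- x.indices; j <- x(i).indices) cent(toC(i))(j) += x(i)(j) // sum the points per cluster
    // divide by the cluster size; empty clusters keep a zero centroid (assumption)
    for (c <- cent.indices if sz(c) > 0; j <- cent(c).indices) cent(c)(j) /= sz(c)
  }
}
```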
Return the centroids. Should only be called after train.
Check whether the sum of squared errors is optimal.
Value parameters
- opt: the known (from human/oracle) optimum
- to_c: the cluster assignments
- x: the data matrix holding the points
Attributes
- Inherited from:
- Clusterer
Given a new point/vector z, determine which cluster it belongs to, i.e., the cluster whose centroid it is closest to.
Value parameters
- z: the vector to classify
Attributes
- Inherited from:
- KMeansClusterer
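Classification as described above is a nearest-centroid lookup. A minimal sketch, assuming squared Euclidean distance and plain arrays; `ClassifySketch` and its members are hypothetical names rather than the library's API.

```scala
object ClassifySketch {
  /** Squared Euclidean distance between two points of equal dimension. */
  def sqDist(a: Array[Double], b: Array[Double]): Double =
    a.indices.map(j => (a(j) - b(j)) * (a(j) - b(j))).sum

  /** Index of the centroid (row of cent) closest to z. */
  def classify(z: Array[Double], cent: Array[Array[Double]]): Int =
    cent.indices.minBy(c => sqDist(z, cent(c)))
}
```

With centroids at (0, 0) and (10, 10), the point (9, 9) is assigned to cluster 1.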
Return the cluster assignment vector. Should only be called after train.
Attributes
- Inherited from:
- KMeansClusterer
Return the sizes of the clusters (number of points in each). Should only be called after train.
Attributes
- Inherited from:
- KMeansClusterer
Compute the distances between vector/point 'u' and the points stored as rows in matrix 'cn'.
Value parameters
- cn: the matrix holding several centroids
- kc_: the number of centroids so far
- u: the given vector/point (u = x_i)
Attributes
- Inherited from:
- Clusterer
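The distance computation above can be illustrated as follows. The sketch assumes squared Euclidean distance (the text does not say which metric the library uses) and considers only the first 'kc_' rows of 'cn'; `DistanceSketch` and its members are hypothetical names.

```scala
object DistanceSketch {
  /** Squared Euclidean distance between two points of equal dimension. */
  def sqDist(a: Array[Double], b: Array[Double]): Double =
    a.indices.map(j => (a(j) - b(j)) * (a(j) - b(j))).sum

  /** Distances from u to each of the first kc rows of cn. */
  def distance(u: Array[Double], cn: Array[Array[Double]], kc: Int): Array[Double] =
    Array.tabulate(kc)(c => sqDist(u, cn(c)))
}
```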
Compute the adjusted distance to point 'u' according to the R2 value described in the Hartigan-Wong algorithm.
Value parameters
- cc: the current cluster for point u
- cent: the matrix holding the centroids
- u: the point in question
Attributes
- Inherited from:
- KMeansClustererHW
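As a rough illustration of the adjusted distance, the sketch below uses the quantities commonly given for the Hartigan-Wong algorithm: n_c d^2 / (n_c + 1) for a candidate cluster c, and n_cc d^2 / (n_cc - 1) for the point's current cluster cc, where d^2 is the squared Euclidean distance to the centroid. Whether the library computes exactly this is an assumption; the explicit `sz` parameter, the handling of singleton clusters, and the names `HartiganWongSketch` and `distance2` are hypothetical.

```scala
object HartiganWongSketch {
  /** Squared Euclidean distance between two points of equal dimension. */
  def sqDist(a: Array[Double], b: Array[Double]): Double =
    a.indices.map(j => (a(j) - b(j)) * (a(j) - b(j))).sum

  /** Size-adjusted distances from u to every centroid; cc is u's current cluster. */
  def distance2(u: Array[Double], cent: Array[Array[Double]], sz: Array[Int], cc: Int): Array[Double] =
    Array.tabulate(cent.length) { c =>
      val d = sqDist(u, cent(c))
      if (c == cc) {
        // cost of keeping u where it is; a singleton cluster keeps its point (assumption)
        if (sz(c) > 1) sz(c) * d / (sz(c) - 1) else 0.0
      } else sz(c) * d / (sz(c) + 1)       // cost of moving u into cluster c
    }
}
```

Under these assumptions, a point would be moved only when some other cluster's adjusted distance is smaller than that of its current cluster.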
Return the name of the 'c'-th cluster.
Value parameters
- c: the c-th cluster
Attributes
- Inherited from:
- Clusterer
Set the names for the clusters.
Value parameters
- nm: the array of names
Attributes
- Inherited from:
- Clusterer
Set the random stream to 's'. This method must be called in implementing classes before creating any random generators.
Value parameters
- s: the new value for the random number stream
Attributes
- Inherited from:
- Clusterer
Show the state of the algorithm at iteration l.
Value parameters
- l: the current iteration
Attributes
- Inherited from:
- KMeansClusterer
Compute the sum of squared errors from the points in cluster 'c' to the cluster's centroid.
Value parameters
- c: the current cluster
- to_c: the cluster assignments
- x: the data matrix holding the points
Attributes
- Inherited from:
- Clusterer
Compute the sum of squared errors within all clusters, where the error is, e.g., the distance from a point to its centroid.
Value parameters
- to_c: the cluster assignments
- x: the data matrix holding the points
Attributes
- Inherited from:
- Clusterer
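The per-cluster and overall sums of squared errors can be sketched as below. The methods documented above take no centroid argument, so passing `cent` explicitly is an assumption of this sketch, as are the names `SseSketch`, `sseCluster` and `sse` and the use of squared Euclidean distance as the error.

```scala
object SseSketch {
  /** Squared Euclidean distance between two points of equal dimension. */
  def sqDist(a: Array[Double], b: Array[Double]): Double =
    a.indices.map(j => (a(j) - b(j)) * (a(j) - b(j))).sum

  /** Sum of squared errors from the points in cluster c to that cluster's centroid. */
  def sseCluster(x: Array[Array[Double]], toC: Array[Int], cent: Array[Array[Double]], c: Int): Double =
    x.indices.filter(i => toC(i) == c).map(i => sqDist(x(i), cent(c))).sum

  /** Sum of squared errors within all clusters. */
  def sse(x: Array[Array[Double]], toC: Array[Int], cent: Array[Array[Double]]): Double =
    cent.indices.map(c => sseCluster(x, toC, cent, c)).sum
}
```

A check against a known optimum could then compare the value returned by `sse` with 'opt'.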
Compute the total sum of squares (SST), i.e., the sum of squared distances of all the points from their mean.
Value parameters
- x: the data matrix holding the points
Attributes
- Inherited from:
- Clusterer
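The sum of squares total can be illustrated with a short sketch that measures every point against the overall mean. It assumes plain arrays and squared Euclidean distance; `SstSketch` and `sst` are hypothetical names.

```scala
object SstSketch {
  /** Sum of squared distances from every row of x to the mean of all rows. */
  def sst(x: Array[Array[Double]]): Double = {
    val n = x.length
    val dim = x(0).length
    // column-wise mean of the data matrix
    val mean = Array.tabulate(dim)(j => x.map(row => row(j)).sum / n)
    x.map(xi => (0 until dim).map(j => (xi(j) - mean(j)) * (xi(j) - mean(j))).sum).sum
  }
}
```

For the points (0, 0), (2, 2) and (10, 10) the mean is (4, 4), giving 32 + 8 + 72 = 112.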
Iteratively recompute clusters until the assignment of points does not change. Initialize by randomly assigning points to k clusters.
Attributes
- Inherited from:
- KMeansClusterer
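The training loop described above (random initial assignment, then recomputing clusters until the assignment stops changing) might look like the following sketch. It is a basic Lloyd-style iteration written under assumptions, not the library's code: `TrainSketch`, `train` and the `maxIt` safety cap are hypothetical, empty clusters are simply skipped, and a point moves only when another centroid is strictly closer.

```scala
import scala.util.Random

object TrainSketch {
  /** Squared Euclidean distance between two points of equal dimension. */
  def sqDist(a: Array[Double], b: Array[Double]): Double =
    a.indices.map(j => (a(j) - b(j)) * (a(j) - b(j))).sum

  /** Return the cluster assignment of each row of x after the iteration settles. */
  def train(x: Array[Array[Double]], k: Int, rng: Random, maxIt: Int = 100): Array[Int] = {
    val dim = x(0).length
    val toC = Array.fill(x.length)(rng.nextInt(k))     // random initial assignment
    var changed = true
    var it = 0
    while (changed && it < maxIt) {
      // recompute the centroids and sizes from the current assignment
      val cent = Array.ofDim[Double](k, dim)
      val sz = new Array[Int](k)
      for (i <- x.indices) {
        sz(toC(i)) += 1
        for (j <- 0 until dim) cent(toC(i))(j) += x(i)(j)
      }
      for (c <- 0 until k if sz(c) > 0; j <- 0 until dim) cent(c)(j) /= sz(c)
      // reassign each point to a strictly closer non-empty cluster, if one exists
      changed = false
      for (i <- x.indices) {
        val best = (0 until k).filter(c => sz(c) > 0).minBy(c => sqDist(x(i), cent(c)))
        if (sqDist(x(i), cent(best)) < sqDist(x(i), cent(toC(i)))) { toC(i) = best; changed = true }
      }
      it += 1
    }
    toC
  }
}
```

Because the start is random, a plain iteration like this can settle in a poor local optimum, which is the motivation for seeding the centroids with k-means++ instead.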