class KMeansPPClusterer extends KMeansClusterer
The KMeansPPClusterer class clusters several vectors/points using the k-means++ clustering technique.
- See also
ilpubs.stanford.edu:8090/778/1/2006-13.pdf
Instance Constructors
- new KMeansPPClusterer(x: MatriD, k: Int, algo: Algorithm.Algorithm = HARTIGAN, flags: Array[Boolean] = Array (false, false))
- x
the vectors/points to be clustered stored as rows of a matrix
- k
the number of clusters to make
- algo
the clustering algorithm to use
- flags
the flags used to adjust the algorithm
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- val MAX_ITER: Int
- Attributes
- protected
- Definition Classes
- KMeansClusterer
- var _k: Int
- Attributes
- protected
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def assign(): Unit
Assign each vector/point 'x(i)' to a random cluster. This is the primary technique for initializing the clustering.
- Attributes
- protected
- Definition Classes
- KMeansClusterer
- def calcCentroids(x: MatriD, to_c: Array[Int], sz: VectoI, cent: MatriD): Unit
Calculate the centroids based on current assignment of points to clusters and update the 'cent' matrix that stores the centroids in its rows.
- x
the data matrix holding the points {x_i = x(i)} in its rows
- to_c
the cluster assignment array
- sz
the sizes of the clusters (number of points)
- cent
the matrix holding the centroids in its rows
- Definition Classes
- Clusterer
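The centroid update described for 'calcCentroids' can be sketched in plain Scala as follows. This is an illustration only, not the library's code: it assumes plain arrays in place of MatriD and VectoI, and the names CentroidSketch and toC are hypothetical.

```scala
object CentroidSketch {

  /** Overwrite the rows of 'cent' with the mean of the points assigned to each cluster.
   *  Assumption: x, sz and cent are plain arrays standing in for MatriD/VectoI.
   */
  def calcCentroids(x: Array[Array[Double]], toC: Array[Int],
                    sz: Array[Int], cent: Array[Array[Double]]): Unit = {
    // reset every centroid to the zero vector
    for (row <- cent; j <- row.indices) row(j) = 0.0
    // add each point to the running sum of its cluster
    for (i <- x.indices; j <- x(i).indices) cent(toC(i))(j) += x(i)(j)
    // divide each sum by the cluster size (empty clusters are left at zero)
    for (c <- cent.indices if sz(c) > 0; j <- cent(c).indices) cent(c)(j) /= sz(c)
  }
}
```

The sketch leaves an empty cluster's centroid at the zero vector; how the library treats that case is not stated here.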
- val cent: MatrixD
- Attributes
- protected
- Definition Classes
- KMeansClusterer
- def centroids: MatriD
Return the centroids. Should only be called after 'train'.
- Definition Classes
- KMeansClusterer → Clusterer
- def checkOpt(x: MatriD, to_c: Array[Int], opt: Double): Boolean
Check to see if the sum of squared errors is optimum.
- x
the data matrix holding the points
- to_c
the cluster assignments
- opt
the known (from human/oracle) optimum
- Definition Classes
- Clusterer
- def classify(z: VectoD): Int
Given a new point/vector 'z', determine which cluster it belongs to, i.e., the cluster whose centroid it is closest to.
- z
the vector to classify
- Definition Classes
- KMeansClusterer → Clusterer
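As a minimal sketch of the rule that 'classify' describes (pick the cluster whose centroid is closest), assuming plain arrays instead of VectoD/MatriD and squared Euclidean distance. ClassifySketch and distSq are hypothetical names, not part of the library.

```scala
object ClassifySketch {

  /** Squared Euclidean distance between two points of equal dimension. */
  def distSq(u: Array[Double], v: Array[Double]): Double =
    u.indices.map { j => val d = u(j) - v(j); d * d }.sum

  /** Index of the row of 'cent' (a centroid) that is closest to the point 'z'. */
  def classify(z: Array[Double], cent: Array[Array[Double]]): Int =
    cent.indices.minBy(c => distSq(z, cent(c)))
}
```

Comparing squared distances selects the same centroid as comparing distances, so the square root is skipped.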
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @native() @HotSpotIntrinsicCandidate()
- def cluster: Array[Int]
Return the cluster assignment vector. Should only be called after 'train'.
- Definition Classes
- KMeansClusterer → Clusterer
- def clusterHartigan(): Array[Int]
Cluster the points using a version of the Hartigan-Wong algorithm.
- See also
www.tqmp.org/RegularArticles/vol09-1/p015/p015.pdf
- def csize: VectoI
Return the sizes of the clusters (number of points in each). Should only be called after 'train'.
- Definition Classes
- KMeansClusterer → Clusterer
- def distance(u: VectoD, cn: MatriD, kc_: Int = -1): VectoD
Compute the distances between vector/point 'u' and the points stored as rows in matrix 'cn'.
- u
the given vector/point (u = x_i)
- cn
the matrix holding several centroids
- kc_
the number of centroids so far
- Definition Classes
- Clusterer
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- def fixEmptyClusters(): Unit
Fix all empty clusters by taking a point from the largest cluster.
- Attributes
- protected
- Definition Classes
- KMeansClusterer
- val flags: Array[Boolean]
- Definition Classes
- KMeansClusterer
- final def flaw(method: String, message: String): Unit
- Definition Classes
- Error
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @native() @HotSpotIntrinsicCandidate()
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native() @HotSpotIntrinsicCandidate()
- val immediate: Boolean
- Attributes
- protected
- Definition Classes
- KMeansClusterer
- def initCentroids(): Boolean
Initialize the centroids according to the k-means++ technique.
- Definition Classes
- KMeansPPClusterer → Clusterer
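For orientation, the k-means++ seeding idea behind 'initCentroids' can be sketched as below: the first centroid is drawn uniformly at random, and each later one is drawn with probability proportional to the squared distance from a point to its nearest centroid chosen so far. This is a sketch under stated assumptions, not the class's implementation: it uses plain arrays and scala.util.Random rather than MatriD and the library's random streams, and KMeansPPSeedSketch, seed and distSq are hypothetical names.

```scala
import scala.util.Random

object KMeansPPSeedSketch {

  /** Squared Euclidean distance between two points of equal dimension. */
  def distSq(u: Array[Double], v: Array[Double]): Double =
    u.indices.map { j => val d = u(j) - v(j); d * d }.sum

  /** Choose k initial centroids from the rows of x using k-means++ seeding. */
  def seed(x: Array[Array[Double]], k: Int, rng: Random): Array[Array[Double]] = {
    require(k >= 1 && k <= x.length, "need 1 <= k <= number of points")
    val cent = new Array[Array[Double]](k)
    cent(0) = x(rng.nextInt(x.length))            // first centroid: uniform at random
    for (c <- 1 until k) {
      // squared distance from each point to its nearest centroid chosen so far
      val d2 = x.map(p => (0 until c).map(i => distSq(p, cent(i))).min)
      // sample an index with probability proportional to d2
      var r = rng.nextDouble() * d2.sum
      var i = 0
      while (i < x.length - 1 && r >= d2(i)) { r -= d2(i); i += 1 }
      cent(c) = x(i)
    }
    cent
  }
}
```

A point that is already a centroid has weight zero, so it is effectively never drawn again when the data contain at least k distinct points.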
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- def name(c: Int): String
Return the name of the 'c'-th cluster.
- def name_(nm: Strings): Unit
Set the names for the clusters.
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @HotSpotIntrinsicCandidate()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @HotSpotIntrinsicCandidate()
- val pdf: VectorD
- Attributes
- protected
- val post: Boolean
- Attributes
- protected
- Definition Classes
- KMeansClusterer
- var raniv: PermutedVecI
- Attributes
- protected
- Definition Classes
- KMeansClusterer
- def reassign(): Boolean
Reassign each vector/point to the cluster with the closest centroid. Indicates done if no point changed clusters (used as the stopping rule).
- Attributes
- protected
- Definition Classes
- KMeansClusterer
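The reassignment step and its stopping signal might look like the following in plain Scala. It is an illustrative sketch only: it assumes that a result of true means "done", passes the data explicitly instead of using the class's protected fields, and does not maintain cluster sizes. ReassignSketch, distSq and toC are hypothetical names.

```scala
object ReassignSketch {

  /** Squared Euclidean distance between two points of equal dimension. */
  def distSq(u: Array[Double], v: Array[Double]): Double =
    u.indices.map { j => val d = u(j) - v(j); d * d }.sum

  /** Move each point to the cluster with the closest centroid, updating 'toC' in place.
   *  Returns true (assumed to mean done) when no point changed clusters.
   */
  def reassign(x: Array[Array[Double]], cent: Array[Array[Double]],
               toC: Array[Int]): Boolean = {
    var done = true
    for (i <- x.indices) {
      val c = cent.indices.minBy(k => distSq(x(i), cent(k)))   // nearest centroid
      if (c != toC(i)) { toC(i) = c; done = false }            // record the change
    }
    done
  }
}
```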
- def setStream(s: Int): Unit
Set the random stream to 's'. This method must be called in implementing classes before creating any random generators.
- s
the new value for the random number stream
- Definition Classes
- Clusterer
- def show(l: Int): Unit
Show the state of the algorithm at iteration 'l'.
- l
the current iteration
- Definition Classes
- KMeansClusterer
- def sse(x: MatriD, c: Int, to_c: Array[Int]): Double
Compute the sum of squared errors from the points in cluster 'c' to the cluster's centroid.
- x
the data matrix holding the points
- c
the current cluster
- to_c
the cluster assignments
- Definition Classes
- Clusterer
- def sse(x: MatriD, to_c: Array[Int]): Double
Compute the sum of squared errors within all clusters, where the error is, e.g., the distance from a point to its centroid.
- x
the data matrix holding the points
- to_c
the cluster assignments
- Definition Classes
- Clusterer
- def sst(x: MatriD): Double
Compute the sum of squares total for all the points from the mean.
- x
the data matrix holding the points
- Definition Classes
- Clusterer
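To make the two quality measures concrete, here is a small sketch of 'sse' (squared distances from each point to its own cluster's mean) and 'sst' (squared distances from each point to the overall mean). It assumes squared Euclidean distance and plain arrays, and it recomputes the cluster means from the assignments rather than reading stored centroids. SumOfSquaresSketch, mean and distSq are hypothetical names.

```scala
object SumOfSquaresSketch {

  /** Squared Euclidean distance between two points of equal dimension. */
  def distSq(u: Array[Double], v: Array[Double]): Double =
    u.indices.map { j => val d = u(j) - v(j); d * d }.sum

  /** Component-wise mean of a non-empty set of points. */
  def mean(pts: Array[Array[Double]]): Array[Double] =
    pts.transpose.map(col => col.sum / col.length)

  /** Sum of squared errors over all clusters, given the assignment array 'toC'. */
  def sse(x: Array[Array[Double]], toC: Array[Int]): Double =
    x.indices.groupBy(i => toC(i)).values.map { idx =>
      val pts = idx.map(i => x(i)).toArray      // points in this cluster
      val m = mean(pts)                         // the cluster's centroid
      pts.map(p => distSq(p, m)).sum
    }.sum

  /** Sum of squares total: squared distances of all points from the overall mean. */
  def sst(x: Array[Array[Double]]): Double = {
    val m = mean(x)
    x.map(p => distSq(p, m)).sum
  }
}
```

For any assignment, 'sse' cannot exceed 'sst', so the ratio of the two gives a rough indication of how much of the total variation the clustering explains.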
- val stream: Int
- Attributes
- protected
- Definition Classes
- Clusterer
- def swap(): Unit
Try all pairwise swaps and make them if 'sse' improves.
- Attributes
- protected
- Definition Classes
- KMeansClusterer
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- val sz: VectorI
- Attributes
- protected
- Definition Classes
- KMeansClusterer
- def toString(): String
- Definition Classes
- AnyRef → Any
- val to_c: Array[Int]
- Attributes
- protected
- Definition Classes
- KMeansClusterer
- def train(): KMeansPPClusterer
Given a set of points/vectors, put them in clusters. Per the signature, the method returns a KMeansPPClusterer; the cluster assignment vector is available from 'cluster' after training. A basic goal is to minimize the sum of the distances between points within each cluster.
- Definition Classes
- KMeansPPClusterer → KMeansClusterer → Clusterer
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
Deprecated Value Members
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable]) @Deprecated