KMeansClusterer

class KMeansClusterer extends Clusterer with Error

The KMeansClusterer class clusters several vectors/points using k-means clustering. It first randomly assigns points to 'k' clusters (primary technique), then iteratively reassigns each point to the cluster with the closest centroid, and stops when there are no changes to the clusters.

See also: KMeansClusterer2 for the secondary technique.
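The procedure described above (random assignment, then repeated reassignment until nothing changes) can be sketched with plain Scala arrays. This is a minimal illustration, not the ScalaTion implementation: the object name KMeansSketch, the helper names, the seed and maxIter parameters, and the choice to leave an empty cluster's centroid at the origin are all assumptions made for the example.

```scala
import scala.util.Random

object KMeansSketch {
  type Point = Array[Double]

  // Squared Euclidean distance between two points of equal dimension.
  def sqDist(a: Point, b: Point): Double = {
    var s = 0.0
    for (j <- a.indices) { val d = a(j) - b(j); s += d * d }
    s
  }

  // Mean of the points in each cluster. Simplification (assumption): an
  // empty cluster keeps a centroid at the origin instead of being repaired.
  def centroids(x: Array[Point], toC: Array[Int], k: Int): Array[Point] = {
    val dim  = x(0).length
    val cent = Array.fill(k)(Array.fill(dim)(0.0))
    val sz   = Array.fill(k)(0)
    for (i <- x.indices) {
      sz(toC(i)) += 1
      for (j <- 0 until dim) cent(toC(i))(j) += x(i)(j)
    }
    for (c <- 0 until k if sz(c) > 0; j <- 0 until dim) cent(c)(j) /= sz(c)
    cent
  }

  // Index of the centroid closest to point p (the first one wins on ties).
  def closest(p: Point, cent: Array[Point]): Int = {
    var best = 0
    for (c <- 1 until cent.length)
      if (sqDist(p, cent(c)) < sqDist(p, cent(best))) best = c
    best
  }

  // Returns the cluster index assigned to each row of x.
  def cluster(x: Array[Point], k: Int, seed: Long = 0L, maxIter: Int = 200): Array[Int] = {
    val rng = new Random(seed)
    val toC = Array.fill(x.length)(rng.nextInt(k))   // random initial assignment
    var changed = true
    var iter = 0
    while (changed && iter < maxIter) {
      val cent = centroids(x, toC, k)                // centroids of current clusters
      changed = false
      for (i <- x.indices) {
        val c = closest(x(i), cent)
        if (c != toC(i)) { toC(i) = c; changed = true }
      }
      iter += 1
    }
    toC
  }
}
```

For example, KMeansSketch.cluster(Array(Array(0.0, 0.0), Array(1.0, 0.0), Array(100.0, 0.0), Array(101.0, 0.0)), 2) places the first two points in one cluster and the last two in the other; which cluster receives index 0 depends on the random initial assignment.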

Linear Supertypes

Error, Clusterer, AnyRef, Any

Known Subclasses

KMeansClusterer2, KMeansClustererHW, KMeansClustererPP, KMeansPPClusterer

Instance Constructors

new KMeansClusterer(x: MatriD, k: Int, flags: Array[Boolean] = Array (false, false))
x
the vectors/points to be clustered stored as rows of a matrix
k
the number of clusters to make
flags
the array of flags used to adjust the algorithm (default: no post-processing, no immediate return upon change)

Value Members

final def !=(arg0: Any): Boolean
Definition Classes
AnyRef → Any
final def ##: Int
Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean
Definition Classes
AnyRef → Any
val MAX_ITER: Int
Attributes
protected
final def asInstanceOf[T0]: T0
Definition Classes
Any
def assign(): Unit
Randomly assign each vector/point 'x(i)' to a random cluster. Primary technique for initiating the clustering.
Attributes
protected
def calcCentroids(x: MatriD, to_c: Array[Int], sz: VectoI, cent: MatriD): Unit
Calculate the centroids based on current assignment of points to clusters and update the 'cent' matrix that stores the centroids in its rows.
x
the data matrix holding the points {x_i = x(i)} in its rows
to_c
the cluster assignment array
sz
the sizes of the clusters (number of points)
cent
the matrix holding the centroids in its rows
Definition Classes
Clusterer
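As an illustration of the centroid calculation described for calcCentroids, here is a minimal sketch over plain arrays. It is an assumption-laden stand-in rather than the library's code: it returns the centroids and sizes instead of updating 'cent' in place, takes the number of clusters k as a parameter, and the object name CentroidSketch is hypothetical.

```scala
object CentroidSketch {
  // Returns (centroids, cluster sizes) for the given assignment of the rows
  // of x to clusters 0 until k. An empty cluster is left with a zero centroid.
  def calcCentroids(x: Array[Array[Double]], toC: Array[Int], k: Int)
      : (Array[Array[Double]], Array[Int]) = {
    val dim  = x(0).length
    val cent = Array.fill(k)(Array.fill(dim)(0.0))
    val sz   = Array.fill(k)(0)
    for (i <- x.indices) {
      val c = toC(i)
      sz(c) += 1                                       // count the point
      for (j <- 0 until dim) cent(c)(j) += x(i)(j)     // accumulate coordinates
    }
    // Divide each coordinate sum by the cluster size to obtain the mean.
    for (c <- 0 until k if sz(c) > 0; j <- 0 until dim) cent(c)(j) /= sz(c)
    (cent, sz)
  }
}
```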
val cent: MatrixD
Attributes
protected
def centroids: MatriD
Return the centroids. Should only be called after train.
Definition Classes
KMeansClusterer → Clusterer
def checkOpt(x: MatriD, to_c: Array[Int], opt: Double): Boolean
Check to see if the sum of squared errors is optimum.
x
the data matrix holding the points
to_c
the cluster assignments
opt
the known (from human/oracle) optimum
Definition Classes
Clusterer
def classify(z: VectoD): Int
Given a new point/vector 'z', determine which cluster it belongs to, i.e., the cluster whose centroid it is closest to.
z
the vector to classify
Definition Classes
KMeansClusterer → Clusterer
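The rule stated for classify (pick the cluster whose centroid is closest) can be sketched as follows. This is only an illustration under assumptions: Euclidean distance is assumed, the centroids are passed in explicitly rather than read from a trained clusterer, and ClassifySketch with its helper names is hypothetical.

```scala
object ClassifySketch {
  // Euclidean distance from u to each row of cn (assumed metric).
  def distance(u: Array[Double], cn: Array[Array[Double]]): Array[Double] =
    cn.map { c =>
      var s = 0.0
      for (j <- u.indices) { val d = u(j) - c(j); s += d * d }
      math.sqrt(s)
    }

  // Index of the centroid closest to z (the first one wins on ties).
  def classify(z: Array[Double], cent: Array[Array[Double]]): Int = {
    val d = distance(z, cent)
    var best = 0
    for (c <- 1 until d.length) if (d(c) < d(best)) best = c
    best
  }
}
```

With centroids (0, 0) and (10, 10), the point (1, 2) is classified into cluster 0 and the point (9, 9) into cluster 1.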
def clone(): AnyRef
Attributes
protected[lang]
Definition Classes
AnyRef
Annotations
@throws(classOf[java.lang.CloneNotSupportedException]) @native() @HotSpotIntrinsicCandidate()
def cluster: Array[Int]
Return the cluster assignment vector. Should only be called after train.
Definition Classes
KMeansClusterer → Clusterer
def csize: VectoI
Return the sizes of the clusters. Should only be called after train.
Definition Classes
KMeansClusterer → Clusterer
def distance(u: VectoD, cn: MatriD, kc_: Int = -1): VectoD
Compute the distances between vector/point 'u' and the points stored as rows in matrix 'cn'.
u
the given vector/point (u = x_i)
cn
the matrix holding several centroids
kc_
the number of centroids so far
Definition Classes
Clusterer
final def eq(arg0: AnyRef): Boolean
Definition Classes
AnyRef
def equals(arg0: AnyRef): Boolean
Definition Classes
AnyRef → Any
def fixEmptyClusters(): Unit
Fix all empty clusters by taking a point from the largest cluster.
Attributes
protected
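One way to read "taking a point from the largest cluster" is sketched below. The source does not say which point is taken, so donating the first point of the largest cluster is an assumption, as are the object name FixEmptySketch and the explicit toC and k parameters.

```scala
object FixEmptySketch {
  // Mutates toC so that every cluster in 0 until k has at least one point.
  // Requires at least k points, so that a donor with 2+ points always exists.
  def fixEmptyClusters(toC: Array[Int], k: Int): Unit = {
    val sz = Array.fill(k)(0)
    for (c <- toC) sz(c) += 1                  // current cluster sizes
    for (c <- 0 until k) {
      if (sz(c) == 0) {
        val largest = sz.indexOf(sz.max)       // donor: the largest cluster
        val i = toC.indexOf(largest)           // assumption: take its first point
        toC(i) = c
        sz(largest) -= 1
        sz(c) += 1
      }
    }
  }
}
```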
val flags: Array[Boolean]
final def flaw(method: String, message: String): Unit
Definition Classes
Error
final def getClass(): Class[_ <: AnyRef]
Definition Classes
AnyRef → Any
Annotations
@native() @HotSpotIntrinsicCandidate()
def hashCode(): Int
Definition Classes
AnyRef → Any
Annotations
@native() @HotSpotIntrinsicCandidate()
val immediate: Boolean
Attributes
protected
def initCentroids(): Boolean
Definition Classes
Clusterer
final def isInstanceOf[T0]: Boolean
Definition Classes
Any
def name(c: Int): String
Return the name of the 'c'-th cluster.
c
the c-th cluster
Definition Classes
Clusterer
def name_(nm: Strings): Unit
Set the names for the clusters.
nm
the array of names
Definition Classes
Clusterer
final def ne(arg0: AnyRef): Boolean
Definition Classes
AnyRef
final def notify(): Unit
Definition Classes
AnyRef
Annotations
@native() @HotSpotIntrinsicCandidate()
final def notifyAll(): Unit
Definition Classes
AnyRef
Annotations
@native() @HotSpotIntrinsicCandidate()
val post: Boolean
Attributes
protected
var raniv: PermutedVecI
Attributes
protected
def reassign(): Boolean
Reassign each vector/point to the cluster with the closest centroid. Indicate done, if no points changed clusters (for stopping rule).
Attributes
protected
def setStream(s: Int): Unit
Set the random stream to 's'. This method must be called in implementing classes before creating any random generators.
s
the new value for the random number stream
Definition Classes
Clusterer
def show(l: Int): Unit
Show the state of the algorithm at iteration 'l'.
l
the current iteration
def sse(x: MatriD, c: Int, to_c: Array[Int]): Double
Compute the sum of squared errors from the points in cluster 'c' to the cluster's centroid.
x
the data matrix holding the points
c
the current cluster
to_c
the cluster assignments
Definition Classes
Clusterer
def sse(x: MatriD, to_c: Array[Int]): Double
Compute the sum of squared errors within all clusters, where error is measured by, e.g., the distance from a point to its centroid.
x
the data matrix holding the points
to_c
the cluster assignments
Definition Classes
Clusterer
def sst(x: MatriD): Double
Compute the sum of squares total for all the points from the mean.
x
the data matrix holding the points
Definition Classes
Clusterer
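The two sse overloads and sst can be illustrated with a short sketch over plain arrays. It assumes squared Euclidean distance as the error measure and recomputes each centroid as the mean of the cluster's points; the object name SseSketch and its helpers are hypothetical, not part of the library.

```scala
object SseSketch {
  type Mat = Array[Array[Double]]

  // Column-wise mean of a non-empty set of rows.
  def mean(rows: Mat): Array[Double] =
    Array.tabulate(rows(0).length)(j => rows.map(r => r(j)).sum / rows.length)

  // Squared Euclidean distance (assumed error measure).
  def sqDist(a: Array[Double], b: Array[Double]): Double = {
    var s = 0.0
    for (j <- a.indices) { val d = a(j) - b(j); s += d * d }
    s
  }

  // Sum of squared errors from the points in cluster c to its centroid.
  def sse(x: Mat, c: Int, toC: Array[Int]): Double = {
    val pts = x.indices.filter(i => toC(i) == c).map(i => x(i)).toArray
    if (pts.isEmpty) 0.0
    else { val m = mean(pts); pts.map(p => sqDist(p, m)).sum }
  }

  // Sum of squared errors within all clusters.
  def sse(x: Mat, toC: Array[Int]): Double =
    toC.distinct.map(c => sse(x, c, toC)).sum

  // Sum of squares total: squared distances of all points from the overall mean.
  def sst(x: Mat): Double = { val m = mean(x); x.map(p => sqDist(p, m)).sum }
}
```

For the one-dimensional points 0, 2, 10, 12 with assignments (0, 0, 1, 1), the per-cluster value for cluster 0 is 2, the total within-cluster value is 4, and the sum of squares total is 104.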
val stream: Int
Attributes
protected
Definition Classes
Clusterer
def swap(): Unit
Try all pairwise swaps and make them if 'sse' improves.
Attributes
protected
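A possible reading of the pairwise-swap step is sketched below: for every pair of points in different clusters, exchange their cluster labels and keep the exchange only if the total sum of squared errors decreases. The traversal order, the strict-improvement test, and the names SwapSketch, sse and swap as written here are assumptions for illustration, not the library's code.

```scala
object SwapSketch {
  type Mat = Array[Array[Double]]

  // Total within-cluster sum of squared errors for an assignment.
  def sse(x: Mat, toC: Array[Int]): Double = {
    val dim = x(0).length
    var total = 0.0
    for (c <- toC.distinct) {
      val pts = x.indices.filter(i => toC(i) == c).map(i => x(i))
      val m   = Array.tabulate(dim)(j => pts.map(p => p(j)).sum / pts.length)
      for (p <- pts; j <- 0 until dim) total += (p(j) - m(j)) * (p(j) - m(j))
    }
    total
  }

  // Try every pair of points in different clusters; keep a swap of their
  // labels only when it strictly lowers the total SSE. Mutates toC.
  def swap(x: Mat, toC: Array[Int]): Unit = {
    var best = sse(x, toC)
    for (i <- x.indices; j <- i + 1 until x.length) {
      if (toC(i) != toC(j)) {
        val (ci, cj) = (toC(i), toC(j))
        toC(i) = cj; toC(j) = ci                // tentative swap
        val s = sse(x, toC)
        if (s < best) best = s                  // improvement: keep it
        else { toC(i) = ci; toC(j) = cj }       // no improvement: undo
      }
    }
  }
}
```

Starting from the one-dimensional points 0, 1, 100, 101 with the poor assignment (0, 1, 0, 1), a single pass ends with the assignment (1, 1, 0, 0), which groups the two small values together and the two large values together.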
final def synchronized[T0](arg0: => T0): T0
Definition Classes
AnyRef
val sz: VectorI
Attributes
protected
def toString(): String
Definition Classes
AnyRef → Any
val to_c: Array[Int]
Attributes
protected
def train(): KMeansClusterer
Iteratively recompute clusters until the assignment of points does not change. Initialize by randomly assigning points to 'k' clusters.
Definition Classes
KMeansClusterer → Clusterer
final def wait(arg0: Long, arg1: Int): Unit
Definition Classes
AnyRef
Annotations
@throws(classOf[java.lang.InterruptedException])
final def wait(arg0: Long): Unit
Definition Classes
AnyRef
Annotations
@throws(classOf[java.lang.InterruptedException]) @native()
final def wait(): Unit
Definition Classes
AnyRef
Annotations
@throws(classOf[java.lang.InterruptedException])

Deprecated Value Members

def finalize(): Unit
Attributes
protected[lang]
Definition Classes
AnyRef
Annotations
@throws(classOf[java.lang.Throwable]) @Deprecated
Deprecated
