KMeansPPClusterer

scalation.modeling.clustering.KMeansPPClusterer
See the KMeansPPClusterer companion object
class KMeansPPClusterer(x: MatrixD, k: Int, algo: Algorithm, flags: Array[Boolean]) extends KMeansClusterer

Value parameters

algo

the clustering algorithm to use

flags

the flags used to adjust the algorithm

k

the number of clusters to make

x

the vectors/points to be clustered stored as rows of a matrix

Attributes

Companion
object
Supertypes
trait Clusterer
class Object
trait Matchable
class Any

Members list

Value members

Concrete methods

def clusterHartigan(): Array[Int]

Cluster the points using a version of the Hartigan-Wong algorithm.

override def initCentroids(): Boolean

Initialize the centroids according to the k-means++ technique.
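
For illustration, k-means++ seeding picks the first centroid uniformly at random and each later centroid with probability proportional to its squared distance from the nearest centroid already chosen. The following is a minimal sketch under stated assumptions, not the library's implementation: it uses plain arrays instead of MatrixD, and the names kmeansPPInit and sqDist are hypothetical.

```scala
import scala.util.Random

type Point = Array[Double]

// Squared Euclidean distance between two points of equal dimension.
def sqDist(u: Point, v: Point): Double =
  u.indices.map(j => (u(j) - v(j)) * (u(j) - v(j))).sum

// Hypothetical helper (not part of ScalaTion): choose k initial centroids
// from the rows of x using k-means++ seeding.
def kmeansPPInit(x: Array[Point], k: Int, rng: Random): Array[Point] =
  require(k >= 1 && k <= x.length, "k must be between 1 and the number of points")
  val cent = new Array[Point](k)
  cent(0) = x(rng.nextInt(x.length))          // first centroid: uniform at random
  for c <- 1 until k do
    // squared distance from each point to its nearest already-chosen centroid
    val d2    = x.map(p => (0 until c).map(i => sqDist(p, cent(i))).min)
    val total = d2.sum
    if total == 0.0 then
      cent(c) = x(rng.nextInt(x.length))      // every point coincides with a centroid
    else
      // sample index i with probability d2(i) / total (roulette-wheel selection)
      val r   = rng.nextDouble() * total
      var i   = 0
      var acc = d2(0)
      while acc <= r && i < x.length - 1 do
        i += 1
        acc += d2(i)
      cent(c) = x(i)
  cent
```

Points that are already centroids have weight zero, so with distinct input points the sketch does not select the same point twice. Fixing the seed of rng makes the seeding reproducible.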

override def train(): Unit

Given a set of points/vectors, put them in clusters. The method itself returns Unit; the cluster assignment vector is available from cluster after training. A basic goal is to minimize the sum of the distances between points within each cluster.

Inherited methods

def calcCentroids(x: MatrixD, to_c: Array[Int], sz: VectorI, cent: MatrixD): Unit

Calculate the centroids based on current assignment of points to clusters and update the 'cent' matrix that stores the centroids in its rows.

Value parameters

cent

the matrix holding the centroids in its rows

sz

the sizes of the clusters (number of points)

to_c

the cluster assignment array

x

the data matrix holding the points {x_i = x(i)} in its rows

Attributes

Inherited from:
Clusterer
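
The centroid update described above amounts to averaging the points assigned to each cluster. A minimal sketch, assuming plain arrays in place of MatrixD and VectorI; meanCentroids is a hypothetical name, and unlike calcCentroids it returns new arrays instead of updating 'cent' in place.

```scala
// Hypothetical helper (not part of ScalaTion): recompute each centroid as the
// mean of the points currently assigned to it.
// Returns (centroids as rows, cluster sizes); an empty cluster keeps a zero row.
def meanCentroids(x: Array[Array[Double]], toC: Array[Int], k: Int)
    : (Array[Array[Double]], Array[Int]) =
  val dim  = x.head.length
  val cent = Array.fill(k, dim)(0.0)
  val sz   = Array.fill(k)(0)
  for i <- x.indices do
    val c = toC(i)                              // cluster of point i
    sz(c) += 1
    for j <- 0 until dim do cent(c)(j) += x(i)(j)
  for c <- 0 until k if sz(c) > 0 do            // divide sums by cluster sizes
    for j <- 0 until dim do cent(c)(j) /= sz(c)
  (cent, sz)
```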

Return the centroids. Should only be called after train.

Attributes

Inherited from:
KMeansClusterer
def checkOpt(x: MatrixD, to_c: Array[Int], opt: Double): Boolean

Check to see if the sum of squared errors is optimum.

Value parameters

opt

the known (from human/oracle) optimum

to_c

the cluster assignments

x

the data matrix holding the points

Attributes

Inherited from:
Clusterer
def classify(z: VectorD): Int

Given a new point/vector z, determine which cluster it belongs to, i.e., the cluster whose centroid it is closest to.

Value parameters

z

the vector to classify

Attributes

Inherited from:
KMeansClusterer
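
Nearest-centroid classification, as described for classify, can be sketched as follows. This is an illustration on plain arrays, assuming squared Euclidean distance; nearestCentroid and dist2 are hypothetical names, not library methods.

```scala
// Squared Euclidean distance between two points of equal dimension.
def dist2(u: Array[Double], v: Array[Double]): Double =
  u.indices.map(j => (u(j) - v(j)) * (u(j) - v(j))).sum

// Hypothetical helper (not part of ScalaTion): index of the centroid (row of
// cent) closest to z; ties go to the lowest index.
def nearestCentroid(z: Array[Double], cent: Array[Array[Double]]): Int =
  cent.indices.minBy(c => dist2(z, cent(c)))
```

Comparing squared distances gives the same winner as comparing distances, so the square root can be skipped.
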
def cluster: Array[Int]

Return the cluster assignment vector. Should only be called after train.

Attributes

Inherited from:
KMeansClusterer
def csize: VectorI

Return the sizes of the clusters (number of points in each). Should only be called after train.

Attributes

Inherited from:
KMeansClusterer
def distance(u: VectorD, cn: MatrixD, kc_: Int): VectorD

Compute the distances between vector/point 'u' and the points stored as rows in matrix 'cn'.

Value parameters

cn

the matrix holding several centroids

kc_

the number of centroids so far

u

the given vector/point (u = x_i)

Attributes

Inherited from:
Clusterer
def name(c: Int): String

Return the name of the 'c'-th cluster.

Value parameters

c

the c-th cluster

Attributes

Inherited from:
Clusterer
def name_(nm: Array[String]): Unit

Set the names for the clusters.

Value parameters

nm

the array of names

Attributes

Inherited from:
Clusterer
def setStream(s: Int): Unit

Set the random stream to 's'. This method must be called in implementing classes before creating any random generators.

Value parameters

s

the new value for the random number stream

Attributes

Inherited from:
Clusterer
def show(l: Int): Unit

Show the state of the algorithm at iteration l.

Value parameters

l

the current iteration

Attributes

Inherited from:
KMeansClusterer
def sse(x: MatrixD, c: Int, to_c: Array[Int]): Double

Compute the sum of squared errors from the points in cluster 'c' to the cluster's centroid.

Value parameters

c

the current cluster

to_c

the cluster assignments

x

the data matrix holding the points

Attributes

Inherited from:
Clusterer
def sse(x: MatrixD, to_c: Array[Int]): Double

Compute the sum of squared errors within all clusters, where the error is measured by, e.g., the distance from a point to its centroid.

Value parameters

to_c

the cluster assignments

x

the data matrix holding the points

Attributes

Inherited from:
Clusterer
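
As an illustration of both sse variants, the sketch below sums squared Euclidean distances from points to their assigned centroids. It assumes plain arrays and takes the centroids as an explicit argument, which the documented signatures do not; sseCluster, sseAll and squaredDistance are hypothetical names.

```scala
// Squared Euclidean distance between two points of equal dimension.
def squaredDistance(u: Array[Double], v: Array[Double]): Double =
  u.indices.map(j => (u(j) - v(j)) * (u(j) - v(j))).sum

// Hypothetical helper: sum of squared errors for the points in cluster c only.
def sseCluster(x: Array[Array[Double]], c: Int, toC: Array[Int],
               cent: Array[Array[Double]]): Double =
  x.indices.filter(i => toC(i) == c).map(i => squaredDistance(x(i), cent(c))).sum

// Hypothetical helper: sum of squared errors over all clusters.
def sseAll(x: Array[Array[Double]], toC: Array[Int],
           cent: Array[Array[Double]]): Double =
  x.indices.map(i => squaredDistance(x(i), cent(toC(i)))).sum
```

The total over all clusters equals the sum of the per-cluster values, since every point belongs to exactly one cluster.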
def sst(x: MatrixD): Double

Compute the sum of squares total for all the points from the mean.

Value parameters

x

the data matrix holding the points

Attributes

Inherited from:
Clusterer
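
The sum of squares total can be sketched as the sum of squared Euclidean distances from every point to the overall mean of the data. This is a minimal illustration on plain arrays, assuming a non-empty data set; totalSumOfSquares is a hypothetical name, not the library method.

```scala
// Hypothetical helper (not part of ScalaTion): sum of squared distances from
// each row of x to the column-wise mean of x.
def totalSumOfSquares(x: Array[Array[Double]]): Double =
  val dim  = x.head.length
  // column-wise mean of all points
  val mean = Array.tabulate(dim)(j => x.map(_(j)).sum / x.length)
  x.map(p => (0 until dim).map(j => (p(j) - mean(j)) * (p(j) - mean(j))).sum).sum
```

For the points (0, 0), (2, 0) and (4, 6) the mean is (2, 2) and the squared distances are 8, 4 and 20, giving a total of 32.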