class KMeansClusterer extends Clusterer with Error
The KMeansClusterer class clusters several vectors/points using k-means
clustering. Either (1) randomly assign points to 'k' clusters or (2) randomly
pick 'k' points as initial centroids (technique (1) tends to work better and is
the primary technique). Iteratively reassign each point to the cluster containing
the closest centroid. Stop when there are no changes to the clusters.
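The procedure described above can be sketched in plain Scala. This is a minimal illustration under stated assumptions, not the library's implementation: it uses `Array[Array[Double]]` in place of `MatrixD`, initializes with technique (2), and the names `KMeansSketch`, `distSq`, `centroids`, and `maxIter` are hypothetical.

```scala
import scala.util.Random

object KMeansSketch {
  type Point = Array[Double]

  // Squared Euclidean distance between two points of equal dimension.
  def distSq(u: Point, v: Point): Double =
    u.indices.map(j => (u(j) - v(j)) * (u(j) - v(j))).sum

  // Mean of each cluster's points; an empty cluster keeps its previous centroid.
  def centroids(x: Array[Point], clustr: Array[Int], prev: Array[Point]): Array[Point] = {
    val dim = x(0).length
    Array.tabulate(prev.length) { c =>
      val members = x.indices.filter(i => clustr(i) == c)
      if (members.isEmpty) prev(c)
      else Array.tabulate(dim)(j => members.map(i => x(i)(j)).sum / members.size)
    }
  }

  // Pick k distinct points as initial centroids, then reassign until stable.
  def cluster(x: Array[Point], k: Int, seed: Int = 0, maxIter: Int = 200): Array[Int] = {
    val rng = new Random(seed)
    var cent: Array[Point] =
      rng.shuffle(x.indices.toList).take(k).map(i => x(i).clone()).toArray
    val clustr = Array.fill(x.length)(-1)
    var changed = true
    var iter = 0
    while (changed && iter < maxIter) {
      changed = false
      for (i <- x.indices) {
        val nearest = cent.indices.minBy(j => distSq(x(i), cent(j)))
        if (nearest != clustr(i)) { clustr(i) = nearest; changed = true }
      }
      cent = centroids(x, clustr, cent)
      iter += 1
    }
    clustr
  }
}
```

The iteration cap plays the role that a bound such as `MAX_ITER` would play; the loop normally ends earlier, as soon as a full pass changes no assignment.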
-----------------------------------------------------------------------------
Instance Constructors
-
new
KMeansClusterer(x: MatrixD, k: Int, s: Int = 0, primary: Boolean = true, remote: Boolean = true, post: Boolean = true)
- x
the vectors/points to be clustered stored as rows of a matrix
- k
the number of clusters to make
- s
the random number stream (to vary the clusters made)
- primary
true indicates use the primary technique for initiating the clustering
- remote
whether to take a maximally remote or a randomly selected point
- post
whether to perform post processing by randomly swapping points to reduce error
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
val
DEBUG: Boolean
- Attributes
- protected
-
val
MAX_ITER: Int
- Attributes
- protected
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
assign(): Unit
Assign each vector/point 'x(i)' to a randomly chosen cluster. Primary technique for initiating the clustering.
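A random initial assignment can be sketched as follows. This is an illustration only: the object name `RandomAssign` and the parameters `n` and `seed` are assumptions, and a plain `Array[Int]` stands in for the class's internal assignment vector.

```scala
import scala.util.Random

object RandomAssign {
  // Give each of the n points a cluster index drawn uniformly from 0 until k.
  // The seed makes the assignment reproducible.
  def assign(n: Int, k: Int, seed: Int = 0): Array[Int] = {
    val rng = new Random(seed)
    Array.fill(n)(rng.nextInt(k))
  }
}
```

With few points and many clusters, a uniform draw like this can leave some cluster empty, so code that computes centroids afterwards should be prepared for a size of zero.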
-
def
calcCentroids(): Unit
Calculate the centroids based on current assignment of points to clusters.
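As a rough sketch of this step, each centroid is the coordinate-wise mean of the points currently assigned to its cluster. The code below is not the library's implementation; `CentroidSketch`, the array-based types, and returning the sizes alongside the centroids are assumptions made for the example.

```scala
object CentroidSketch {
  // Returns (centroids, cluster sizes) for the given assignment of points.
  def calcCentroids(x: Array[Array[Double]], clustr: Array[Int], k: Int)
      : (Array[Array[Double]], Array[Int]) = {
    val dim = x(0).length
    val cent = Array.fill(k, dim)(0.0)
    val sizes = Array.fill(k)(0)
    // Accumulate coordinate sums and counts per cluster.
    for (i <- x.indices) {
      val c = clustr(i)
      sizes(c) += 1
      for (j <- 0 until dim) cent(c)(j) += x(i)(j)
    }
    // Divide by the cluster size; empty clusters are left at the origin.
    for (c <- 0 until k if sizes(c) > 0; j <- 0 until dim)
      cent(c)(j) = cent(c)(j) / sizes(c)
    (cent, sizes)
  }
}
```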
-
val
cent: MatrixD
- Attributes
- protected
-
def
centroids(): MatrixD
Return the centroids. Should only be called after cluster().
- Definition Classes
- KMeansClusterer → Clusterer
-
def
checkOpt(opt: Double): Boolean
Check to see if the sum of squared errors is optimum.
- opt
the known (from human/oracle) optimum
-
def
classify(y: VectorD): Int
Given a new point/vector 'y', determine which cluster it belongs to, i.e., the cluster whose centroid it is closest to.
- y
the vector to classify
- Definition Classes
- KMeansClusterer → Clusterer
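Classifying a new point amounts to a nearest-centroid lookup. The sketch below assumes squared Euclidean distance and array-based types; `ClassifySketch` and passing the centroids as an argument are hypothetical choices, since the class itself keeps its centroids internally.

```scala
object ClassifySketch {
  // Squared Euclidean distance between two points of equal dimension.
  def distSq(u: Array[Double], v: Array[Double]): Double =
    u.indices.map(j => (u(j) - v(j)) * (u(j) - v(j))).sum

  // Index of the centroid closest to y.
  def classify(y: Array[Double], cent: Array[Array[Double]]): Int =
    cent.indices.minBy(c => distSq(y, cent(c)))
}
```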
-
def
clone(): AnyRef
- Attributes
- protected[java.lang]
- Definition Classes
- AnyRef
- Annotations
- @native() @throws( ... )
-
def
cluster(): Array[Int]
Iteratively recompute clusters until the assignment of points does not change, returning the final cluster assignment vector.
- Definition Classes
- KMeansClusterer → Clusterer
-
val
clustered: Boolean
Flag indicating whether the points have already been clustered.
- Attributes
- protected
- Definition Classes
- Clusterer
-
val
clustr: Array[Int]
- Attributes
- protected
-
def
csize(): VectorI
Return the sizes of the clusters, i.e., the number of points assigned to each centroid. Should only be called after cluster().
- Definition Classes
- KMeansClusterer → Clusterer
-
val
dist: VectorD
- Attributes
- protected
-
def
distance(u: VectorD, v: VectorD): Double
Compute a distance metric (e.g., distance squared) between vectors/points 'u' and 'v'. Override this method to use a different metric, e.g., 'norm' (the Euclidean distance, 2-norm) or 'norm1' (the Manhattan distance, 1-norm).
- u
the first vector/point
- v
the second vector/point
- Definition Classes
- Clusterer
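The metrics named above can be written out explicitly. This sketch uses plain arrays instead of `VectorD`, and the function names `distSq`, `euclidean`, and `manhattan` are assumptions for the example rather than library methods.

```scala
object DistanceSketch {
  // Squared Euclidean distance: cheap, and sufficient for comparing distances.
  def distSq(u: Array[Double], v: Array[Double]): Double =
    u.indices.map(j => (u(j) - v(j)) * (u(j) - v(j))).sum

  // Euclidean distance (2-norm of u - v).
  def euclidean(u: Array[Double], v: Array[Double]): Double =
    math.sqrt(distSq(u, v))

  // Manhattan distance (1-norm of u - v).
  def manhattan(u: Array[Double], v: Array[Double]): Double =
    u.indices.map(j => math.abs(u(j) - v(j))).sum
}
```

Because the square root is monotonic, squared and plain Euclidean distance always select the same nearest centroid; the Manhattan distance can select a different one.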
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
finalize(): Unit
- Attributes
- protected[java.lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
final
def
flaw(method: String, message: String): Unit
- Definition Classes
- Error
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
getName(i: Int): String
Get the name of the i-th cluster.
- Definition Classes
- Clusterer
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
val
myDist: VectorD
- Attributes
- protected
-
def
name_(n: Array[String]): Unit
Set the names for the clusters.
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
def
pickCentroids(): Unit
Randomly pick vectors/points to serve as the initial 'k' centroids (cent). Secondary technique for initiating the clustering.
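One way to sketch this secondary technique is to shuffle the row indices and copy the first 'k' rows. The code covers only the purely random choice, not a maximally remote one; `PickSketch` and the `seed` parameter are assumptions for the example.

```scala
import scala.util.Random

object PickSketch {
  // Copy k distinct, randomly chosen rows of x to serve as initial centroids.
  def pickCentroids(x: Array[Array[Double]], k: Int, seed: Int = 0): Array[Array[Double]] = {
    require(k <= x.length, "cannot pick more centroids than there are points")
    val rng = new Random(seed)
    rng.shuffle(x.indices.toList).take(k).map(i => x(i).clone()).toArray
  }
}
```

Copying the rows keeps later centroid updates from modifying the original data.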
-
def
reassign(): Boolean
Reassign each vector/point to the cluster with the closest centroid. Indicate done if no points changed clusters (used as the stopping rule).
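The reassignment pass and its stopping signal might look like the sketch below. It assumes that a result of `true` means "done" (no point moved) and updates the assignment array in place; `ReassignSketch` and the explicit arguments are hypothetical, as the class works on its own internal state.

```scala
object ReassignSketch {
  // Squared Euclidean distance between two points of equal dimension.
  def distSq(u: Array[Double], v: Array[Double]): Double =
    u.indices.map(j => (u(j) - v(j)) * (u(j) - v(j))).sum

  // Move each point to its nearest centroid; return true if nothing changed.
  def reassign(x: Array[Array[Double]], cent: Array[Array[Double]],
               clustr: Array[Int]): Boolean = {
    var done = true
    for (i <- x.indices) {
      val nearest = cent.indices.minBy(c => distSq(x(i), cent(c)))
      if (nearest != clustr(i)) { clustr(i) = nearest; done = false }
    }
    done
  }
}
```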
-
val
sizes: VectorI
- Attributes
- protected
-
def
sse(c: Int): Double
Compute the sum of squared errors (distance squared) from all points in cluster 'c' to the cluster's centroid.
- c
the current cluster
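For a single cluster, the sum of squared errors adds up the squared distance from each member point to the centroid. The following is a sketch with array-based types; `SseSketch` and the explicit `x`, `cent`, and `clustr` arguments are assumptions, since the documented method takes only the cluster index.

```scala
object SseSketch {
  // Squared Euclidean distance between two points of equal dimension.
  def distSq(u: Array[Double], v: Array[Double]): Double =
    u.indices.map(j => (u(j) - v(j)) * (u(j) - v(j))).sum

  // Sum of squared distances from the points of cluster c to its centroid.
  def sse(x: Array[Array[Double]], cent: Array[Array[Double]],
          clustr: Array[Int], c: Int): Double =
    x.indices.filter(i => clustr(i) == c).map(i => distSq(x(i), cent(c))).sum
}
```

Summing this quantity over all clusters gives the overall within-cluster error, which is the value a lower-is-better comparison between clusterings would use.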
-
def
sse(x: MatrixD): Double
Compute the sum of squared errors within the clusters, where error is measured by, e.g., the distance from a point to its centroid.
- Definition Classes
- Clusterer
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
var
tc1: Double
-
var
tc2: Double
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @throws( ... )