GapStatistic

object GapStatistic

The GapStatistic object helps determine the optimal number of clusters for a clusterer by comparing its results to those obtained on a reference distribution.

See also: web.stanford.edu/~hastie/Papers/gap.pdf
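
For orientation, the Gap statistic defined in the paper cited above can be written as follows. The notation is the paper's, not necessarily this object's: W_k is the pooled within-cluster dispersion for k clusters, D_r sums the pairwise distances over all ordered pairs in cluster C_r (hence the factor 2), n_r is the size of cluster r, and the expectation is taken over reference data sets. The last line is the paper's selection rule; whether this object applies exactly that rule is not stated here.

```latex
\mathrm{Gap}_n(k) = E_n^{*}\{\log W_k\} - \log W_k,
\qquad
W_k = \sum_{r=1}^{k} \frac{1}{2 n_r} D_r,
\qquad
D_r = \sum_{i, i' \in C_r} d_{ii'}

\hat{k} = \min \{\, k : \mathrm{Gap}(k) \ge \mathrm{Gap}(k+1) - s_{k+1} \,\}
```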

Linear Supertypes

AnyRef, Any

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@native() @throws( ... )
def cumDistance(x: MatrixD, cl: Clusterer, clustr: Array[Int], k: Int): VectorD
Compute a sum of pairwise distances between points in each cluster (in one direction).
x
the vectors/points to be clustered stored as rows of a matrix
cl
the Clusterer used to compute the distance metric
clustr
the cluster assignments
k
the number of clusters
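
The computation described for cumDistance can be sketched without the library types. This is a standalone illustration, not the library's code: it uses plain arrays in place of MatrixD and VectorD, and it assumes squared Euclidean distance where the real method asks the Clusterer for its metric. "In one direction" is read here as counting each pair (i, j) with i < j once.

```scala
object CumDistanceSketch {
  type Point = Array[Double]

  // Assumed metric: squared Euclidean distance (the Clusterer's metric may differ).
  def sqDist(u: Point, v: Point): Double =
    u.indices.map(j => (u(j) - v(j)) * (u(j) - v(j))).sum

  // For each cluster c, sum the distances over pairs i < j that are both assigned to c.
  def cumDistance(x: Array[Point], clustr: Array[Int], k: Int): Array[Double] = {
    val sums = Array.fill(k)(0.0)
    for (i <- x.indices; j <- i + 1 until x.length if clustr(i) == clustr(j))
      sums(clustr(i)) += sqDist(x(i), x(j))
    sums
  }
}
```

With points (0,0), (3,4), (10,10), (10,12) and assignments 0, 0, 1, 1, the two cluster sums are 25.0 and 4.0.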
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
Annotations
@native()
def hashCode(): Int

Definition Classes
AnyRef → Any
Annotations
@native()
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
def kMeansPP(x: MatrixD, kMax: Int, algo: Algorithm = HARTIGAN, b: Int = 1, useSVD: Boolean = true, plot: Boolean = false): (KMeansPPClusterer, Array[Int], Int)
Return a KMeansPPClusterer clustering on the given points with an optimal number of clusters k chosen using the Gap statistic.
x
the vectors/points to be clustered stored as rows of a matrix
kMax
the upper bound on the number of clusters
algo
the reassignment algorithm used by KMeansPlusPlusClusterer
b
the number of reference distributions to create (default = 1)
useSVD
use SVD to account for the shape of the points (default = true)
plot
whether or not to plot the logs of the within-SSEs (default = false)
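
How an optimal k might be picked from the log within-SSEs can be sketched as below. This follows the selection rule from the cited paper (smallest k with Gap(k) >= Gap(k+1) - s(k+1)); whether kMeansPP uses exactly this rule is an assumption. The names chooseK, logW and refLogW are hypothetical, and the inputs are taken as already computed rather than produced by a clusterer.

```scala
object GapSelectSketch {
  // logW(i):    log within-SSE of the data for k = i + 1 clusters
  // refLogW(i): log within-SSEs of the b reference data sets for k = i + 1 clusters
  def chooseK(logW: Array[Double], refLogW: Array[Array[Double]]): Int = {
    val b    = refLogW.head.length
    val mean = refLogW.map(_.sum / b)
    // Gap(k) = mean reference log W_k minus observed log W_k
    val gap  = mean.zip(logW).map { case (m, w) => m - w }
    // s(k) = sd of the reference log W_k values, inflated by sqrt(1 + 1/b)
    val s    = refLogW.zip(mean).map { case (r, m) =>
      math.sqrt(r.map(v => (v - m) * (v - m)).sum / b) * math.sqrt(1.0 + 1.0 / b)
    }
    // smallest k satisfying the rule; fall back to the largest k tried
    (0 until gap.length - 1)
      .find(i => gap(i) >= gap(i + 1) - s(i + 1))
      .map(_ + 1)
      .getOrElse(gap.length)
  }
}
```

With a single reference data set (b = 1) the standard deviation term is zero, so the rule reduces to picking the first k at which the gap stops increasing.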
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
Annotations
@native()
final def notifyAll(): Unit

Definition Classes
AnyRef
Annotations
@native()
def reference(x: MatrixD, useSVD: Boolean = true, stream: Int = 0): MatrixD
Compute a reference distribution based on a set of points.
x
the vectors/points to be clustered stored as rows of a matrix
useSVD
use SVD to account for the shape of the points (default = true)
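
A minimal sketch of the simpler case (no SVD) is shown below, assuming the reference points are drawn uniformly over the bounding box of the data, as in the cited paper. It is not the library's implementation: plain arrays stand in for MatrixD, and the seed parameter is a hypothetical stand-in for a random number source.

```scala
import scala.util.Random

object ReferenceSketch {
  type Point = Array[Double]

  // Draw x.length points uniformly from the axis-aligned bounding box of x.
  def reference(x: Array[Point], seed: Long = 0L): Array[Point] = {
    val rng  = new Random(seed)
    val dims = x.head.indices
    val lo   = dims.map(j => x.map(_(j)).min)   // per-column minimum
    val hi   = dims.map(j => x.map(_(j)).max)   // per-column maximum
    Array.fill(x.length)(
      dims.map(j => lo(j) + rng.nextDouble() * (hi(j) - lo(j))).toArray
    )
  }
}
```

The SVD variant in the paper draws the uniform sample in the coordinate system of the principal components and then rotates it back, which lets the reference box follow the shape of the data.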
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@native() @throws( ... )
def withinSSE(x: MatrixD, cl: Clusterer, clustr: Array[Int], k: Int): Double
Compute the within sum of squared errors in terms of distances between points within a cluster (in one direction).
x
the vectors/points to be clustered stored as rows of a matrix
cl
the Clusterer used to compute the distance metric
clustr
the cluster assignments
k
the number of clusters
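
One way to read this computation, sketched under stated assumptions: sum the one-direction pairwise distances in each cluster, divide by the cluster size, and add the results. With squared Euclidean distance (assumed here; the real method uses the Clusterer's metric) this equals the usual sum of squared distances to each cluster centroid. Plain arrays stand in for MatrixD, and the handling of empty clusters is this sketch's own choice.

```scala
object WithinSSESketch {
  type Point = Array[Double]

  // Assumed metric: squared Euclidean distance.
  def sqDist(u: Point, v: Point): Double =
    u.indices.map(j => (u(j) - v(j)) * (u(j) - v(j))).sum

  def withinSSE(x: Array[Point], clustr: Array[Int], k: Int): Double = {
    val sums  = Array.fill(k)(0.0)   // one-direction pairwise distance sum per cluster
    val sizes = Array.fill(k)(0)     // number of points per cluster
    for (i <- x.indices) {
      sizes(clustr(i)) += 1
      for (j <- i + 1 until x.length if clustr(j) == clustr(i))
        sums(clustr(i)) += sqDist(x(i), x(j))
    }
    // skip empty clusters to avoid dividing by zero
    (0 until k).filter(sizes(_) > 0).map(c => sums(c) / sizes(c)).sum
  }
}
```

For points (0,0), (3,4), (10,10), (10,12) with assignments 0, 0, 1, 1, the result is 25/2 + 4/2 = 14.5, which matches the sum of squared distances to the two centroids (1.5, 2) and (10, 11).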
