class GMM extends Classifier

The GMM class implements univariate Gaussian Mixture Models. Given a sample thought to be generated by 'k' Normal distributions, it estimates the 'mu' and 'sig2' parameters of those distributions. Given a new value, it determines which class (0, ..., k-1) the value most likely came from. FIX: need a class for multivariate Gaussian Mixture Models. FIX: need to adapt for clustering.

Linear Supertypes
Classifier, AnyRef, Any

Instance Constructors

  1. new GMM(x: VectoD, k: Int = 3)

    x

    the data vector

    k

    the number of components in the mixture

Value Members

  1. def classify(z: VectoI): (Int, String, Double)

    Classify the first point in vector 'z'.

    z

    the vector to be classified.

    Definition Classes
    GMM → Classifier
  2. def classify(z: VectoD): (Int, String, Double)

    Classify the first point in vector 'z'.

    z

    the vector to be classified.

    Definition Classes
    GMM → Classifier
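
As an illustration of how a fitted univariate mixture can assign a value to a class, here is a minimal, self-contained sketch. It does not use this library's types: `GmmClassifySketch`, `normalPdf`, and the mixture-weight array `phi` are hypothetical names chosen for the example, and the sketch returns only the class index and its posterior probability rather than the `(Int, String, Double)` triple documented above.

```scala
object GmmClassifySketch {
  /** Normal density with mean mu and variance sig2, evaluated at x. */
  def normalPdf(x: Double, mu: Double, sig2: Double): Double =
    math.exp(-(x - mu) * (x - mu) / (2.0 * sig2)) / math.sqrt(2.0 * math.Pi * sig2)

  /** Return (most likely component index, its posterior probability) for x.
   *  phi holds the assumed mixture weights, one per component. */
  def classify(x: Double, phi: Array[Double], mu: Array[Double],
               sig2: Array[Double]): (Int, Double) = {
    // weighted density of x under each component
    val joint = mu.indices.map(j => phi(j) * normalPdf(x, mu(j), sig2(j)))
    val total = joint.sum
    val best  = joint.indices.maxBy(j => joint(j))
    (best, joint(best) / total)
  }
}
```

For example, with two equally weighted components centered at 0.0 and 5.0, a value of 0.1 would be assigned to component 0.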
  3. def crossValidate(nx: Int = 10, show: Boolean = false): Double

    Test the accuracy of the classified results by cross-validation, returning the accuracy. The "test data" starts at 'testStart' and ends at 'testEnd'; the rest of the data is "training data". FIX - should return a StatVector

    nx

    the number of crosses and cross-validations (defaults to 10x).

    show

    the show flag (show result from each iteration)

    Definition Classes
    Classifier
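
The slicing described above can be sketched as follows. This is only an illustration under stated assumptions, not the library's implementation: `CrossValidateSketch`, `folds`, and the `evaluate` callback are hypothetical, and letting the last fold absorb any remainder is a choice made for this example.

```scala
object CrossValidateSketch {
  /** Split indices 0 until n into nx contiguous test ranges,
   *  each given as (testStart inclusive, testEnd exclusive). */
  def folds(n: Int, nx: Int): Seq[(Int, Int)] = {
    val size = n / nx
    // the last fold takes any leftover indices (an assumption of this sketch)
    (0 until nx).map(i => (i * size, if (i == nx - 1) n else (i + 1) * size))
  }

  /** Average the accuracy returned by evaluate(testStart, testEnd) over all folds. */
  def crossValidate(n: Int, nx: Int)(evaluate: (Int, Int) => Double): Double = {
    val accs = folds(n, nx).map { case (s, e) => evaluate(s, e) }
    accs.sum / accs.length
  }
}
```

Here `evaluate` stands in for training on everything outside the range and testing on the range itself.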
  4. def crossValidateRand(nx: Int = 10, show: Boolean = false): Double

    Test the accuracy of the classified results by cross-validation, returning the accuracy. This version of cross-validation relies on "subtracting" frequencies from the previously stored global data to achieve efficiency. FIX - are the comments correct? FIX - should return a StatVector

    nx

    number of crosses and cross-validations (defaults to 10x).

    show

    the show flag (show result from each iteration)

    Definition Classes
    Classifier
  5. def exp_step(): Unit

    Execute the Expectation (E) Step in the EM algorithm.
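
For readers unfamiliar with EM, the E step for a univariate Gaussian mixture typically computes, for every data point, the posterior probability of each component given the current parameters. A minimal sketch, assuming mixture weights `phi` alongside 'mu' and 'sig2' (the names `GmmEStepSketch`, `responsibilities`, and `phi` are hypothetical and not taken from this class):

```scala
object GmmEStepSketch {
  /** Normal density with mean mu and variance sig2, evaluated at x. */
  def normalPdf(x: Double, mu: Double, sig2: Double): Double =
    math.exp(-(x - mu) * (x - mu) / (2.0 * sig2)) / math.sqrt(2.0 * math.Pi * sig2)

  /** Compute the responsibility matrix r, where r(i)(j) is the posterior
   *  probability that point x(i) was generated by component j. */
  def responsibilities(x: Array[Double], phi: Array[Double], mu: Array[Double],
                       sig2: Array[Double]): Array[Array[Double]] =
    x.map { xi =>
      val joint = mu.indices.map(j => phi(j) * normalPdf(xi, mu(j), sig2(j))).toArray
      val total = joint.sum
      joint.map(_ / total)          // normalize so each row sums to 1
    }
}
```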

  6. def fit(y: VectoI, yp: VectoI, k: Int = 2): VectoD

    Return the quality of fit including 'acc', 'prec', 'recall', 'kappa'. Override to add more quality of fit measures.

    y

    the actual class labels

    yp

    the predicted class labels

    k

    the number of class labels

    Definition Classes
    Classifier
    See also

    ConfusionMat

    medium.com/greyatom/performance-metrics-for-classification-problems-in-machine-learning-part-i-b085d432082b
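
To make the four measures concrete, the following sketch computes them for the two-class case from actual and predicted labels. It is an illustration only: `FitSketch` is a hypothetical name, treating label 1 as the positive class is an assumption, and the result is a plain tuple rather than the `VectoD` documented above.

```scala
object FitSketch {
  /** Return (acc, prec, recall, kappa) for binary labels in {0, 1},
   *  treating 1 as the positive class. */
  def fit(y: Array[Int], yp: Array[Int]): (Double, Double, Double, Double) = {
    val n     = y.length.toDouble
    val pairs = y.zip(yp)
    val tp = pairs.count { case (a, p) => a == 1 && p == 1 }.toDouble
    val tn = pairs.count { case (a, p) => a == 0 && p == 0 }.toDouble
    val fp = pairs.count { case (a, p) => a == 0 && p == 1 }.toDouble
    val fn = pairs.count { case (a, p) => a == 1 && p == 0 }.toDouble
    val acc    = (tp + tn) / n
    val prec   = tp / (tp + fp)
    val recall = tp / (tp + fn)
    // agreement expected by chance: both positive plus both negative
    val pe    = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    val kappa = (acc - pe) / (1.0 - pe)
    (acc, prec, recall, kappa)
  }
}
```

Note that precision and recall are NaN in this sketch when their denominators are zero, for instance when no positives are predicted.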

  7. def fitLabel: Seq[String]

    Return the labels for the fit. Override when necessary.

    Definition Classes
    Classifier
  8. def max_step(): Unit

    Execute the Maximization (M) Step in the EM algorithm.
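
In the standard EM formulation, the M step re-estimates the parameters from responsibility-weighted sums. A minimal sketch, assuming a responsibility matrix `r` with one row per data point and one column per component (`GmmMStepSketch`, `update`, `r`, and the mixture weights `phi` are hypothetical names, not members of this class):

```scala
object GmmMStepSketch {
  /** Re-estimate (phi, mu, sig2) from data x and responsibilities r,
   *  where r(i)(j) is the weight of point x(i) in component j. */
  def update(x: Array[Double],
             r: Array[Array[Double]]): (Array[Double], Array[Double], Array[Double]) = {
    val k  = r.head.length
    // effective number of points assigned to each component
    val nj = Array.tabulate(k)(j => x.indices.map(i => r(i)(j)).sum)
    val phi  = nj.map(_ / x.length)
    // weighted mean and weighted variance per component
    val mu   = Array.tabulate(k)(j => x.indices.map(i => r(i)(j) * x(i)).sum / nj(j))
    val sig2 = Array.tabulate(k)(j =>
      x.indices.map(i => r(i)(j) * (x(i) - mu(j)) * (x(i) - mu(j))).sum / nj(j))
    (phi, mu, sig2)
  }
}
```

Alternating this update with the E step until the parameters stop changing is the usual EM loop; a component whose effective count reaches zero would need special handling that this sketch omits.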

  9. def reset(): Unit

    Reset ... FIX

    Definition Classes
    GMM → Classifier
  10. def size: Int

    Return the size of the feature set.

    Definition Classes
    GMM → Classifier
  11. def test(itest: IndexedSeq[Int]): Double

    Test ...

    itest

    the indices of test data

    Definition Classes
    GMM → Classifier
  12. def test(testStart: Int, testEnd: Int): Double

    Test the quality of the training with a test dataset and return the fraction of correct classifications. Can be used when the dataset is randomized so that the testing/training part of a dataset corresponds to simple slices of vectors and matrices.

    testStart

    the beginning of test region (inclusive).

    testEnd

    the end of test region (exclusive).

    Definition Classes
    Classifier
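
The "fraction of correct classifications" over a slice can be sketched as below. This is a hedged illustration rather than the library's code: `TestSliceSketch`, `accuracy`, and the `predict` callback (which maps a data index to a predicted label) are hypothetical.

```scala
object TestSliceSketch {
  /** Fraction of indices in [testStart, testEnd) whose predicted label
   *  matches the actual label y(i). */
  def accuracy(y: Array[Int], predict: Int => Int, testStart: Int, testEnd: Int): Double = {
    val correct = (testStart until testEnd).count(i => predict(i) == y(i))
    correct.toDouble / (testEnd - testStart)
  }
}
```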
  13. def train(itest: IndexedSeq[Int]): GMM

    Train the model to determine values for the parameter vectors 'mu' and 'sig2'.

    itest

    the indices of test data

    Definition Classes
    GMM → Classifier
  14. def train(): Classifier

    Train the classifier by computing the probabilities from a training dataset of data vectors and their classifications. Must be implemented in any extending class. Can be used when the whole dataset is used for training.

    Definition Classes
    Classifier
  15. def train(testStart: Int, testEnd: Int): Classifier

    Train the classifier by computing the probabilities from a training dataset of data vectors and their classifications. Must be implemented in any extending class. Can be used when the dataset is randomized so that the training part of a dataset corresponds to simple slices of vectors and matrices.

    testStart

    starting index of test region (inclusive) used in cross-validation

    testEnd

    ending index of test region (exclusive) used in cross-validation

    Definition Classes
    Classifier