trait Classifier extends AnyRef

The Classifier trait provides a common framework for several classifiers. A classifier is used when the response is bounded, that is, when it takes at most 'k' distinct values for some integer 'k'. When the number of distinct responses cannot be bounded in this way, a predictor should be used.

Linear Supertypes
AnyRef, Any

Abstract Value Members

  1. abstract def classify(z: VectoD): (Int, String, Double)

    Given a new continuous data vector 'z', determine which class it fits into, returning the best class, its name and its relative probability.

    z

    the real vector to classify

  2. abstract def classify(z: VectoI): (Int, String, Double)

    Given a new discrete data vector 'z', determine which class it fits into, returning the best class, its name and its relative probability.

    z

    the integer vector to classify
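
The triple returned by both 'classify' overloads can be illustrated with a small sketch. It is not ScalaTion code: the object 'ClassifySketch', the helper 'best' and the use of plain Scala sequences in place of 'VectoD'/'VectoI' are assumptions made for illustration, as is the choice to divide the winning score by the sum of all scores to obtain a relative probability.

```scala
object ClassifySketch {
  // Hypothetical helper (not part of ScalaTion): pick the best class from
  // non-negative per-class scores and report (class index, class name, relative probability).
  def best(scores: IndexedSeq[Double], names: IndexedSeq[String]): (Int, String, Double) = {
    require(scores.nonEmpty && scores.length == names.length, "need one name per score")
    val c = scores.indices.maxBy(scores(_))      // index of the highest score (first one on ties)
    (c, names(c), scores(c) / scores.sum)        // assumption: normalise by the total score
  }
}
```

For example, 'ClassifySketch.best(Vector(1.0, 3.0), Vector("No", "Yes"))' selects class 1 with name "Yes".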

  3. abstract def reset(): Unit

    Reset the frequency counters.

  4. abstract def size: Int

    Return the number of data vectors/points in the entire dataset (training + testing).

  5. abstract def test(itest: IndexedSeq[Int]): Double

    Test the quality of the training with a test dataset and return the fraction of correct classifications.

    itest

    the indices of the instances considered test data

  6. abstract def train(itest: IndexedSeq[Int]): Classifier

    Train the classifier by computing the probabilities from a training dataset of data vectors and their classifications. The indices for the testing dataset are given and the training dataset consists of all the other instances. Must be implemented in any extending class.

    itest

    the indices of the instances considered as testing data
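
Since the training dataset is every instance whose index is not in 'itest', the split can be sketched as a set complement. The object 'SplitSketch' and the helper 'trainIndices' are hypothetical names used only for this illustration; an implementing class may organize the split differently.

```scala
object SplitSketch {
  // Hypothetical helper (not part of ScalaTion): indices used for training are
  // all indices in 0 until size that do not appear in itest.
  def trainIndices(size: Int, itest: IndexedSeq[Int]): IndexedSeq[Int] = {
    val held = itest.toSet                 // set membership test for held-out indices
    (0 until size).filterNot(held)         // keep everything that is not held out
  }
}
```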

Concrete Value Members

  1. def crossValidate(nx: Int = 10, show: Boolean = false): Double

    Test the accuracy of the classified results by cross-validation, returning the accuracy. The "test data" starts at 'testStart' and ends at 'testEnd'; the rest of the data is "training data". FIX - should return a StatVector

    nx

    the number of crosses and cross-validations (defaults to 10x).

    show

    the show flag (show result from each iteration)
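
A minimal sketch of cross-validation over contiguous test regions follows. It assumes that each of the 'nx' folds has 'size / nx' instances and that the reported accuracy is the plain average over folds; neither detail is stated above. The names 'CrossValSketch', 'folds' and 'trainAndTest' are hypothetical, with 'trainAndTest' standing in for a call to 'train (testStart, testEnd)' followed by 'test (testStart, testEnd)'.

```scala
object CrossValSketch {
  // Test regions [testStart, testEnd) for nx folds of equal size.
  // Assumption: any leftover instances past nx * (size / nx) are never tested.
  def folds(size: Int, nx: Int): IndexedSeq[(Int, Int)] = {
    require(nx > 0 && size >= nx, "need at least one instance per fold")
    val testSize = size / nx
    (0 until nx).map(i => (i * testSize, (i + 1) * testSize))
  }

  // Average the per-fold accuracy returned by the supplied train-then-test function.
  def crossValidate(size: Int, nx: Int)(trainAndTest: (Int, Int) => Double): Double = {
    val accs = folds(size, nx).map { case (s, e) => trainAndTest(s, e) }
    accs.sum / accs.length
  }
}
```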

  2. def crossValidateRand(nx: Int = 10, show: Boolean = false): Double

    Test the accuracy of the classified results by cross-validation, returning the accuracy. This version of cross-validation relies on "subtracting" frequencies from the previously stored global data to achieve efficiency. FIX - are the comments correct? FIX - should return a StatVector

    nx

    the number of crosses and cross-validations (defaults to 10x).

    show

    the show flag (show result from each iteration)
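
The randomized variant needs test sets that are drawn at random rather than taken as contiguous slices. The sketch below shows only that index selection, under the assumption that a random permutation is cut into 'nx' groups of 'size / nx' indices; it does not model the frequency "subtracting" mentioned above. 'RandFoldSketch', 'randomFolds' and the 'seed' parameter are hypothetical names.

```scala
import scala.util.Random

object RandFoldSketch {
  // Hypothetical helper (not part of ScalaTion): shuffle the indices 0 until size
  // and cut the permutation into nx disjoint test sets of size / nx indices each.
  def randomFolds(size: Int, nx: Int, seed: Long = 0L): IndexedSeq[IndexedSeq[Int]] = {
    require(nx > 0 && size >= nx, "need at least one instance per fold")
    val perm = new Random(seed).shuffle((0 until size).toVector)
    perm.grouped(size / nx).take(nx).toVector   // drop any leftover indices
  }
}
```

Each group could then be passed as 'itest' to 'train' and 'test'.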

  3. def fit(y: VectoI, yp: VectoI, k: Int = 2): VectoD

    Return the quality of fit including 'acc', 'prec', 'recall', 'kappa'. Override to add more quality of fit measures.

    y

    the actual class labels

    yp

    the predicted class labels

    k

    the number of class labels

    See also

    ConfusionMat

    medium.com/greyatom/performance-metrics-for-classification-problems-in-machine-learning-part-i-b085d432082b
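
To make the four measures concrete, here is a sketch for the binary case only ('k' = 2). It assumes class 1 is the positive class and that 'kappa' means Cohen's kappa; the documented method works through 'ConfusionMat' and may define or order the measures differently. 'FitSketch' is a hypothetical name and plain Scala sequences stand in for 'VectoI' and 'VectoD'.

```scala
object FitSketch {
  // Hypothetical helper, not the ScalaTion 'fit' method: returns (acc, prec, recall, kappa)
  // for labels in {0, 1}; pairs with any other label are ignored.
  def fit(y: IndexedSeq[Int], yp: IndexedSeq[Int]): IndexedSeq[Double] = {
    require(y.nonEmpty && y.length == yp.length, "label vectors must be non-empty and equal in length")
    val pairs = y.zip(yp)
    def count(actual: Int, predicted: Int): Double =
      pairs.count { case (a, p) => a == actual && p == predicted }.toDouble

    val (tp, fn, fp, tn) = (count(1, 1), count(1, 0), count(0, 1), count(0, 0))
    val n      = tp + fn + fp + tn
    val acc    = (tp + tn) / n                 // fraction of correct predictions
    val prec   = tp / (tp + fp)                // NaN if class 1 is never predicted
    val recall = tp / (tp + fn)                // NaN if class 1 never occurs
    // agreement expected by chance, from the predicted and actual class totals
    val pe     = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    val kappa  = (acc - pe) / (1.0 - pe)
    Vector(acc, prec, recall, kappa)
  }
}
```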

  4. def fitLabel: Seq[String]

    Return the labels for the fit. Override when necessary.

  5. def test(testStart: Int, testEnd: Int): Double

    Test the quality of the training with a test dataset and return the fraction of correct classifications. Can be used when the dataset is randomized so that the testing/training part of a dataset corresponds to simple slices of vectors and matrices.

    testStart

    the beginning of test region (inclusive).

    testEnd

    the end of test region (exclusive).

  6. def train(): Classifier

    Train the classifier by computing the probabilities from a training dataset of data vectors and their classifications. Must be implemented in any extending class. Can be used when the whole dataset is used for training.

  7. def train(testStart: Int, testEnd: Int): Classifier

    Train the classifier by computing the probabilities from a training dataset of data vectors and their classifications. Must be implemented in any extending class. Can be used when the dataset is randomized so that the training part of a dataset corresponds to simple slices of vectors and matrices.

    testStart

    starting index of test region (inclusive) used in cross-validation

    testEnd

    ending index of test region (exclusive) used in cross-validation
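
How the slice-based 'train' and 'test' fit together can be sketched with a toy majority-class classifier. This is a stand-in that follows the documented contract (test region inclusive at 'testStart', exclusive at 'testEnd', everything else used for training); it does not extend the real trait, and 'MajoritySketch', 'MajorityClassifier' and its fields are hypothetical.

```scala
object MajoritySketch {
  // Toy classifier over class labels y in 0 until k; always predicts the
  // most frequent class seen in the training part of the data.
  class MajorityClassifier(y: IndexedSeq[Int], k: Int) {
    private val freq = Array.fill(k)(0)    // frequency counter per class
    private var bestClass = 0

    def size: Int = y.length

    def reset(): Unit = java.util.Arrays.fill(freq, 0)

    // Train on every instance outside the test region [testStart, testEnd).
    def train(testStart: Int, testEnd: Int): MajorityClassifier = {
      reset()
      for (i <- y.indices if i < testStart || i >= testEnd) freq(y(i)) += 1
      bestClass = freq.indices.maxBy(freq(_))
      this
    }

    // Fraction of correct classifications inside the test region.
    def test(testStart: Int, testEnd: Int): Double = {
      val correct = (testStart until testEnd).count(i => y(i) == bestClass)
      correct.toDouble / (testEnd - testStart)
    }
  }
}
```

Calling 'train (0, 2)' and then 'test (0, 2)' on such an object holds out the first two instances and scores the classifier on them.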