TANBayes

scalation.modeling.classifying.TANBayes
See the TANBayes companion object.
class TANBayes(x: MatrixD, y: VectorI, fname_: Array[String], k: Int, cname_: Array[String], var vc: VectorI, hparam: HyperParameter) extends Classifier, BayesClassifier, FitC

Value parameters

cname_

the names of the classes

fname_

the names of the features/variables

hparam

the hyper-parameters

k

the number of classes

vc

the value count (number of distinct values) for each feature

x

the input/data m-by-n matrix

y

the class vector, where y(i) = class for row i of matrix x
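
A minimal usage sketch (toy data; the import paths and the constructor's remaining default arguments are assumptions, following the pattern of other ScalaTion Bayes classifiers):

    import scalation.mathstat.{MatrixD, VectorI}
    import scalation.modeling.classifying.TANBayes

    // toy dataset: 6 instances, 4 binary features, 2 classes
    val x = MatrixD ((6, 4), 0, 1, 0, 1,
                             1, 0, 1, 0,
                             0, 0, 1, 1,
                             1, 1, 0, 0,
                             0, 1, 1, 0,
                             1, 0, 0, 1)
    val y = VectorI (0, 1, 0, 1, 0, 1)

    val mod = new TANBayes (x, y)            // remaining arguments assumed to take their defaults
    val (yp, qof) = mod.trainNtest ()()      // in-sample train and test (defaults to full x, y)
    println (mod.report (qof))               // basic QoF report
    mod.printCPTs ()                         // extended Conditional Probability Tables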

Attributes

Companion
object
Supertypes
trait FitC
trait FitM
trait Classifier
trait Model
class Object
trait Matchable
class Any

Members list

Type members

Inherited classlikes

case class BestStep(col: Int, qof: VectorD, mod: Classifier)

The BestStep is used to record the best improvement step found so far.

Value parameters

col

the column/variable to ADD/REMOVE for this step

mod

the model including selected features/variables for this step

qof

the Quality of Fit (QoF) for this step

Attributes

Inherited from:
Classifier
Supertypes
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any

Value members

Concrete methods

def cProb_Xpy(x_: MatrixD, y_: VectorI, nu_y: VectorD, nu_Xy: RTensorD, nu_Xpy: Array[Array[MatrixD]]): Array[Array[MatrixD]]

Compute the conditional probability of X given p and y for all xj in X, where p is the unique x-parent of feature xj.

Value parameters

nu_Xpy

the joint frequency of X, p and y for all xj in X

nu_Xy

the joint frequency of X and y for all xj in X

nu_y

the class frequency of y

x_

the training/full data/input matrix (defaults to full x)

y_

the training/full response/output vector (defaults to full y)

Attributes
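
Conceptually, each conditional probability is the joint frequency of (x_j, x_p, y) divided by the corresponding frequency of (x_p, y), possibly adjusted by the smoothing hyper-parameter; the exact smoothing scheme depends on hparam.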

def findParent(x_: MatrixD, y_: VectorI): Unit

Find the best parent xpj for each feature xj based on the CMI matrix. For the Play Tennis example, parent = VectorI (-1, 0, 1, 0).

Value parameters

x_

the training/full data/input matrix (defaults to full x)

y_

the training/full response/output vector (defaults to full y)

Attributes
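
A rough sketch of the idea (illustrative only, not the class's actual code): give the first feature no parent (-1) and, for each later feature j, pick as parent the earlier feature with the largest conditional mutual information I(x_j; x_l | y):

    // requires: import scalation.mathstat.{MatrixD, VectorI}
    // cmiMx is an assumed n-by-n CMI matrix over the features
    def sketchFindParent (cmiMx: MatrixD): VectorI =
        val n      = cmiMx.dim                             // number of features
        val parent = new VectorI (n)
        parent(0)  = -1                                    // first feature has no x-parent
        for j <- 1 until n do
            var best = 0
            for l <- 1 until j do if cmiMx(j, l) > cmiMx(j, best) then best = l
            parent(j) = best
        parent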

def freq_Xpy(x_: MatrixD, y_: VectorI): Array[Array[MatrixD]]

Compute the joint frequency of xj, xp and y for all xj in X, where p is the unique x-parent of feature xj.

Value parameters

x_

the training/full data/input matrix (defaults to full x)

y_

the training/full response/output vector (defaults to full y)

Attributes

override def lpredictI(z: VectorI): Int

Predict the integer value of y = f(z) by computing the product of the class probabilities p_y and all the conditional probabilities P(X_j = z_j | X_p = z_p, y = c) and returning the class with the highest relative probability. This method adds "positive log probabilities" (plogs) to avoid underflow. Note, p_yz from Classifier holds the relative probabilities of y given z. To recover a relative probability from a plog q, compute 2^(-q).

Value parameters

z

the new vector to predict

Attributes

Definition Classes
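
For example, under the plog convention q = -log2(p), a stored value q = 3 corresponds to a relative probability of 2^(-3) = 0.125.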
override def predictI(z: VectorI): Int

Predict the integer value of y = f(z) by computing the product of the class probabilities p_y and all the conditional probabilities P(X_j = z_j | X_p = z_p, y = c) and returning the class with the highest relative probability. Note, p_yz from Classifier holds the relative probabilities of y given z.

Value parameters

z

the new vector to predict

Attributes

Definition Classes
inline def predictI(z: VectorD): Int
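
A conceptual sketch of the scoring rule (hypothetical CPT layout, not the class's internal representation): for each class c, score(c) = P(y = c) * Π_j P(x_j = z_j | x_p = z_p, y = c), and the predicted class is the argmax over c.

    // requires: import scalation.mathstat.{MatrixD, VectorD, VectorI}
    // p_y(c)          = P(y = c)
    // cpt(j)(c)(u, v) = P(x_j = v | x_p = u, y = c)       -- hypothetical indexing
    def sketchScore (z: VectorI, parent: VectorI, p_y: VectorD,
                     cpt: Array [Array [MatrixD]]): Int =
        val score = new VectorD (p_y.dim)
        for c <- 0 until p_y.dim do
            var s = p_y(c)
            for j <- 0 until z.dim do
                val u = if parent(j) < 0 then 0 else z(parent(j))   // no parent => single slot
                s *= cpt(j)(c)(u, z(j))
            score(c) = s
        score.argmax ()                                    // class with the highest score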
def printCPTs(): Unit

Print the extended Conditional Probability Tables (eCPTs) by iterating over the features/variables.

Attributes

def printJFTs(): Unit

Print the extended Joint Frequency Tables (eJFTs) by iterating over the features/variables.

Attributes

override def summary(x_: MatrixD, fname_: Array[String], b_: VectorD, vifs: VectorD): String

Produce a QoF summary for a model with diagnostics for each predictor (x_0, x_1, ...) and the overall Quality of Fit (QoF).

Value parameters

b_

the parameters/coefficients for the model

fname_

the array of feature/variable names

vifs

the Variance Inflation Factors (VIFs)

x_

the testing/full data/input matrix

Attributes

Definition Classes
FitC -> FitM
def test(x_: MatrixD, y_: VectorI): (VectorI, VectorD)

Test the predictive model y_ = f(x_) + e and return its predictions and QoF vector. Testing may be in-sample (on the full dataset) or out-of-sample (on the testing set) as determined by the parameters passed in. Note: must call train before test.

Value parameters

x_

the testing/full data/input matrix (defaults to full x)

y_

the testing/full response/output vector (defaults to full y)

Attributes

override def train(x_: MatrixD, y_: VectorI): Unit

Train a classification model y_ = f(x_) + e where x_ is the data/input matrix and y_ is the response/output vector. These arguments default to the full dataset x and y, but may be restricted to a training dataset. Training involves estimating the model parameters or pmf: the classifier is trained by computing the probabilities for y and the conditional probabilities for each x_j.

Value parameters

x_

the training/full data/input matrix (defaults to full x)

y_

the training/full response/output vector (defaults to full y)

Attributes

Definition Classes
def vc_parent(): Unit

Determine the value counts for the parents. When a feature has no parent (indicated by -1), set its parent value count to 1.

Attributes

Inherited methods

def accuracy: Double

Compute the accuracy of the classification, i.e., the fraction of correct classifications. Note, the correct classifications tp_i are in the main diagonal of the confusion matrix.

Attributes

Inherited from:
FitC
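
Equivalently, accuracy = (Σ_i cmat(i, i)) / m, the trace of the confusion matrix divided by the number of instances.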
def backwardElim(cols: LinkedHashSet[Int], idx_q: Int, first: Int): BestStep

Perform backward elimination to find the least predictive variable to remove from the existing model, returning the variable to eliminate, the new parameter vector and the new Quality of Fit (QoF). May be called repeatedly.

Value parameters

cols

the columns of matrix x currently included in the existing model

first

first variable to consider for elimination (default (1) assume intercept x_0 will be in any model)

idx_q

index of Quality of Fit (QoF) to use for comparing quality

Attributes

See also

Fit for index of QoF measures.

Inherited from:
Classifier
def backwardElimAll(idx_q: Int, first: Int, cross: Boolean): (LinkedHashSet[Int], MatrixD)

Perform backward elimination to find the least predictive variables to remove from the full model, returning the variables left and the new Quality of Fit (QoF) measures for all steps.

Value parameters

cross

whether to include the cross-validation QoF measure

first

first variable to consider for elimination

idx_q

index of Quality of Fit (QoF) to use for comparing quality

Attributes

See also

Fit for index of QoF measures.

Inherited from:
Classifier
def buildModel(x_cols: MatrixD): Classifier

Build a sub-model that is restricted to the given columns of the data matrix. Override for models that support feature selection.

Value parameters

x_cols

the columns that the new model is restricted to

Attributes

Inherited from:
Classifier
def classify(z: VectorD): (Int, String, Double)

Given a continuous data vector z, classify it returning the class number (0, ..., k-1) with the highest relative posterior probability. Return the best class, its name and its relative probability.

Value parameters

z

the data vector to classify

Attributes

Inherited from:
Classifier
def classify(z: VectorI): (Int, String, Double)

Given a discrete data vector z, classify it returning the class number (0, ..., k-1) with the highest relative posterior probability. Return the best class, its name and its relative probability.

Value parameters

z

the data vector to classify

Attributes

Inherited from:
Classifier
def clearConfusion(): Unit

Clear the total cumulative confusion matrix.

Attributes

Inherited from:
FitC
def cmi(x: VectorI, z: VectorI, vcxz: VectorI, y: VectorI): Double

Calculate the Conditional Mutual Information (CMI) of data vectors x and z, given response/classification vector y, i.e., I(x; z | y).

Value parameters

vcxz

the vector of value counts (number of distinct values for x, z)

x

the first integer-valued data vector

y

the class vector, where y(i) = class

z

the second integer-valued data vector

Attributes

See also

en.wikipedia.org/wiki/Conditional_mutual_information

Inherited from:
BayesClassifier
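
For reference, the standard definition is I(x; z | y) = Σ_{v, w, c} p(v, w, c) log [ p(c) p(v, w, c) / ( p(v, c) p(w, c) ) ], summing over the values v of x, w of z and the classes c (log base 2 if measured in bits).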

Calculate the Conditional Mutual Information (CMI) matrix for data matrix x given response/classification vector y, i.e., I(xj; xl | y) for all pairs of features/columns xj and xl in matrix x.

Value parameters

vc

the vector of value counts (number of distinct values per feature)

x

the integer-valued data vectors stored as columns of a matrix

y

the class vector, where y(i) = class for row i of the matrix x, x(i)

Attributes

See also

en.wikipedia.org/wiki/Conditional_mutual_information

Inherited from:
BayesClassifier
def confusion(y_: VectorI, yp: VectorI): MatrixI

Compare the actual class y vector versus the predicted class yp vector, returning the confusion matrix cmat, which for k = 2 is

            yp    0     1
          -----------------
    y  0  |  tn    fp  |
       1  |  fn    tp  |
          -----------------

Note: ScalaTion's confusion matrix is Actual × Predicted, but to swap the position of actual y (rows) with predicted yp (columns) simply use cmat.transpose, the transpose of cmat.

Value parameters

y_

the actual class values/labels for full (y) or test (y_e) dataset

yp

the predicted class values/labels

Attributes

See also
Inherited from:
FitC
def contrast(y_: VectorI, yp: VectorI): Unit

Contrast the actual class y_ vector versus the predicted class yp vector.

Value parameters

y_

the actual class values/labels for full (y) or test (y_e) dataset

yp

the predicted class values/labels

Attributes

Inherited from:
FitC
def crossValidate(k: Int, rando: Boolean): Array[Statistic]

Attributes

Inherited from:
Classifier
def diagnose(y_: VectorI, yp: VectorI): VectorD

Diagnose the health of the model by computing the Quality of Fit (QoF) measures from the error/residual vector and the predicted & actual responses. Requires the actual and predicted responses to be non-negative integers. Must override when there are negative responses.

Value parameters

y_

the actual response/output vector to use (test/full)

yp

the predicted response/output vector (test/full)

Attributes

Inherited from:
FitC
override def diagnose(y_: VectorD, yp: VectorD, w: VectorD): VectorD

Diagnose the health of the model by computing the Quality of Fit (QoF) measures from the error/residual vector and the predicted & actual responses. For some models the instances may be weighted.

Value parameters

w

the weights on the instances (defaults to null)

y_

the actual response/output vector to use (test/full)

yp

the predicted response/output vector (test/full)

Attributes

Definition Classes
FitC -> FitM
Inherited from:
FitC
def f1_measure(p: Double, r: Double): Double

Compute the F1-measure, i.e., the harmonic mean of the precision and recall.

Value parameters

p

the precision

r

the recall

Attributes

Inherited from:
FitC
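
That is, F1 = 2 p r / (p + r).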
def f1v: VectorD

Compute the micro-F1-measure vector, i.e., the harmonic mean of the precision and recall.

Attributes

Inherited from:
FitC
def fit: VectorD

Return the Quality of Fit (QoF) measures corresponding to the labels given above in the fitLabel method.

Attributes

Inherited from:
FitC
def fitLabel_v: Seq[String]

Return the labels for the Quality of Fit (QoF) measures. Override to add additional QoF measures.

Attributes

Inherited from:
FitC
def fitMicroMap: Map[String, VectorD]

Return the Quality of Fit (QoF) vector micro-measures, i.e., measures for each class.

Attributes

Inherited from:
FitC
def forwardSel(cols: LinkedHashSet[Int], idx_q: Int): BestStep

Perform forward selection to find the most predictive variable to add to the existing model, returning the variable to add and the new model. May be called repeatedly.

Value parameters

cols

the columns of matrix x currently included in the existing model

idx_q

index of Quality of Fit (QoF) to use for comparing quality

Attributes

See also

Fit for index of QoF measures.

Inherited from:
Classifier
def forwardSelAll(idx_q: Int, cross: Boolean): (LinkedHashSet[Int], MatrixD)

Perform forward selection to find the most predictive variables to have in the model, returning the variables added and the new Quality of Fit (QoF) measures for all steps.

Value parameters

cross

whether to include the cross-validation QoF measure

idx_q

index of Quality of Fit (QoF) to use for comparing quality

Attributes

See also

Fit for index of QoF measures.

Inherited from:
Classifier
def getFname: Array[String]

Return the feature/variable names.

Attributes

Inherited from:
Classifier
def getX: MatrixD

Return the used data matrix x. Mainly for derived classes where x is expanded from the given columns in x_, e.g., SymbolicRegression.quadratic adds squared columns.

Attributes

Inherited from:
Classifier
def getY: VectorI

Return the used response vector y. Mainly for derived classes where y is transformed, e.g., TranRegression, Regression4TS.

Attributes

Inherited from:
Classifier
def help: String

Return the help string that describes the Quality of Fit (QoF) measures provided by the FitC class. Override to correspond to fitLabel.

Attributes

Inherited from:
FitC

Return the hyper-parameters.

Attributes

Inherited from:
Classifier
def jProbXY(x: VectorI, vcx: Int, y: VectorI): MatrixD

Compute the joint probability of x and y and return it as a matrix.

Value parameters

vcx

the value count for x (number of distinct values for x)

x

the integer-valued data vectors stored as columns of a matrix

y

the class vector, where y(i) = class

Attributes

Inherited from:
BayesClassifier
def jProbXZY(x: VectorI, z: VectorI, vcxz: VectorI, y: VectorI): RTensorD

Compute the joint probability of x, z and y and return it as a tensor.

Value parameters

vcxz

the vector of value counts (number of distinct values for x, z)

x

the first integer-valued data vector

y

the class vector, where y(i) = class

z

the second integer-valued data vector

Attributes

Inherited from:
BayesClassifier
def kappaf(y_: VectorI, yp: VectorI): Double

Compute Cohen's kappa coefficient that measures agreement between actual y and predicted yp classifications.

Value parameters

y_

the actual response/output vector to use (test/full)

yp

the predicted response/output vector (test/full)

Attributes

See also

en.wikipedia.org/wiki/Cohen%27s_kappa

Inherited from:
FitC
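
For reference, κ = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement (the accuracy) and p_e is the agreement expected by chance.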
def lclassify(z: VectorD): (Int, String, Double)

Given a continuous data vector z, classify it returning the class number (0, ..., k-1) with the highest relative posterior probability. Return the best class, its name and its relative log-probability. This method adds "positive log probabilities" (plogs) to avoid underflow. To recover a relative probability from a plog q, compute 2^(-q).

Value parameters

z

the data vector to classify

Attributes

Inherited from:
Classifier
def lclassify(z: VectorI): (Int, String, Double)

Given a discrete data vector z, classify it returning the class number (0, ..., k-1) with the highest relative posterior probability. Return the best class, its name and its relative log-probability. This method adds "positive log probabilities" (plogs) to avoid underflow. To recover a relative probability from a plog q, compute 2^(-q).

Value parameters

z

the data vector to classify

Attributes

Inherited from:
Classifier
def lpredictI(z: VectorD): Int

Attributes

Inherited from:
Classifier
def numTerms: Int

Return the number of terms/parameters in the model, e.g., b_0 + b_1 x_1 + b_2 x_2 has three terms.

Attributes

Inherited from:
Classifier
def p_r_s(): Unit

Compute the micro-precision, micro-recall and micro-specificity vectors which have elements for each class i in {0, 1, ... k-1}. Precision is the fraction classified as true that are actually true. Recall (sensitivity) is the fraction of the actually true that are classified as true. Specificity is the fraction of the actually false that are classified as false. Note, for k = 2, ordinary precision p, recall r and specificity s will correspond to the last elements in the pv, rv and sv micro vectors.

Attributes

Inherited from:
FitC

Return the analog of the parameter vector, i.e., the estimate of the response pmf.

Attributes

Inherited from:
Classifier
def predict(z: VectorD): Double

Predict the value of y = f(z) by evaluating the model equation. Single output models return Double, while multi-output models return VectorD.

Value parameters

z

the new vector to predict

Attributes

Inherited from:
Classifier

Predict the value of vector y = f(x_) using matrix x_.

Value parameters

x_

the matrix to use for making predictions, one for each row

Attributes

Inherited from:
Classifier
def pseudo_rSq: Double

Compute Efron's pseudo R-squared value. Override to use McFadden's, etc.

Value parameters

p1

the first parameter

p2

the second parameter

Attributes

Inherited from:
FitC
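
For reference, Efron's pseudo R-squared is 1 - Σ_i (y_i - p̂_i)^2 / Σ_i (y_i - ȳ)^2, where p̂_i is the predicted probability for instance i.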
def rSq0_: Double

Attributes

Inherited from:
FitM
def rSq_: Double

Return the coefficient of determination (R^2). Must call diagnose first.

Attributes

Inherited from:
FitM
override def report(ftVec: VectorD): String

Return a basic report on a trained and tested model.

Value parameters

ftVec

the vector of qof values produced by the FitC trait

Attributes

Definition Classes
Inherited from:
Classifier
def report(ftMat: MatrixD): String

Return a basic report on a trained and tested multi-variate model.

Value parameters

ftMat

the matrix of qof values produced by the Fit trait

Attributes

Inherited from:
Model

Return the vector of residuals/errors.

Attributes

Inherited from:
Classifier
def selectFeatures(tech: SelectionTech, idx_q: Int, cross: Boolean): (LinkedHashSet[Int], MatrixD)

Perform feature selection to find the most predictive variables to have in the model, returning the variables added and the new Quality of Fit (QoF) measures for all steps.

Value parameters

cross

whether to include the cross-validation QoF measure

idx_q

index of Quality of Fit (QoF) to use for comparing quality

tech

the feature selection technique to apply

Attributes

See also

Fit for index of QoF measures.

Inherited from:
Classifier
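
A hypothetical usage sketch, given a model mod as constructed earlier (the SelectionTech case name, import path and QoF index are assumptions):

    import scalation.modeling.SelectionTech              // assumed location of the enum

    val (cols, qofSteps) = mod.selectFeatures (SelectionTech.Forward, idx_q = 0, cross = false)
    println (s"selected columns = $cols")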
def sse_: Double

Return the sum of the squares for error (sse). Must call diagnose first.

Attributes

Inherited from:
FitM
def stepRegressionAll(idx_q: Int, cross: Boolean): (LinkedHashSet[Int], MatrixD)

Perform stepwise regression to find the most predictive variables to have in the model, returning the variables left and the new Quality of Fit (QoF) measures for all steps. At each step it calls forwardSel and backwardElim and takes the best of the two actions. Stops when neither action yields improvement.

Value parameters

cross

whether to include the cross-validation QoF measure

idx_q

index of Quality of Fit (QoF) to use for comparing quality

Attributes

See also

Fit for index of QoF measures.

Inherited from:
Classifier
def test(x_: MatrixD, y_: VectorD): (VectorD, VectorD)

Test/evaluate the model's Quality of Fit (QoF) and return the predictions and QoF vectors. This may include the importance of its parameters (e.g., if 0 is in a parameter's confidence interval, it is a candidate for removal from the model). Extending traits and classes should implement various diagnostics for the test and full (training + test) datasets.

Value parameters

x_

the testing/full data/input matrix (impl. classes may default to x)

y_

the testing/full response/output vector (impl. classes may default to y)

Attributes

Inherited from:
Classifier
inline def testIndices(n_test: Int, rando: Boolean): IndexedSeq[Int]

Return the indices for the test-set.

Value parameters

n_test

the size of test-set

rando

whether to select indices randomly or in blocks

Attributes

See also

scalation.mathstat.TnT_Split

Inherited from:
Classifier
def tn_fp_fn_tp(con: MatrixI): (Double, Double, Double, Double)

Return the confusion matrix for k = 2 as a tuple (tn, fp, fn, tp).

Value parameters

con

the confusion matrix (defaults to cmat)

Attributes

Inherited from:
FitC

Return a copy of the total cumulative confusion matrix tcmat and clear tcmat.

Attributes

Inherited from:
FitC
def train(x_: MatrixD, y_: VectorD): Unit

Train the model 'y_ = f(x_) + e' on a given dataset, by optimizing the model parameters in order to minimize error '||e||' or maximize log-likelihood 'll'.

Value parameters

x_

the training/full data/input matrix (impl. classes may default to x)

y_

the training/full response/output vector (impl. classes may default to y)

Attributes

Inherited from:
Classifier
def train2(x_: MatrixD, y_: VectorI): Unit

The train2 method should work like the train method, but should also optimize hyper-parameters (e.g., shrinkage or learning rate). Only implementing classes needing this capability should override this method.

Value parameters

x_

the training/full data/input matrix (defaults to full x)

y_

the training/full response/output vector (defaults to full y)

Attributes

Inherited from:
Classifier
def trainNtest(x_: MatrixD, y_: VectorI)(xx: MatrixD, yy: VectorI): (VectorI, VectorD)

Train and test the predictive model y_ = f(x_) + e and report its QoF and plot its predictions.

Value parameters

x_

the training/full data/input matrix (defaults to full x)

xx

the testing/full data/input matrix (defaults to full x)

y_

the training/full response/output vector (defaults to full y)

yy

the testing/full response/output vector (defaults to full y)

Attributes

Inherited from:
Classifier
def validate(rando: Boolean, ratio: Double)(idx: IndexedSeq[Int]): VectorD

Attributes

Inherited from:
Classifier
def vif(skip: Int): VectorD

Compute the Variance Inflation Factor (VIF) for each variable to test for multi-collinearity by regressing x_j against the rest of the variables. A VIF over 50 indicates that over 98% of the variance of x_j can be predicted from the other variables, so x_j may be a candidate for removal from the model. Note: override this method to use a superior regression technique.

Value parameters

skip

the number of columns of x at the beginning to skip in computing VIF

Attributes

Inherited from:
Classifier
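
For reference, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing x_j on the other variables; R_j^2 = 0.98 gives VIF_j = 50.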

Inherited fields

var modelConcept: URI

The optional reference to an ontological concept

Attributes

Inherited from:
Model
var modelName: String

The name for the model (or modeling technique).

Attributes

Inherited from:
Model