class RegressionTree extends PredictorMat

The RegressionTree class implements a regression tree that selects splitting features by minimizing the variance in the child nodes. To avoid an exponential number of choices during selection, only ordinal features are currently supported. Using the companion object is recommended for generating a regression tree.

Linear Supertypes
PredictorMat, Predictor, Model, Fit, Error, QoF, AnyRef, Any

Instance Constructors

  1. new RegressionTree(x: MatriD, y: VectoD, fname_: Strings = null, hparam: HyperParameter = RegressionTree.hp, curDepth: Int = -1, branchValue: Int = -1, feature: Int = -1)

    x

    the data vectors stored as rows of a matrix

    y

    the response vector

    fname_

    the names of the model's features/variables

    hparam

    the hyper-parameters for the model

    curDepth

    current depth

    branchValue

    the branchValue for the tree node

    feature

    the feature for the tree's parent node

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. def analyze(x_: MatriD = x, y_: VectoD = y, x_e: MatriD = x, y_e: VectoD = y): PredictorMat

    Analyze a dataset using this model using ordinary training with the 'train' method.

    x_

    the training/full data/input matrix

    y_

    the training/full response/output vector

    x_e

    the test/full data/input matrix

    y_e

    the test/full response/output vector

    Definition Classes
    PredictorMat → Predictor
  5. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  6. var b: VectoD
    Attributes
    protected
    Definition Classes
    PredictorMat
  7. def backwardElim(cols: Set[Int], index_q: Int = index_rSqBar, first: Int = 1): (Int, PredictorMat)

    Perform backward elimination to find the least predictive variable to remove from the existing model, returning the variable to eliminate, the new parameter vector and the new Quality of Fit (QoF). May be called repeatedly.

    cols

    the columns of matrix x currently included in the existing model

    index_q

    index of Quality of Fit (QoF) to use for comparing quality

    first

    first variable to consider for elimination (defaults to 1, assuming the intercept x_0 will be in any model)

    Definition Classes
    PredictorMat
    See also

    Fit for index of QoF measures.
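
The single elimination step described above can be sketched in plain Scala. This is only an illustration under stated assumptions: `backwardElimStep` and the `score` function are hypothetical names, `score` stands in for whichever QoF measure `index_q` selects, and a higher score is assumed to be better (as with adjusted R^2). It does not reproduce the library's implementation or its return type.

```scala
object BackwardElimSketch {

  /** One backward-elimination step (hypothetical helper, not the library method).
    * Tries removing each column j >= first from `cols`, scores the reduced model,
    * and returns the column whose removal leaves the best-scoring model together
    * with that score, or None when no column is eligible for removal.
    */
  def backwardElimStep(cols: Set[Int],
                       score: Set[Int] => Double,
                       first: Int = 1): Option[(Int, Double)] = {
    // columns below `first` (e.g., the intercept x_0) are never removed
    val candidates = cols.filter(_ >= first).toList.sorted
    if (candidates.isEmpty) None
    else {
      val scored = candidates.map(j => (j, score(cols - j)))
      // keep the removal that gives the highest score (first one wins ties)
      Some(scored.reduceLeft((a, b) => if (b._2 > a._2) b else a))
    }
  }
}
```

Calling the helper repeatedly on the shrinking column set mirrors the "may be called repeatedly" remark in the description.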

  8. def backwardElimAll(index_q: Int = index_rSqBar, first: Int = 1, cross: Boolean = true): (Set[Int], MatriD)

    Perform backward elimination to find the least predictive variables to remove from the full model, returning the variables left and the new Quality of Fit (QoF) measures for all steps.

    index_q

    index of Quality of Fit (QoF) to use for comparing quality

    first

    first variable to consider for elimination

    cross

    whether to include the cross-validation QoF measure

    Definition Classes
    PredictorMat
    See also

    Fit for index of QoF measures.

  9. def buildModel(x_cols: MatriD): RegressionTree

    Build a sub-model that is restricted to the given columns of the data matrix.

    x_cols

    the columns that the new model is restricted to

    Definition Classes
    RegressionTree → PredictorMat
  10. def buildTree(opt: (Int, Double)): Unit

    Given the next most distinguishing feature/attribute, extend the regression tree.

    opt

    the optimal feature and the variance

  11. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native() @HotSpotIntrinsicCandidate()
  12. def corrMatrix(xx: MatriD = x): MatriD

    Return the correlation matrix for the columns in data matrix 'xx'.

    xx

    the data matrix whose correlation matrix is sought

    Definition Classes
    PredictorMat → Predictor
  13. def crossValidate(k: Int = 10, rando: Boolean = true): Array[Statistic]
    Definition Classes
    PredictorMat
  14. def diagnose(e: VectoD, yy: VectoD, yp: VectoD, w: VectoD = null, ym_: Double = noDouble): Unit

    Diagnose the health of the model by computing the Quality of Fit (QoF) measures, from the error/residual vector and the predicted & actual responses. For some models the instances may be weighted.

    e

    the m-dimensional error/residual vector (yy - yp)

    yy

    the actual response/output vector to use (test/full)

    yp

    the predicted response/output vector (test/full)

    w

    the weights on the instances (defaults to null)

    ym_

    the mean of the actual response/output vector to use (training/full)

    Definition Classes
    Fit → QoF
    See also

    Regression_WLS

  15. var e: VectoD
    Attributes
    protected
    Definition Classes
    PredictorMat
  16. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  17. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  18. def eval(x_e: MatriD = x, y_e: VectoD = y): RegressionTree

    Compute the error and useful diagnostics. Requires overriding for updating the degrees of freedom.

    x_e

    the data matrix used in prediction

    y_e

    the actual response vector

    Definition Classes
    RegressionTree → PredictorMat → Model
  19. def eval(ym: Double, y_e: VectoD, yp: VectoD): PredictorMat

    Compute the error (difference between actual and predicted) and useful diagnostics for the test dataset. Requires predicted responses to be passed in.

    ym

    the mean of the training/full actual response/output vector

    y_e

    the test/full actual response/output vector

    yp

    the test/full predicted response/output vector

    Definition Classes
    PredictorMat
  20. def f_(z: Double): String

    Format a double value.

    z

    the double value to format

    Definition Classes
    QoF
  21. def fastThreshold(f: Int, x_f: VectoD, subSample: VectoI = null): Unit

    Given feature 'f', use fast threshold selection to find an optimal threshold/split point in O(N log N) time.

    f

    the given feature for which the threshold is desired

    x_f

    column f in data matrix

    subSample

    optional; used to select from the range

    See also

    people.cs.umass.edu/~domke/courses/sml/12trees.pdf
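
One common way to reach O(N log N) for a single feature is to sort the rows by feature value once and then scan the sorted order while maintaining running sums. The sketch below follows that idea in plain Scala. It is an assumption-laden illustration, not the library's code: `bestThreshold` is a hypothetical name, it returns its result instead of updating internal state (the documented method returns Unit), it uses the children's total squared deviation as the variance criterion, and it places the threshold at the midpoint between neighboring distinct values.

```scala
object FastThresholdSketch {

  /** Find the threshold for one feature that minimizes the summed squared
    * deviation (SSE) of the response within the two children.
    * Sorting costs O(N log N); the scan that follows is O(N).
    * Returns Some((threshold, childSSE)), or None if the feature is constant.
    */
  def bestThreshold(xf: Array[Double], y: Array[Double]): Option[(Double, Double)] = {
    require(xf.length == y.length, "feature column and response must have equal length")
    val n       = xf.length
    val order   = xf.indices.sortWith((a, b) => xf(a) < xf(b)).toArray  // rows sorted by feature value
    val total   = y.sum
    val totalSq = y.map(v => v * v).sum

    var leftSum = 0.0                         // running sum of y in the left child
    var leftSq  = 0.0                         // running sum of y^2 in the left child
    var best: Option[(Double, Double)] = None

    for (k <- 0 until n - 1) {
      val i = order(k)
      leftSum += y(i)
      leftSq  += y(i) * y(i)
      val lo = xf(i)
      val hi = xf(order(k + 1))
      if (lo < hi) {                          // only split between distinct feature values
        val nl       = k + 1
        val nr       = n - nl
        val rightSum = total - leftSum
        val rightSq  = totalSq - leftSq
        // SSE of a group = sum(y^2) - (sum y)^2 / count
        val sse = (leftSq - leftSum * leftSum / nl) + (rightSq - rightSum * rightSum / nr)
        if (best.forall(_._2 > sse)) best = Some(((lo + hi) / 2.0, sse))
      }
    }
    best
  }
}
```

Keeping running sums of y and y^2 is what avoids recomputing each child's variance from scratch at every candidate split.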

  22. def fit: VectoD

    Return the Quality of Fit (QoF) measures corresponding to the labels given by the 'fitLabel' method. Note: if 'sse > sst', the model introduces errors and 'rSq' may be negative; otherwise, R^2 ('rSq') ranges from 0 (weak) to 1 (strong). Override to add more quality of fit measures.

    Definition Classes
    Fit → QoF
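
The note on 'rSq' follows from the usual definition R^2 = 1 - sse / sst. The plain-Scala sketch below illustrates it; `RSqSketch.rSq` is a hypothetical helper, and it assumes 'sst' is measured around the mean of the same response vector that is being scored.

```scala
object RSqSketch {

  /** R^2 = 1 - sse / sst for actual responses y and predictions yp.
    * Assumption: sst is taken around the mean of y itself.
    */
  def rSq(y: Array[Double], yp: Array[Double]): Double = {
    require(y.length == yp.length && y.nonEmpty, "vectors must be non-empty and of equal length")
    val mean = y.sum / y.length
    val sse  = y.zip(yp).map { case (a, p) => (a - p) * (a - p) }.sum   // sum of squared errors
    val sst  = y.map(a => (a - mean) * (a - mean)).sum                  // total sum of squares
    1.0 - sse / sst
  }
}
```

Predicting the mean everywhere gives 'sse == sst' and so an R^2 of 0; any predictor that does worse than the mean gives 'sse > sst' and a negative value.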
  23. def fitLabel: Seq[String]

    Return the labels for the Quality of Fit (QoF) measures. Override to add additional QoF measures.

    Definition Classes
    Fit → QoF
  24. def fitMap: Map[String, String]

    Build a map of quality of fit measures (use of LinkedHashMap makes it ordered).

    Definition Classes
    QoF
  25. final def flaw(method: String, message: String): Unit
    Definition Classes
    Error
  26. var fname: Strings
    Attributes
    protected
    Definition Classes
    PredictorMat
  27. def forwardSel(cols: Set[Int], index_q: Int = index_rSqBar): (Int, PredictorMat)

    Perform forward selection to find the most predictive variable to add to the existing model, returning the variable to add and the new model. May be called repeatedly.

    cols

    the columns of matrix x currently included in the existing model

    index_q

    index of Quality of Fit (QoF) to use for comparing quality

    Definition Classes
    PredictorMat → Predictor
    See also

    Fit for index of QoF measures.

  28. def forwardSelAll(index_q: Int = index_rSqBar, cross: Boolean = true): (Set[Int], MatriD)

    Perform forward selection to find the most predictive variables to have in the model, returning the variables added and the new Quality of Fit (QoF) measures for all steps.

    index_q

    index of Quality of Fit (QoF) to use for comparing quality

    cross

    whether to include the cross-validation QoF measure

    Definition Classes
    PredictorMat
    See also

    Fit for index of QoF measures.

  29. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  30. def getX: MatriD

    Return the 'used' data matrix 'x'. Mainly for derived classes where 'x' is expanded from the given columns in 'x_', e.g., QuadRegression adds squared columns.

    Definition Classes
    PredictorMat → Predictor
  31. def getY: VectoD

    Return the 'used' response vector 'y'. Mainly for derived classes where 'y' is transformed, e.g., TranRegression, Regression4TS.

    Definition Classes
    PredictorMat → Predictor
  32. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  33. def help: String

    Return the help string that describes the Quality of Fit (QoF) measures provided by the Fit class. Override to correspond to 'fitLabel'.

    Definition Classes
    Fit → QoF
  34. def hparameter: HyperParameter

    Return the hyper-parameters.

    Definition Classes
    PredictorMat → Model
  35. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  36. val k: Int
    Attributes
    protected
    Definition Classes
    PredictorMat
  37. def ll(ms: Double = mse0, s2: Double = sig2e, m2: Int = m): Double

    The log-likelihood function times -2. Override as needed.

    ms

    raw Mean Squared Error

    s2

    MLE estimate of the population variance of the residuals

    Definition Classes
    Fit
    See also

    www.stat.cmu.edu/~cshalizi/mreg/15/lectures/06/lecture-06.pdf

    www.wiley.com/en-us/Introduction+to+Linear+Regression+Analysis%2C+5th+Edition-p-9780470542811 Section 2.11

  38. val m: Int
    Attributes
    protected
    Definition Classes
    PredictorMat
  39. val modelConcept: URI

    An optional reference to an ontological concept

    Definition Classes
    Model
  40. def modelName: String

    An optional name for the model (or modeling technique)

    Definition Classes
    Model
  41. def mse_: Double

    Return the mean of squares for error (sse / df._2). Must call diagnose first.

    Definition Classes
    Fit
  42. val n: Int
    Attributes
    protected
    Definition Classes
    PredictorMat
  43. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  44. def nextXY(f: Int, side: Int): (MatriD, VectoD)

    Return new 'x' matrix and 'y' vector for next step of constructing regression tree.

    f

    the feature index

    side

    indicator of which child is chosen (0 for the left child)

  45. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  46. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  47. def parameter: VectoD

    Return the vector of parameter/coefficient values.

    Definition Classes
    PredictorMat → Model
  48. def predict(z: MatriD = x): VectorD

    Given a data matrix z, predict the value for each row by following the tree to a leaf.

    z

    the data matrix to predict

    Definition Classes
    RegressionTree → PredictorMat → Predictor
  49. def predict(z: VectoD): Double

    Given a data vector z, predict the value by following the tree to the leaf.

    z

    the data vector to predict

    Definition Classes
    RegressionTree → PredictorMat → Predictor
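
Following the tree to a leaf can be sketched with a small algebraic data type. Everything in this block is hypothetical and is not the library's internal representation: the `TNode`, `Leaf` and `Split` types are invented for illustration, and sending values less than or equal to the threshold to the left child is an assumption.

```scala
object TreePredictSketch {

  // Hypothetical tree representation used only for this illustration
  sealed trait TNode
  final case class Leaf(value: Double) extends TNode
  final case class Split(feature: Int, thresh: Double, left: TNode, right: TNode) extends TNode

  /** Predict for one data vector by descending from the root to a leaf. */
  @annotation.tailrec
  def predict(node: TNode, z: Array[Double]): Double = node match {
    case Leaf(v)           => v
    case Split(f, t, l, r) => if (z(f) <= t) predict(l, z) else predict(r, z)
  }

  /** Predict for every row of a data matrix by applying the vector version row by row. */
  def predictAll(root: TNode, zs: Array[Array[Double]]): Array[Double] =
    zs.map(row => predict(root, row))
}
```

The matrix form simply applies the vector form to each row, which matches the two 'predict' overloads described above.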
  50. def predict(z: VectoI): Double

    Given a new discrete data/input vector 'z', predict the 'y'-value of 'f(z)'.

    z

    the vector to use for prediction

    Definition Classes
    Predictor
  51. def printT(nod: Node, level: Int): Unit
  52. def printTree(): Unit

    Print the regression tree in pre-order using the 'printT' method.

  53. def printTree2(): Unit

    Print out the regression tree using Breadth First Search (BFS).
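
The two print methods differ only in visiting order. The plain-Scala sketch below contrasts pre-order with breadth-first traversal on a hypothetical `TNode` type (not the library's `Node` class); it collects labels into lists rather than printing so the orders are easy to compare.

```scala
import scala.collection.mutable

object TraversalSketch {

  // Hypothetical node type for illustration only
  final case class TNode(label: String, children: List[TNode] = Nil)

  /** Pre-order: visit a node first, then each of its subtrees from left to right. */
  def preOrder(n: TNode): List[String] =
    n.label :: n.children.flatMap(preOrder)

  /** Breadth-first: visit the tree level by level using a FIFO queue. */
  def breadthFirst(root: TNode): List[String] = {
    val out   = mutable.ListBuffer.empty[String]
    val queue = mutable.Queue(root)
    while (queue.nonEmpty) {
      val n = queue.dequeue()
      out += n.label
      queue ++= n.children          // children wait behind every node already queued
    }
    out.toList
  }
}
```

Pre-order keeps each subtree together, which suits indented output by depth, while breadth-first groups nodes by level.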

  54. def report: String

    Return a basic report on the trained model.

    Definition Classes
    PredictorMat → Model
    See also

    'summary' method for more details

  55. def reset(): Unit

    Reset or re-initialize the frequency tables and the probability tables.

  56. def resetDF(df_update: PairD): Unit

    Reset the degrees of freedom to the new updated values. For some models, the degrees of freedom are not known until after the model is built.

    df_update

    the updated degrees of freedom (model, error)

    Definition Classes
    Fit
  57. def residual: VectoD

    Return the vector of residuals/errors.

    Definition Classes
    PredictorMat → Predictor
  58. def reverse(a: MatriD): MatriD

    Return a matrix that is in reverse row order of the given matrix 'a'.

    a

    the given matrix

    Definition Classes
    PredictorMat
  59. var sig2e: Double
    Attributes
    protected
    Definition Classes
    Fit
  60. def split(f: Int, thresh: Double): (Array[Int], Array[Int])

    Return the indices of the left and right children when splitting feature 'f' at threshold 'thresh'.

    f

    the feature indicator

    thresh

    threshold
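
A threshold split of this kind amounts to partitioning row indices by comparing one feature column against the threshold. The sketch below is a plain-Scala illustration under assumptions: `SplitSketch.split` is a hypothetical helper that takes the feature column directly instead of a feature index, and rows whose value equals the threshold are assumed to go to the left child.

```scala
object SplitSketch {

  /** Partition row indices into (left, right) by comparing a feature column to a threshold.
    * Assumption: values <= thresh go left, all others go right.
    */
  def split(xf: Array[Double], thresh: Double): (Array[Int], Array[Int]) =
    xf.indices.toArray.partition(i => xf(i) <= thresh)
}
```

Returning indices rather than copied rows lets the same split be applied to both the data matrix and the response vector.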

  61. def stepRegressionAll(index_q: Int = index_rSqBar, cross: Boolean = true): (Set[Int], MatriD)

    Perform stepwise regression to find the most predictive variables to have in the model, returning the variables left and the new Quality of Fit (QoF) measures for all steps. At each step it calls 'forwardSel' and 'backwardElim' and takes the best of the two actions. Stops when neither action yields improvement.

    index_q

    index of Quality of Fit (QoF) to use for comparing quality

    cross

    whether to include the cross-validation QoF measure

    Definition Classes
    PredictorMat
    See also

    Fit for index of QoF measures.

  62. def summary: String

    Compute and return summary diagnostics for the regression model.

    Definition Classes
    PredictorMat
  63. def summary(b: VectoD, stdErr: VectoD, vf: VectoD, show: Boolean = false): String

    Produce a summary report with diagnostics for each predictor 'x_j' and the overall quality of fit.

    b

    the parameters/coefficients for the model

    vf

    the Variance Inflation Factors (VIFs)

    show

    flag indicating whether to print the summary

    Definition Classes
    Fit
  64. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  65. def test(modelName: String, doPlot: Boolean = true): Unit

    Test the model on the full dataset (i.e., train and evaluate on full dataset).

    modelName

    the name of the model being tested

    doPlot

    whether to plot the actual vs. predicted response

    Definition Classes
    Predictor
  66. def toString(): String
    Definition Classes
    AnyRef → Any
  67. def train(interval: VectoI): Unit

    Train the regression tree by selecting a threshold for each feature, using only the values in 'interval' (a subsample).

    interval

    only the values in 'interval' will be used in selecting the threshold

  68. def train(x_: MatriD, y_: VectoD): RegressionTree

    Train the regression tree by selecting a threshold for each feature, using the values in 'y_' (which may hold all the samples or a subsample).

    x_

    the training/full data/input matrix

    y_

    the training/full response/output vector; only the values in 'y_' will be used in selecting the threshold

    Definition Classes
    RegressionTree → PredictorMat → Model
  69. def train2(x_: MatriD = x, y_: VectoD = y): PredictorMat

    Train a predictive model 'y_ = f(x_) + e' where 'x_' is the data/input matrix and 'y_' is the response/output vector. These arguments default to the full dataset 'x' and 'y', but may be restricted to a training dataset. Training involves estimating the model parameters 'b'. The 'train2' method should work like the 'train' method, but should also optimize hyper-parameters (e.g., shrinkage or learning rate). Only implementing classes needing this capability should implement this method.

    x_

    the training/full data/input matrix (defaults to full x)

    y_

    the training/full response/output vector (defaults to full y)

    Definition Classes
    PredictorMat
  70. def vif(skip: Int = 1): VectoD

    Compute the Variance Inflation Factor 'VIF' for each variable to test for multi-collinearity by regressing 'x_j' against the rest of the variables. A VIF over 10 indicates that over 90% of the variance of 'x_j' can be predicted from the other variables, so 'x_j' may be a candidate for removal from the model. Note: override this method to use a superior regression technique.

    skip

    the number of columns of x at the beginning to skip in computing VIF

    Definition Classes
    PredictorMat
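
The "VIF over 10 means over 90%" statement follows from the standard definition VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing 'x_j' on the other variables. The plain-Scala sketch below illustrates the relationship; both helpers are hypothetical, and `rSqSimple` covers only the special case of a single other variable, where R^2 equals the squared Pearson correlation.

```scala
object VifSketch {

  /** VIF for one variable, given the R^2 from regressing it on the other variables. */
  def vif(rSq: Double): Double = {
    require(rSq >= 0.0 && rSq < 1.0, "R^2 must lie in [0, 1)")
    1.0 / (1.0 - rSq)
  }

  /** R^2 of a simple linear regression (with intercept) of xj on one other variable xo,
    * which equals the squared Pearson correlation between the two columns.
    */
  def rSqSimple(xj: Array[Double], xo: Array[Double]): Double = {
    require(xj.length == xo.length && xj.length > 1, "columns must have equal length > 1")
    val mj  = xj.sum / xj.length
    val mo  = xo.sum / xo.length
    val sjo = xj.zip(xo).map { case (a, b) => (a - mj) * (b - mo) }.sum  // cross deviations
    val sjj = xj.map(a => (a - mj) * (a - mj)).sum
    val soo = xo.map(b => (b - mo) * (b - mo)).sum
    (sjo * sjo) / (sjj * soo)
  }
}
```

With R^2 = 0.9 the formula gives a VIF of 10, so a VIF above 10 corresponds to more than 90% of the variance of 'x_j' being explained by the other variables.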
  71. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  72. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  73. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  74. val x: MatriD
    Attributes
    protected
    Definition Classes
    PredictorMat
  75. val y: VectoD
    Attributes
    protected
    Definition Classes
    PredictorMat

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] ) @Deprecated
    Deprecated
