class ClusteringPredictor extends PredictorMat

The ClusteringPredictor class is used to predict a response value for a new vector 'z'. It works by finding the cluster that the point 'z' would belong to; the response value recorded for that cluster is then given as the predicted response. The per-cluster recorded response value is the consensus (e.g., average) of the individual predictions for 'z' from the members of the cluster. Training involves clustering the points in data matrix 'x' and then computing each cluster's response.
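
The idea described above can be sketched in plain Scala. This is a minimal illustration, not the class's implementation: it uses ordinary Scala collections instead of MatriD/VectoD, assumes the cluster labels for the training rows have already been produced by some clustering step, and uses the average as the consensus. The object and all names in it (ClusterConsensusSketch, dist2, centroid, the Map-based model) are hypothetical.

```scala
// Hypothetical stand-in for the idea behind ClusteringPredictor, in plain Scala.
// It does NOT use ScalaTion's MatriD/VectoD types or its clustering algorithms.
object ClusterConsensusSketch {

  type Point = Vector[Double]

  // Squared Euclidean distance between two points of equal dimension.
  def dist2(a: Point, b: Point): Double =
    a.zip(b).map { case (ai, bi) => (ai - bi) * (ai - bi) }.sum

  // Component-wise mean of a non-empty collection of points.
  def centroid(ps: Seq[Point]): Point =
    ps.transpose.map(col => col.sum / col.size).toVector

  // "Train": given a cluster label for each row of x (assumed to come from a
  // prior clustering step), record each cluster's centroid and average response.
  def train(x: Seq[Point], y: Seq[Double], label: Seq[Int]): Map[Int, (Point, Double)] =
    x.indices.groupBy(i => label(i)).map { case (c, idx) =>
      (c, (centroid(idx.map(i => x(i))), idx.map(i => y(i)).sum / idx.size))
    }

  // Classify z by the cluster whose centroid is nearest.
  def classify(model: Map[Int, (Point, Double)], z: Point): Int =
    model.minBy { case (_, (cen, _)) => dist2(cen, z) }._1

  // Predict the response for z as the recorded response of its cluster.
  def predict(model: Map[Int, (Point, Double)], z: Point): Double =
    model(classify(model, z))._2
}
```

With two well-separated clusters whose average responses are 3.0 and 11.0, a new point near the first cluster is predicted as 3.0 and a point near the second as 11.0.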

Linear Supertypes
PredictorMat, Predictor, Model, Fit, Error, QoF, AnyRef, Any

Instance Constructors

  1. new ClusteringPredictor(x: MatriD, y: VectoD, fname_: Strings = null, hparam: HyperParameter = ClusteringPredictor.hp)

    x

    the vectors/points of predictor data stored as rows of a matrix

    y

    the response value for each vector in x

    fname_

    the names for all features/variables

    hparam

    the hyper-parameters for the model (defaults to ClusteringPredictor.hp)

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. def analyze(x_: MatriD = x, y_: VectoD = y, x_e: MatriD = x, y_e: VectoD = y): PredictorMat

    Analyze a dataset using this model using ordinary training with the 'train' method.

    x_

    the training/full data/input matrix

    y_

    the training/full response/output vector

    x_e

    the test/full data/input matrix

    y_e

    the test/full response/output vector

    Definition Classes
    PredictorMat → Predictor
  5. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  6. var b: VectoD
    Attributes
    protected
    Definition Classes
    PredictorMat
  7. def backwardElim(cols: Set[Int], index_q: Int = index_rSqBar, first: Int = 1): (Int, PredictorMat)

    Perform backward elimination to find the least predictive variable to remove from the existing model, returning the variable to eliminate and the new model. May be called repeatedly.

    cols

    the columns of matrix x currently included in the existing model

    index_q

    index of Quality of Fit (QoF) to use for comparing quality

    first

    first variable to consider for elimination (the default of 1 assumes the intercept x_0 will be in any model)

    Definition Classes
    PredictorMat
    See also

    Fit for index of QoF measures.

  8. def backwardElimAll(index_q: Int = index_rSqBar, first: Int = 1, cross: Boolean = true): (Set[Int], MatriD)

    Perform backward elimination to find the least predictive variables to remove from the full model, returning the variables left and the new Quality of Fit (QoF) measures for all steps.

    index_q

    index of Quality of Fit (QoF) to use for comparing quality

    first

    first variable to consider for elimination

    cross

    whether to include the cross-validation QoF measure

    Definition Classes
    PredictorMat
    See also

    Fit for index of QoF measures.

  9. def buildModel(x_cols: MatriD): PredictorMat

    Build a sub-model that is restricted to the given columns of the data matrix.

    x_cols

    the columns that the new model is restricted to

    Definition Classes
    ClusteringPredictor → PredictorMat
  10. def classify(z: VectoD): Int

    Given a new point/vector 'z', classify it according to the cluster it belongs to.

    z

    the vector to classify

  11. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native() @HotSpotIntrinsicCandidate()
  12. def corrMatrix(xx: MatriD = x): MatriD

    Return the correlation matrix for the columns in data matrix 'xx'.

    xx

    the data matrix whose correlation matrix is sought

    Definition Classes
    PredictorMat → Predictor
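
What a column-wise correlation matrix contains can be sketched as follows. This is an illustration under assumptions: it uses the Pearson correlation on plain Scala sequences rather than MatriD, computes the full symmetric matrix, and does not guard against constant columns (which would give NaN). CorrMatrixSketch and its helper corr are hypothetical names.

```scala
// Hypothetical plain-Scala sketch of a column-wise Pearson correlation matrix.
object CorrMatrixSketch {

  // Pearson correlation between two equally long, non-constant columns.
  def corr(a: Seq[Double], b: Seq[Double]): Double = {
    val ma  = a.sum / a.size
    val mb  = b.sum / b.size
    val cov = a.zip(b).map { case (ai, bi) => (ai - ma) * (bi - mb) }.sum
    val va  = a.map(ai => (ai - ma) * (ai - ma)).sum
    val vb  = b.map(bi => (bi - mb) * (bi - mb)).sum
    cov / math.sqrt(va * vb)
  }

  // Correlation matrix for the columns of a row-major data matrix xx:
  // entry (j, k) is the correlation between column j and column k.
  def corrMatrix(xx: Seq[Seq[Double]]): Seq[Seq[Double]] = {
    val cols = xx.transpose
    cols.map(cj => cols.map(ck => corr(cj, ck)))
  }
}
```

The diagonal entries are 1, and perfectly proportional columns give entries of 1 or -1 depending on the sign of the relationship.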
  13. def crossValidate(k: Int = 10, rando: Boolean = true): Array[Statistic]
    Definition Classes
    PredictorMat
  14. def diagnose(e: VectoD, yy: VectoD, yp: VectoD, w: VectoD = null, ym_: Double = noDouble): Unit

    Diagnose the health of the model by computing the Quality of Fit (QoF) measures, from the error/residual vector and the predicted & actual responses. For some models the instances may be weighted.

    e

    the m-dimensional error/residual vector (yy - yp)

    yy

    the actual response/output vector to use (test/full)

    yp

    the predicted response/output vector (test/full)

    w

    the weights on the instances (defaults to null)

    ym_

    the mean of the actual response/output vector to use (training/full)

    Definition Classes
    Fit → QoF
    See also

    Regression_WLS

  15. var e: VectoD
    Attributes
    protected
    Definition Classes
    PredictorMat
  16. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  17. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  18. def eval(xx: MatriD = x, yy: VectoD = y): ClusteringPredictor

    Compute the error and useful diagnostics. Requires override to adjust degrees of freedom (df1, df2).

    xx

    the data matrix used in prediction

    yy

    the actual response vector

    Definition Classes
    ClusteringPredictor → PredictorMat → Model
  19. def eval(ym: Double, y_e: VectoD, yp: VectoD): PredictorMat

    Compute the error (difference between actual and predicted) and useful diagnostics for the test dataset. Requires predicted responses to be passed in.

    ym

    the mean of the training/full actual response/output vector

    y_e

    the test/full actual response/output vector

    yp

    the test/full predicted response/output vector

    Definition Classes
    PredictorMat
  20. def f_(z: Double): String

    Format a double value.

    z

    the double value to format

    Definition Classes
    QoF
  21. def fit: VectoD

    Return the Quality of Fit (QoF) measures corresponding to the labels given in the 'fitLabel' method. Note: if 'sse > sst', the model introduces errors and 'rSq' may be negative; otherwise, R^2 ('rSq') ranges from 0 (weak) to 1 (strong). Override to add more quality of fit measures.

    Definition Classes
    Fit → QoF
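
The remark about 'sse > sst' can be checked numerically with the textbook definition R^2 = 1 - sse / sst. The sketch below is plain Scala with hypothetical names (RSqSketch, rSq); it is not the Fit class's code, which computes many more measures.

```scala
// Hypothetical plain-Scala sketch of R^2 = 1 - sse / sst.
object RSqSketch {

  // y: actual responses, yp: predicted responses (same length, y not constant).
  def rSq(y: Seq[Double], yp: Seq[Double]): Double = {
    val ym  = y.sum / y.size
    val sst = y.map(yi => (yi - ym) * (yi - ym)).sum                        // total sum of squares
    val sse = y.zip(yp).map { case (yi, pi) => (yi - pi) * (yi - pi) }.sum  // error sum of squares
    1.0 - sse / sst
  }
}
```

Perfect predictions give 1, always predicting the mean gives 0, and predictions that are worse than the mean (sse > sst) give a negative value.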
  22. def fitLabel: Seq[String]

    Return the labels for the Quality of Fit (QoF) measures. Override to add additional QoF measures.

    Definition Classes
    Fit → QoF
  23. def fitMap: Map[String, String]

    Build a map of quality of fit measures (use of LinkedHashMap makes it ordered).

    Definition Classes
    QoF
  24. final def flaw(method: String, message: String): Unit
    Definition Classes
    Error
  25. var fname: Strings
    Attributes
    protected
    Definition Classes
    PredictorMat
  26. def forwardSel(cols: Set[Int], index_q: Int = index_rSqBar): (Int, PredictorMat)

    Perform forward selection to find the most predictive variable to add to the existing model, returning the variable to add and the new model. May be called repeatedly.

    cols

    the columns of matrix x currently included in the existing model

    index_q

    index of Quality of Fit (QoF) to use for comparing quality

    Definition Classes
    PredictorMat → Predictor
    See also

    Fit for index of QoF measures.
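
One step of greedy forward selection can be sketched as below. This is a simplified illustration, not the library method: instead of building a PredictorMat for each candidate, it takes a hypothetical callback 'score' that is assumed to fit a model on the given columns and return a QoF value where larger is better (such as adjusted R^2). The names ForwardSelSketch, nCols and score are assumptions made for the example.

```scala
// Hypothetical plain-Scala sketch of one greedy forward-selection step.
object ForwardSelSketch {

  // cols:  columns currently in the model
  // nCols: total number of columns available
  // score: assumed callback returning a QoF value (larger is better) for a column set
  // Returns the best column to add with its score, or None if no column is left.
  def forwardSel(cols: Set[Int], nCols: Int, score: Set[Int] => Double): Option[(Int, Double)] = {
    val candidates = (0 until nCols).filterNot(cols.contains)
    if (candidates.isEmpty) None
    else Some(candidates.map(j => (j, score(cols + j))).maxBy(_._2))
  }
}
```

Calling the step repeatedly, each time adding the returned column to 'cols', grows the model one variable at a time.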

  27. def forwardSelAll(index_q: Int = index_rSqBar, cross: Boolean = true): (Set[Int], MatriD)

    Perform forward selection to find the most predictive variables to have in the model, returning the variables added and the new Quality of Fit (QoF) measures for all steps.

    index_q

    index of Quality of Fit (QoF) to use for comparing quality

    cross

    whether to include the cross-validation QoF measure

    Definition Classes
    PredictorMat
    See also

    Fit for index of QoF measures.

  28. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  29. def getX: MatriD

    Return the 'used' data matrix 'x'. Mainly for derived classes where 'x' is expanded from the given columns in 'x_', e.g., QuadRegression adds squared columns.

    Definition Classes
    PredictorMat → Predictor
  30. def getY: VectoD

    Return the 'used' response vector 'y'. Mainly for derived classes where 'y' is transformed, e.g., TranRegression, Regression4TS.

    Definition Classes
    PredictorMat → Predictor
  31. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  32. def help: String

    Return the help string that describes the Quality of Fit (QoF) measures provided by the Fit class. Override to correspond to 'fitLabel'.

    Definition Classes
    Fit → QoF
  33. def hparameter: HyperParameter

    Return the hyper-parameters.

    Definition Classes
    PredictorMat → Model
  34. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  35. val k: Int
    Attributes
    protected
    Definition Classes
    PredictorMat
  36. def ll(ms: Double = mse0, s2: Double = sig2e, m2: Int = m): Double

    The log-likelihood function times -2. Override as needed.

    ms

    raw Mean Squared Error

    s2

    MLE estimate of the population variance of the residuals

    Definition Classes
    Fit
    See also

    www.stat.cmu.edu/~cshalizi/mreg/15/lectures/06/lecture-06.pdf

    www.wiley.com/en-us/Introduction+to+Linear+Regression+Analysis%2C+5th+Edition-p-9780470542811 Section 2.11
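
For reference, the standard Gaussian form of -2 times the log-likelihood can be written in terms of the same three quantities. With m residuals assumed i.i.d. N(0, s2) and ms = sse / m, the log-likelihood is -(m/2) ln(2π) - (m/2) ln(s2) - sse / (2 s2), so multiplying by -2 gives m (ln(2π) + ln(s2) + ms / s2). Whether 'll' uses exactly this expression is an assumption; the object and function names below are hypothetical.

```scala
// Hypothetical plain-Scala sketch of -2 * Gaussian log-likelihood.
object LogLikelihoodSketch {

  // ms: raw mean squared error (sse / m)
  // s2: variance of the residuals (e.g., its MLE estimate)
  // m:  number of residuals
  def minus2LL(ms: Double, s2: Double, m: Int): Double =
    m * (math.log(2.0 * math.Pi) + math.log(s2) + ms / s2)
}
```

For a fixed 'ms', the expression is smallest when s2 equals ms, which is the maximum likelihood estimate of the variance.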

  37. val m: Int
    Attributes
    protected
    Definition Classes
    PredictorMat
  38. val modelConcept: URI

    An optional reference to an ontological concept

    Definition Classes
    Model
  39. def modelName: String

    An optional name for the model (or modeling technique)

    Definition Classes
    Model
  40. def mse_: Double

    Return the mean of squares for error (sse / df._2). Must call diagnose first.

    Definition Classes
    Fit
  41. val n: Int
    Attributes
    protected
    Definition Classes
    PredictorMat
  42. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  43. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  44. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  45. def parameter: VectoD

    Return the vector of parameter/coefficient values.

    Definition Classes
    PredictorMat → Model
  46. def predict(z: VectoD): Double

    Given a new point/vector 'z', predict its response value based on the average response from its cluster.

    z

    the vector to predict

    Definition Classes
    ClusteringPredictor → PredictorMat → Predictor
  47. def predict(z: MatriD = x): VectoD

    Predict the value of 'y = f(z)' by evaluating the formula 'y = b dot z', for each row of matrix 'z'.

    z

    the new matrix to predict

    Definition Classes
    PredictorMat → Predictor
  48. def predict(z: VectoI): Double

    Given a new discrete data/input vector 'z', predict the 'y'-value of 'f(z)'.

    z

    the vector to use for prediction

    Definition Classes
    Predictor
  49. def report: String

    Return a basic report on the trained model.

    Definition Classes
    PredictorMat → Model
    See also

    'summary' method for more details

  50. def reset(): Unit

    Reset or re-initialize 'topK' and counters.

  51. def resetDF(df_update: PairD): Unit

    Reset the degrees of freedom to the new updated values. For some models, the degrees of freedom are not known until after the model is built.

    df_update

    the updated degrees of freedom (model, error)

    Definition Classes
    Fit
  52. def residual: VectoD

    Return the vector of residuals/errors.

    Definition Classes
    PredictorMat → Predictor
  53. def reverse(a: MatriD): MatriD

    Return a matrix that is in reverse row order of the given matrix 'a'.

    a

    the given matrix

    Definition Classes
    PredictorMat
  54. var sig2e: Double
    Attributes
    protected
    Definition Classes
    Fit
  55. def stepRegressionAll(index_q: Int = index_rSqBar, cross: Boolean = true): (Set[Int], MatriD)

    Perform stepwise regression to find the most predictive variables to have in the model, returning the variables left and the new Quality of Fit (QoF) measures for all steps. At each step it calls 'forwardSel' and 'backwardElim' and takes the best of the two actions. Stops when neither action yields improvement.

    index_q

    index of Quality of Fit (QoF) to use for comparing quality

    cross

    whether to include the cross-validation QoF measure

    Definition Classes
    PredictorMat
    See also

    Fit for index of QoF measures.

  56. def summary: String

    Compute and return summary diagnostics for the regression model.

    Definition Classes
    PredictorMat
  57. def summary(b: VectoD, stdErr: VectoD, vf: VectoD, show: Boolean = false): String

    Produce a summary report with diagnostics for each predictor 'x_j' and the overall quality of fit.

    b

    the parameters/coefficients for the model

    vf

    the Variance Inflation Factors (VIFs)

    show

    flag indicating whether to print the summary

    Definition Classes
    Fit
  58. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  59. def test(modelName: String, doPlot: Boolean = true): Unit

    Test the model on the full dataset (i.e., train and evaluate on full dataset).

    modelName

    the name of the model being tested

    doPlot

    whether to plot the actual vs. predicted response

    Definition Classes
    Predictor
  60. def toString(): String
    Definition Classes
    AnyRef → Any
  61. def train(xx: MatriD = x, yy: VectoD = y): ClusteringPredictor

    Training involves resetting the data structures before each prediction. It uses lazy training, so most of it is done during prediction.

    xx

    the data/input matrix

    yy

    the response/output vector

    Definition Classes
    ClusteringPredictor → PredictorMat → Model
  62. def train2(x_: MatriD = x, y_: VectoD = y): PredictorMat

    Train a predictive model 'y_ = f(x_) + e' where 'x_' is the data/input matrix and 'y_' is the response/output vector. These arguments default to the full dataset 'x' and 'y', but may be restricted to a training dataset. Training involves estimating the model parameters 'b'. The 'train2' method should work like the 'train' method, but should also optimize hyper-parameters (e.g., shrinkage or learning rate). Only implementing classes needing this capability should implement this method.

    x_

    the training/full data/input matrix (defaults to full x)

    y_

    the training/full response/output vector (defaults to full y)

    Definition Classes
    PredictorMat
  63. def vif(skip: Int = 1): VectoD

    Compute the Variance Inflation Factor 'VIF' for each variable to test for multi-collinearity by regressing 'x_j' against the rest of the variables. A VIF over 10 indicates that over 90% of the variance of 'x_j' can be predicted from the other variables, so 'x_j' may be a candidate for removal from the model. Note: override this method to use a superior regression technique.

    skip

    the number of columns of x at the beginning to skip in computing VIF

    Definition Classes
    PredictorMat
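
The link between a VIF of 10 and 90% of the variance follows from the usual definition VIF_j = 1 / (1 - R_j^2), where R_j^2 is the R^2 obtained by regressing 'x_j' on the other variables. The arithmetic is sketched below in plain Scala; the names VifSketch, vif and rSqFromVif are hypothetical, and the regression step that produces R_j^2 is assumed to be done elsewhere.

```scala
// Hypothetical plain-Scala sketch of the VIF / R^2 relationship.
object VifSketch {

  // VIF for a variable whose regression on the other variables has R^2 = rSqJ (< 1).
  def vif(rSqJ: Double): Double = 1.0 / (1.0 - rSqJ)

  // Inverse relation: share of x_j's variance explained by the other variables.
  def rSqFromVif(v: Double): Double = 1.0 - 1.0 / v
}
```

A variable that is uncorrelated with the others (R_j^2 = 0) has a VIF of 1, while R_j^2 = 0.9 corresponds to a VIF of 10.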
  64. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  65. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  66. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  67. val x: MatriD
    Attributes
    protected
    Definition Classes
    PredictorMat
  68. val y: VectoD
    Attributes
    protected
    Definition Classes
    PredictorMat

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] ) @Deprecated
    Deprecated
