scalation.analytics

RidgeRegression

class RidgeRegression extends PredictorMat

The RidgeRegression class supports multiple linear ridge regression. In this case, the predictor 'x' is multi-dimensional, [x_1, ..., x_k]. Ridge regression penalizes the squared L2 norm of the parameter vector 'b' (the penalty term is 'lambda * b dot b'), which discourages large parameter values that can make a model less robust. Both the input matrix 'x' and the response vector 'y' must be centered (zero mean). The class fits the parameter vector 'b' in the regression equation

y = b dot x + e = b_1 * x_1 + ... + b_k * x_k + e

where 'e' represents the residuals (the part not explained by the model). Penalized least squares (minimizing the sum of squared residuals plus the penalty) is used to solve for the parameter vector 'b' via the regularized Normal Equations:

b = fac.solve (.) with regularization x.t * x + λ * I
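
For reference, the regularized Normal Equations follow from setting the gradient of the penalized least-squares objective to zero. The derivation below is a standard sketch in matrix notation, with X standing for the centered matrix 'x' and I for the identity matrix of matching size; it is not taken from the ScalaTion source.

```latex
\begin{aligned}
f(b) &= (y - Xb)^\top (y - Xb) + \lambda\, b^\top b \\
\nabla f(b) &= -2 X^\top (y - Xb) + 2 \lambda b = 0 \\
(X^\top X + \lambda I)\, b &= X^\top y
\end{aligned}
```

With λ = 0 the last line reduces to the ordinary least-squares Normal Equations.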

Five factorization techniques are provided (the constructor signature below defaults to 'Cholesky'):

'QR'       // QR Factorization: slower, more stable
'Cholesky' // Cholesky Factorization: faster, less stable (reasonable choice)
'SVD'      // Singular Value Decomposition: slowest, most robust
'LU'       // LU Factorization: similar, but better than inverse
'Inverse'  // Inverse/Gaussian Elimination, classical textbook technique
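
To make the Cholesky route concrete, here is a minimal sketch that solves (x.t * x + λ * I) * b = x.t * y on plain arrays. It does not use ScalaTion's MatriD/VectoD or its factorization classes; the object RidgeSketch and all of its helper names are hypothetical and exist only for illustration.

```scala
object RidgeSketch {
  type Mat = Array[Array[Double]]
  type Vec = Array[Double]

  /** Form x.t * x + lambda * I for an m-by-n matrix x. */
  def xtxPlusLambdaI(x: Mat, lambda: Double): Mat = {
    val n = x(0).length
    Array.tabulate(n, n) { (i, j) =>
      x.map(row => row(i) * row(j)).sum + (if (i == j) lambda else 0.0)
    }
  }

  /** Form x.t * y. */
  def xty(x: Mat, y: Vec): Vec =
    Array.tabulate(x(0).length)(j => x.indices.map(i => x(i)(j) * y(i)).sum)

  /** Lower-triangular Cholesky factor l with a = l * l.t (a must be symmetric positive definite). */
  def cholesky(a: Mat): Mat = {
    val n = a.length
    val l = Array.ofDim[Double](n, n)
    for (i <- 0 until n; j <- 0 to i) {
      val s = (0 until j).map(k => l(i)(k) * l(j)(k)).sum
      l(i)(j) = if (i == j) math.sqrt(a(i)(i) - s) else (a(i)(j) - s) / l(j)(j)
    }
    l
  }

  /** Solve l * l.t * b = c by forward substitution, then backward substitution. */
  def solve(l: Mat, c: Vec): Vec = {
    val n = c.length
    val z = new Array[Double](n)
    for (i <- 0 until n)
      z(i) = (c(i) - (0 until i).map(k => l(i)(k) * z(k)).sum) / l(i)(i)
    val b = new Array[Double](n)
    for (i <- n - 1 to 0 by -1)
      b(i) = (z(i) - (i + 1 until n).map(k => l(k)(i) * b(k)).sum) / l(i)(i)
    b
  }

  /** Ridge parameter vector for centered x and y. */
  def ridge(x: Mat, y: Vec, lambda: Double): Vec =
    solve(cholesky(xtxPlusLambdaI(x, lambda)), xty(x, y))
}
```

Adding a positive λ to the diagonal keeps the matrix positive definite even when the columns of 'x' are collinear, which is one reason Cholesky is a reasonable choice for ridge regression.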

See also

statweb.stanford.edu/~tibs/ElemStatLearn/

Linear Supertypes
PredictorMat, Error, Predictor, Fit, AnyRef, Any

Instance Constructors

  1. new RidgeRegression(x: MatriD, y: VectoD, lambda_: Double = 0.1, technique: RegTechnique = Cholesky)

    x

    the centered input/design m-by-n matrix NOT augmented with a first column of ones

    y

    the centered response m-vector

    lambda_

    the shrinkage parameter (0 => OLS) in the penalty term 'lambda * b dot b'

    technique

    the technique used to solve for b in (x.t*x + lambda*I)*b = x.t*y
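
    Because the constructor expects centered inputs and no column of ones, callers need to center the data themselves. The sketch below shows one way to do that on plain arrays and how an intercept on the original scale could be recovered afterwards; the object CenterSketch and its helpers are hypothetical and not part of ScalaTion.

    ```scala
    object CenterSketch {
      def mean(v: Array[Double]): Double = v.sum / v.length

      /** Subtract the mean from every element of a vector. */
      def center(v: Array[Double]): Array[Double] = {
        val mu = mean(v)
        v.map(_ - mu)
      }

      /** Column means of an m-by-n matrix stored as an array of rows. */
      def colMeans(x: Array[Array[Double]]): Array[Double] =
        Array.tabulate(x(0).length)(j => mean(x.map(_(j))))

      /** Subtract each column's mean from that column. */
      def centerCols(x: Array[Array[Double]]): Array[Array[Double]] = {
        val mu = colMeans(x)
        x.map(row => row.zip(mu).map { case (v, m) => v - m })
      }

      /** Intercept on the original scale: mean(y) - b dot colMeans(x). */
      def intercept(b: Array[Double], xMeans: Array[Double], yMean: Double): Double =
        yMean - b.zip(xMeans).map { case (bj, mj) => bj * mj }.sum
    }
    ```

    The intercept formula follows from writing the centered model as y - mean(y) = b dot (x - mean(x)) + e and collecting the constant terms.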

Type Members

  1. type Fac_QR = Fac_QR_H[MatriD]

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. val b: VectoD
    Attributes
    protected
    Definition Classes
    Predictor
  6. def backwardElim(cols: Set[Int]): (Int, VectoD, VectoD)

    Perform backward elimination to remove the least predictive variable from the existing model, returning the variable to eliminate, the new parameter vector, and the new quality of fit. May be called repeatedly. FIX: the implementation needs updating.

    cols

    the columns of matrix x to be included in the existing model

  7. def clone(): AnyRef
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @native() @throws( ... )
  8. def coefficient: VectoD

    Return the vector of coefficient/parameter values.

    Definition Classes
    Predictor
  9. def crossVal(k: Int = 10): Unit

    Perform 'k'-fold cross-validation.

    k

    the number of folds

    Definition Classes
    RidgeRegression → PredictorMat
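
    As an illustration of what 'k'-fold cross-validation involves, the sketch below splits row indices into 'k' train/test pairs. It assigns rows to folds by index modulo 'k' without shuffling, which is an assumption made for simplicity; the method documented here may partition the rows differently. The object KFoldSketch is hypothetical.

    ```scala
    object KFoldSketch {
      /** For each fold f, return (training row indices, test row indices) over rows 0 until m. */
      def folds(m: Int, k: Int): Seq[(Seq[Int], Seq[Int])] =
        (0 until k).map { f =>
          val test  = (0 until m).filter(_ % k == f)   // rows held out in fold f
          val train = (0 until m).filter(_ % k != f)   // remaining rows used for fitting
          (train, test)
        }
    }
    ```

    Each row appears in exactly one test fold, so every observation is used once for testing and k - 1 times for training.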
  10. def crossValidate(algor: (MatriD, VectoD) ⇒ PredictorMat, k: Int = 10): Array[Statistic]
    Definition Classes
    PredictorMat
  11. val df: (Double, Double)
    Definition Classes
    Fit
  12. def diagnose(e: VectoD, w: VectoD = null, yp: VectoD = null): Unit

    Given the error/residual vector, compute the quality of fit measures.

    e

    the corresponding m-dimensional error vector (y - yp)

    w

    the weights on the instances

    yp

    the predicted response vector (x * b)

    Definition Classes
    Fit
  13. val e: VectoD
    Attributes
    protected
    Definition Classes
    Predictor
  14. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  15. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  16. def eval(xx: MatriD, yy: VectoD): Unit

    Compute the error and useful diagnostics for the test dataset.

    xx

    the test data matrix

    yy

    the test response vector

    Definition Classes
    PredictorMat → Predictor
  17. def eval(): Unit

    Compute the error and useful diagnostics for the entire dataset.

    Definition Classes
    PredictorMat → Predictor
  18. def f_(z: Double): String

    Format a double value.

    z

    the double value to format

    Definition Classes
    Fit
  19. def finalize(): Unit
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  20. def fit: VectoD

    Return the quality of fit including 'sst', 'sse', 'mse0', 'rmse', 'mae', 'rSq', 'df._2', 'rBarSq', 'fStat', 'aic', 'bic'. Note, if 'sse > sst', the model introduces errors and 'rSq' may be negative; otherwise, R^2 ('rSq') ranges from 0 (weak) to 1 (strong). Note that 'rSq' is measure number 5 (counting from 0). Override to add more quality of fit measures.

    Definition Classes
    Fit
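
    The remark about negative 'rSq' can be checked with the usual definition R^2 = 1 - sse / sst. The sketch below computes it on plain arrays; the object FitSketch is hypothetical and is not the Fit trait's own code.

    ```scala
    object FitSketch {
      /** R^2 = 1 - sse / sst for actual responses y and predictions yp. */
      def rSq(y: Array[Double], yp: Array[Double]): Double = {
        val mu  = y.sum / y.length
        val sst = y.map(v => (v - mu) * (v - mu)).sum                   // total sum of squares
        val sse = y.zip(yp).map { case (a, p) => (a - p) * (a - p) }.sum // sum of squared errors
        1.0 - sse / sst
      }
    }
    ```

    Predicting the mean of 'y' everywhere gives 'sse = sst' and an R^2 of 0, while predictions worse than the mean give 'sse > sst' and a negative value.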
  21. def fitLabel: Seq[String]

    Return the labels for the quality of fit measures. Override to add more quality of fit measures.

    Definition Classes
    Fit
  22. def fitMap: Map[String, String]

    Build a map of quality of fit measures (use of LinkedHashMap makes it ordered). Override to add more quality of fit measures.

    Definition Classes
    Fit
  23. final def flaw(method: String, message: String): Unit
    Definition Classes
    Error
  24. def forwardSel(cols: Set[Int]): (Int, VectoD, VectoD)

    Perform forward selection to add the most predictive variable to the existing model, returning the variable to add, the new parameter vector and the new quality of fit. May be called repeatedly.

    cols

    the columns of matrix x included in the existing model

  25. def gcv(yy: VectoD): Double

    Find an optimal value for the shrinkage parameter 'λ' using Generalized Cross Validation (GCV).

    yy

    the response vector
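
    A common textbook definition of the GCV score is sketched below; whether this method uses exactly this form is an assumption. Here H(λ) is the hat matrix of the ridge fit, X is the centered matrix 'x', m is the number of rows of 'x', and the preferred λ is the one that minimizes the score.

    ```latex
    H(\lambda) = X (X^\top X + \lambda I)^{-1} X^\top, \qquad
    \mathrm{GCV}(\lambda) =
      \frac{\tfrac{1}{m} \lVert y - H(\lambda)\, y \rVert^2}
           {\bigl(1 - \tfrac{1}{m} \operatorname{tr} H(\lambda)\bigr)^2}
    ```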

  26. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  27. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  28. val index_rSq: Int
    Definition Classes
    Fit
  29. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  30. val k: Int
    Attributes
    protected
    Definition Classes
    PredictorMat
  31. val m: Int
    Attributes
    protected
    Definition Classes
    PredictorMat
  32. def mse_: Double

    Return the mean of squares for error (sse / df._2). Must call diagnose first.

    Definition Classes
    Fit
  33. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  34. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  35. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  36. def predict(z: MatriD): VectoD

    Predict the value of 'y = f(z)' by evaluating the formula 'y = b dot z', for each row of matrix 'z'.

    z

    the new matrix to predict

    Definition Classes
    PredictorMat
  37. def predict(z: VectoD): Double

    Predict the value of 'y = f(z)' by evaluating the formula 'y = b dot z', e.g., '(b_0, b_1, b_2) dot (1, z_1, z_2)'.

    z

    the new vector to predict

    Definition Classes
    PredictorMat → Predictor
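
    Both the vector and matrix forms of 'predict' reduce to the dot product 'b dot z'. The sketch below shows that computation on plain arrays; the object PredictSketch and its method names are hypothetical, and it assumes 'z' has been prepared the same way as the training data.

    ```scala
    object PredictSketch {
      /** Prediction for one input vector: b dot z. */
      def predictOne(b: Array[Double], z: Array[Double]): Double = {
        require(b.length == z.length, "b and z must have the same length")
        b.zip(z).map { case (bj, zj) => bj * zj }.sum
      }

      /** Prediction for each row of a matrix z. */
      def predictAll(b: Array[Double], z: Array[Array[Double]]): Array[Double] =
        z.map(row => predictOne(b, row))
    }
    ```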
  38. def predict(z: VectoI): Double

    Given a new discrete data vector z, predict the y-value of f(z).

    z

    the vector to use for prediction

    Definition Classes
    Predictor
  39. def residual: VectoD

    Return the vector of residuals/errors.

    Definition Classes
    Predictor
  40. def sumCoeff(b: VectoD, stdErr: VectoD = null): String

    Produce the summary report portion for the coefficients.

    b

    the parameters/coefficients for the model

    Definition Classes
    Fit
  41. def summary(): Unit

    Compute diagnostics for the regression model.

    Definition Classes
    PredictorMat
  42. def summary(b: VectoD, stdErr: VectoD = null): String

    Produce a summary report with diagnostics for each predictor 'x_j' and the overall quality of fit.

    b

    the parameters/coefficients for the model

    Definition Classes
    Fit
  43. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  44. def toString(): String
    Definition Classes
    AnyRef → Any
  45. def train(yy: VectoD = y): RidgeRegression

    Train the predictor by fitting the parameter vector (b-vector) in the multiple regression equation

    yy = b dot x + e = [b_1, ... b_k] dot [x_1, ... x_k] + e

    using the least squares method.

    yy

    the response vector

    Definition Classes
    RidgeRegression → PredictorMat → Predictor
  46. def train(): PredictorMat

    Given a set of data vectors 'x's and their corresponding responses 'y's, passed into the implementing class, train the prediction function 'y = f(x)' by fitting its parameters.

    Definition Classes
    PredictorMat
  47. def vif: VectoD

    Compute the Variance Inflation Factor 'VIF' for each variable to test for multi-collinearity by regressing 'xj' against the rest of the variables. A VIF over 10 indicates that over 90% of the variance of 'xj' can be predicted from the other variables, so 'xj' is a candidate for removal from the model.
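
    The threshold of 10 follows from the usual definition VIF_j = 1 / (1 - R_j^2), where R_j^2 is the R^2 obtained by regressing 'xj' on the other variables. The one-line sketch below just evaluates that formula; the object VifSketch is hypothetical and does not perform the auxiliary regressions.

    ```scala
    object VifSketch {
      /** VIF for variable j, given the R^2 from regressing x_j on the remaining variables. */
      def vif(rSqJ: Double): Double = 1.0 / (1.0 - rSqJ)
    }
    ```

    An R_j^2 of 0.9 gives a VIF of about 10, and an R_j^2 of 0 (no collinearity) gives a VIF of 1.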

  48. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  49. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  50. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @throws( ... )
  51. val x: MatriD
    Attributes
    protected
    Definition Classes
    PredictorMat
  52. def xtx_λI(λ: Double): Unit

    Compute x.t * x + λI.

    λ

    the shrinkage parameter

  53. val y: VectoD
    Attributes
    protected
    Definition Classes
    PredictorMat
