class RidgeRegression extends PredictorMat
The RidgeRegression class supports multiple linear regression. In this
case, 'x' is multi-dimensional [x_1, ... x_k]. Both the input matrix 'x' and
the response vector 'y' are centered (zero mean). Fit the parameter vector
'b' in the regression equation
    y = b dot x + e = b_1 * x_1 + ... + b_k * x_k + e
where 'e' represents the residuals (the part not explained by the model). Use Least-Squares (minimizing the residuals) to fit the parameter vector
    b = x_pinv * y    [ alternative: b = solve (y) ]
where 'x_pinv' is the pseudo-inverse. Three techniques are provided:
    'Fac_QR'         // QR Factorization: slower, more stable
    'Fac_Cholesky'   // Cholesky Factorization: faster, less stable (reasonable choice)
    'Inverse'        // Inverse/Gaussian Elimination, classical textbook technique (outdated)
Note that the constructor signature declares 'Inverse' as the default value of 'technique'.
This version uses parallel processing to speed up execution.
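The shrinkage parameter 'lambda' enters the fit through the penalty term 'lambda * b dot b'. For reference, the ridge estimate in standard textbook notation (a reference form, not copied from the ScalaTion source) is:

```latex
\hat{b} \;=\; \arg\min_{b}\; \lVert y - X b \rVert^2 + \lambda\, b \cdot b
\;=\; (X^\top X + \lambda I)^{-1} X^\top y
```

Setting \lambda = 0 recovers the ordinary least-squares solution 'b = x_pinv * y' when X has full column rank.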
- See also
statweb.stanford.edu/~tibs/ElemStatLearn/
Linear Supertypes
- PredictorMat
- Error
- Predictor
- Fit
- AnyRef
- Any
Instance Constructors
-
new
RidgeRegression(x: MatrixD, y: VectorD, lambda: Double = 0.1, technique: RegTechnique = Inverse)
- x
the centered input/design m-by-n matrix NOT augmented with a first column of ones
- y
the centered response vector
- lambda
the shrinkage parameter (0 => OLS) in the penalty term 'lambda * b dot b'
- technique
the technique used to solve for b in (x.t*x + lambda*I)*b = x.t*y
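Both 'x' and 'y' are expected to be centered before they are passed in. The following is a minimal sketch of that preprocessing step using plain Scala arrays; 'CenterSketch', 'centerColumns' and 'centerVector' are hypothetical names for illustration and are not part of ScalaTion, which works with its own MatrixD/VectorD types.

```scala
// Illustrative only: plain Scala arrays stand in for MatrixD/VectorD.
object CenterSketch {

  /** Subtract each column's mean from a row-major matrix. */
  def centerColumns(x: Array[Array[Double]]): Array[Array[Double]] = {
    val m = x.length                      // number of rows (instances)
    val n = x(0).length                   // number of columns (variables)
    val means = Array.tabulate(n)(j => x.map(_(j)).sum / m)
    x.map(row => Array.tabulate(n)(j => row(j) - means(j)))
  }

  /** Subtract the mean from a response vector. */
  def centerVector(y: Array[Double]): Array[Double] = {
    val mean = y.sum / y.length
    y.map(_ - mean)
  }

  def main(args: Array[String]): Unit = {
    val x = Array(Array(1.0, 10.0), Array(2.0, 20.0), Array(3.0, 30.0))
    val y = Array(2.0, 4.0, 9.0)
    println(centerColumns(x).map(_.mkString(", ")).mkString("; "))
    println(centerVector(y).mkString(", "))
  }
}
```

Because the data are centered, no intercept column of ones is needed, which matches the note on 'x' above.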
Type Members
- type Fac_QR = Fac_QR_H[MatrixD]
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
val
b: VectoD
- Attributes
- protected
- Definition Classes
- Predictor
-
def
backElim(): (Int, VectoD, VectoD)
Perform backward elimination to remove the least predictive variable from the model, returning the variable to eliminate, the new parameter vector, the new R-squared value and the new F statistic.
-
def
clone(): AnyRef
- Attributes
- protected[java.lang]
- Definition Classes
- AnyRef
- Annotations
- @native() @throws( ... )
-
def
coefficient: VectoD
Return the vector of coefficient/parameter values.
- Definition Classes
- Predictor
-
def
crossVal(k: Int = 10): Unit
Perform 'k'-fold cross-validation. FIX
- k
the number of folds
- Definition Classes
- RidgeRegression → PredictorMat
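The entry above does not say how the rows are divided into folds. As a generic illustration of 'k'-fold partitioning (an assumption about the general technique, not a description of what 'crossVal' does internally), the sketch below splits row indices into contiguous folds; 'KFoldSketch' and 'folds' are hypothetical names.

```scala
// Illustrative only: contiguous folds over row indices 0 until m.
object KFoldSketch {

  /** Split indices 0 until m into k contiguous folds whose sizes differ by at most one. */
  def folds(m: Int, k: Int): Seq[Range] = {
    require(k >= 2 && k <= m, "need 2 <= k <= m")
    val base  = m / k                     // minimum fold size
    val extra = m % k                     // the first 'extra' folds get one more row
    var start = 0
    (0 until k).map { i =>
      val size = base + (if (i < extra) 1 else 0)
      val fold = start until (start + size)
      start += size
      fold
    }
  }

  def main(args: Array[String]): Unit = {
    val m = 10
    for (test <- folds(m, 3)) {
      // every row outside the test fold is used for training
      val train = (0 until m).filterNot(i => test.contains(i))
      println("test rows: " + test.mkString(",") + "  train rows: " + train.mkString(","))
    }
  }
}
```

In practice the rows are often shuffled before splitting; that step is left out here to keep the sketch short.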
-
def
crossValidate(algor: (MatriD, VectoD) ⇒ PredictorMat, k: Int = 10): Array[Statistic]
- Definition Classes
- PredictorMat
-
val
df: (Double, Double)
- Definition Classes
- Fit
-
def
diagnose(e: VectoD, w: VectoD = null, yp: VectoD = null): Unit
Given the error/residual vector, compute the quality of fit measures.
- e
the corresponding m-dimensional error vector (y - yp)
- w
the weights on the instances
- yp
the predicted response vector (x * b)
- Definition Classes
- Fit
-
val
e: VectoD
- Attributes
- protected
- Definition Classes
- Predictor
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
eval(): Unit
Compute the error and useful diagnostics.
- Definition Classes
- RidgeRegression → PredictorMat → Predictor
-
def
eval(xx: MatriD, yy: VectoD): Unit
Compute the error and useful diagnostics for the test dataset.
- xx
the test data matrix
- yy
the test response vector
- Definition Classes
- PredictorMat → Predictor
-
def
f_(z: Double): String
Format a double value.
-
def
finalize(): Unit
- Attributes
- protected[java.lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
def
fit: VectoD
Return the quality of fit including 'sst', 'sse', 'mse0', 'rmse', 'mae', 'rSq', 'df._2', 'rBarSq', 'fStat', 'aic', 'bic'. Note, if 'sse > sst', the model introduces errors and the 'rSq' may be negative; otherwise, R^2 ('rSq') ranges from 0 (weak) to 1 (strong). Note that 'rSq' is measure number 5, counting from 0 (see 'index_rSq'). Override to add more quality of fit measures.
- Definition Classes
- Fit
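The remark about a negative 'rSq' follows from the usual definition R^2 = 1 - sse / sst. The sketch below assumes that definition, with 'sst' taken about the mean of the actual values; 'FitSketch' is a hypothetical name and the code does not use ScalaTion's Fit trait.

```scala
// Illustrative only: R^2 computed from actual and predicted responses.
object FitSketch {

  /** R^2 = 1 - sse / sst, with sst measured about the mean of y. */
  def rSq(y: Array[Double], yp: Array[Double]): Double = {
    require(y.nonEmpty && y.length == yp.length, "vectors must be non-empty and equal in length")
    val mean = y.sum / y.length
    val sst  = y.map(v => (v - mean) * (v - mean)).sum               // total sum of squares
    val sse  = y.zip(yp).map { case (a, p) => (a - p) * (a - p) }.sum // sum of squared errors
    1.0 - sse / sst
  }

  def main(args: Array[String]): Unit = {
    val y = Array(1.0, 2.0, 3.0)
    println(rSq(y, Array(1.0, 2.0, 3.0)))   // perfect predictions
    println(rSq(y, Array(3.0, 2.0, 1.0)))   // predictions worse than the mean, so sse > sst
  }
}
```

The second call illustrates the 'sse > sst' case: reversed predictions give a larger error than simply predicting the mean, so the result is below zero.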
-
def
fitLabel: Seq[String]
Return the labels for the quality of fit measures. Override to add more quality of fit measures.
- Definition Classes
- Fit
-
def
fitMap: Map[String, String]
Build a map of quality of fit measures (use of LinkedHashMap makes it ordered). Override to add more quality of fit measures.
- Definition Classes
- Fit
-
final
def
flaw(method: String, message: String): Unit
- Definition Classes
- Error
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
val
index_rSq: Int
- Definition Classes
- Fit
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
val
k: Int
- Attributes
- protected
- Definition Classes
- PredictorMat
-
val
m: Int
- Attributes
- protected
- Definition Classes
- PredictorMat
-
def
mse_: Double
Return the mean of squares for error (sse / df._2). Must call diagnose first.
- Definition Classes
- Fit
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
def
predict(z: VectoD): Double
Predict the value of y = f(z) by evaluating the formula y = b dot z.
- z
the new vector to predict
- Definition Classes
- RidgeRegression → PredictorMat → Predictor
-
def
predict(z: MatriD): VectoD
Predict the value of 'y = f(z)' by evaluating the formula 'y = b dot z', for each row of matrix 'z'.
- z
the new matrix to predict
- Definition Classes
- PredictorMat
-
def
predict(z: VectoI): Double
Given a new discrete data vector z, predict the y-value of f(z).
- z
the vector to use for prediction
- Definition Classes
- Predictor
-
def
residual: VectoD
Return the vector of residuals/errors.
- Definition Classes
- Predictor
-
def
sumCoeff(b: VectoD, stdErr: VectoD = null): String
Produce the summary report portion for the coefficients.
- b
the parameters/coefficients for the model
- Definition Classes
- Fit
-
def
summary(): Unit
Compute diagnostics for the regression model.
- Definition Classes
- PredictorMat
-
def
summary(b: VectoD, stdErr: VectoD = null): String
Produce a summary report with diagnostics for each predictor 'x_j' and the overall quality of fit.
- b
the parameters/coefficients for the model
- Definition Classes
- Fit
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
def
train(yy: VectoD): RidgeRegression
Retrain the predictor by fitting the parameter vector (b-vector) in the multiple regression equation yy = b dot x + e = [b_1, ... b_k] dot [x_1, ... x_k] + e using the least squares method.
- yy
the new response vector
- Definition Classes
- RidgeRegression → PredictorMat → Predictor
-
def
train(): RidgeRegression
Train the predictor by fitting the parameter vector (b-vector) in the multiple regression equation y = b dot x + e = [b_1, ... b_k] dot [x_1, ... x_k] + e using the least squares method.
- Definition Classes
- RidgeRegression → PredictorMat
-
def
vif: VectorD
Compute the Variance Inflation Factor 'VIF' for each variable to test for multi-collinearity by regressing 'xj' against the rest of the variables. A VIF over 10 indicates that over 90% of the variance of 'xj' can be predicted from the other variables, so 'xj' is a candidate for removal from the model.
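The threshold of 10 follows from the standard definition of the VIF, where R_j^2 is the R-squared obtained by regressing 'xj' on the remaining variables (textbook form, assumed to match what 'vif' computes):

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2},
\qquad
\mathrm{VIF}_j > 10 \;\Longleftrightarrow\; R_j^2 > 0.9
```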
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @throws( ... )
-
val
x: MatriD
- Attributes
- protected
- Definition Classes
- PredictorMat
-
def
xtx: MatrixD
Compute x.t * x and add lambda to the diagonal
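To show how the regularized matrix feeds into the fit, here is a minimal sketch that builds x.t * x with 'lambda' added to the diagonal and then solves the resulting system by Gaussian elimination. It uses plain Scala arrays; 'RidgeSketch', 'xtxLambda', 'xty', 'solve' and 'fit' are hypothetical names, and the elimination routine is a stand-in for whichever technique is selected in ScalaTion, not its actual implementation.

```scala
// Illustrative only: ridge fit via (x.t * x + lambda * I) * b = x.t * y.
object RidgeSketch {
  type Mat = Array[Array[Double]]

  /** Compute x.t * x and add lambda to each diagonal entry. */
  def xtxLambda(x: Mat, lambda: Double): Mat = {
    val n = x(0).length
    Array.tabulate(n, n) { (i, j) =>
      val dot = x.map(row => row(i) * row(j)).sum
      if (i == j) dot + lambda else dot
    }
  }

  /** Compute x.t * y. */
  def xty(x: Mat, y: Array[Double]): Array[Double] =
    Array.tabulate(x(0).length)(j => x.indices.map(i => x(i)(j) * y(i)).sum)

  /** Solve a * b = c by Gaussian elimination with partial pivoting. */
  def solve(a: Mat, c: Array[Double]): Array[Double] = {
    val n = c.length
    val aug = Array.tabulate(n)(i => a(i) :+ c(i))        // augmented matrix [a | c]
    for (col <- 0 until n) {
      val piv = (col until n).maxBy(r => math.abs(aug(r)(col)))
      val tmp = aug(col); aug(col) = aug(piv); aug(piv) = tmp
      require(math.abs(aug(col)(col)) > 1e-12, "matrix is (numerically) singular")
      for (r <- col + 1 until n) {
        val f = aug(r)(col) / aug(col)(col)
        for (j <- col to n) aug(r)(j) -= f * aug(col)(j)
      }
    }
    val b = new Array[Double](n)
    for (i <- n - 1 to 0 by -1) {                         // back substitution
      val s = (i + 1 until n).map(j => aug(i)(j) * b(j)).sum
      b(i) = (aug(i)(n) - s) / aug(i)(i)
    }
    b
  }

  /** Ridge estimate for centered x and y. */
  def fit(x: Mat, y: Array[Double], lambda: Double): Array[Double] =
    solve(xtxLambda(x, lambda), xty(x, y))
}
```

With 'lambda = 0' the same routine solves the ordinary normal equations; a positive 'lambda' shrinks the coefficients and keeps the system solvable even when x.t * x is close to singular.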
-
val
y: VectoD
- Attributes
- protected
- Definition Classes
- PredictorMat