class RidgeRegression extends PredictorMat
The RidgeRegression class supports multiple linear ridge regression.
In this case, 'x' is multi-dimensional [x_1, ... x_k]. Ridge regression puts
a penalty on the L2 norm of the parameters b to reduce the chance of them taking
on large values that may lead to less robust models. Both the input matrix 'x'
and the response vector 'y' are centered (zero mean). Fit the parameter vector
'b' in the regression equation
y = b dot x + e = b_1 * x_1 + ... + b_k * x_k + e
where 'e' represents the residuals (the part not explained by the model). Use Least-Squares (minimizing the residuals) to solve for the parameter vector 'b' using the regularized Normal Equations:
b = fac.solve (.) with regularization x.t * x + λ * I
Five factorization techniques are provided:
'QR'         // QR Factorization: slower, more stable
'Cholesky'   // Cholesky Factorization: faster, less stable (reasonable choice; the constructor default)
'SVD'        // Singular Value Decomposition: slowest, most robust
'LU'         // LU Factorization: similar, but better than inverse
'Inverse'    // Inverse/Gaussian Elimination, classical textbook technique
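The regularized Normal Equations above can be illustrated with a minimal, self-contained sketch. It uses plain Scala arrays rather than the library's MatriD/VectoD types, and the names `RidgeSolveSketch`, `xtxPlusLambdaI`, `xty`, `cholesky`, `solve` and `ridge` are hypothetical helpers for illustration only. It assumes the Cholesky technique and a matrix x.t * x + λ * I that is symmetric positive definite (true for any λ > 0); it is not the class's actual implementation.

```scala
object RidgeSolveSketch {
  type Vec = Array[Double]
  type Mat = Array[Array[Double]]

  /** Form A = x.t * x + lambda * I for an m-by-n matrix x (hypothetical helper). */
  def xtxPlusLambdaI(x: Mat, lambda: Double): Mat = {
    val n = x(0).length
    Array.tabulate(n, n) { (i, j) =>
      val dot = x.map(row => row(i) * row(j)).sum
      if (i == j) dot + lambda else dot
    }
  }

  /** Form the right-hand side x.t * y. */
  def xty(x: Mat, y: Vec): Vec =
    Array.tabulate(x(0).length)(j => x.indices.map(i => x(i)(j) * y(i)).sum)

  /** Lower-triangular Cholesky factor L with A = L * L.t (A assumed symmetric positive definite). */
  def cholesky(a: Mat): Mat = {
    val n = a.length
    val l = Array.ofDim[Double](n, n)
    for (i <- 0 until n; j <- 0 to i) {
      val s = (0 until j).map(k => l(i)(k) * l(j)(k)).sum
      l(i)(j) = if (i == j) math.sqrt(a(i)(i) - s) else (a(i)(j) - s) / l(j)(j)
    }
    l
  }

  /** Solve A * b = rhs by forward then backward substitution on the Cholesky factor. */
  def solve(a: Mat, rhs: Vec): Vec = {
    val n = rhs.length
    val l = cholesky(a)
    val z = new Array[Double](n)                 // forward:  L * z = rhs
    for (i <- 0 until n)
      z(i) = (rhs(i) - (0 until i).map(k => l(i)(k) * z(k)).sum) / l(i)(i)
    val b = new Array[Double](n)                 // backward: L.t * b = z
    for (i <- n - 1 to 0 by -1)
      b(i) = (z(i) - (i + 1 until n).map(k => l(k)(i) * b(k)).sum) / l(i)(i)
    b
  }

  /** Ridge parameters b solving (x.t * x + lambda * I) * b = x.t * y. */
  def ridge(x: Mat, y: Vec, lambda: Double): Vec =
    solve(xtxPlusLambdaI(x, lambda), xty(x, y))
}
```

For the centered one-column data x = (-1, 0, 1), y = (-2, 0, 2), `ridge` gives b = 2 with λ = 0 (ordinary least squares) and b = 1 with λ = 2, which shows the shrinkage effect of the penalty.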
Instance Constructors
RidgeRegression(x: MatriD, y: VectoD, fname_: Strings = null, hparam: HyperParameter = RidgeRegression.hp, technique: RegTechnique = Cholesky)
- x
the centered input/design m-by-n matrix NOT augmented with a first column of ones
- y
the centered response m-vector
- fname_
the feature/variable names
- hparam
the shrinkage hyper-parameter, lambda (0 => OLS) in the penalty term 'lambda * b dot b'
- technique
the technique used to solve for b in (x.t*x + lambda*I)*b = x.t*y
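Because the constructor expects a centered 'x' and 'y', the data may need to be centered before it is passed in. The following is a minimal sketch of that preprocessing on plain Scala arrays; `CenteringSketch`, `centerColumns` and `centerVector` are hypothetical names and are not part of the documented API.

```scala
object CenteringSketch {
  /** Subtract each column's mean from every entry in that column. */
  def centerColumns(x: Array[Array[Double]]): Array[Array[Double]] = {
    val m     = x.length
    val means = x.transpose.map(_.sum / m)       // one mean per column
    x.map(row => row.zip(means).map { case (v, mu) => v - mu })
  }

  /** Subtract the mean of y from every element of y. */
  def centerVector(y: Array[Double]): Array[Double] = {
    val mu = y.sum / y.length
    y.map(_ - mu)
  }
}
```

After centering, every column of 'x' and the vector 'y' have zero mean, which is why no first column of ones (intercept) is needed in the design matrix.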
Type Members
- type Fac_QR = Fac_QR_H[MatriD]
Value Members
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
##(): Int
- Definition Classes
- AnyRef → Any
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
asInstanceOf[T0]: T0
- Definition Classes
- Any
b: VectoD
- Attributes
- protected
- Definition Classes
- Predictor
backwardElim(cols: Set[Int]): (Int, VectoD, VectoD)
Perform backward elimination to remove the least predictive variable from the existing model, returning the variable to eliminate, the new parameter vector and the new quality of fit. May be called repeatedly. FIX - update implementation
- cols
the columns of matrix x to be included in the existing model
clone(): AnyRef
- Attributes
- protected[java.lang]
- Definition Classes
- AnyRef
- Annotations
- @native() @throws( ... )
crossVal(k: Int = 10, rando: Boolean = true): Unit
Perform 'k'-fold cross-validation.
- k
the number of folds
- rando
whether to use randomized cross-validation
- Definition Classes
- RidgeRegression → PredictorMat
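How 'k' folds might be formed, with and without randomization, can be sketched as follows. This is an illustration of the general k-fold idea, not the library's code; `KFoldSketch`, `folds`, `trainTestSplits` and the `seed` parameter are hypothetical.

```scala
import scala.util.Random

object KFoldSketch {
  /** Split row indices 0 until m into k folds; shuffle first when rando is true.
   *  The seed parameter is an assumption added to make the sketch reproducible. */
  def folds(m: Int, k: Int = 10, rando: Boolean = true, seed: Long = 0L): Seq[Seq[Int]] = {
    val idx = if (rando) new Random(seed).shuffle((0 until m).toList) else (0 until m).toList
    (0 until k).map(f => idx.zipWithIndex.collect { case (i, pos) if pos % k == f => i })
  }

  /** For each fold, pair the training indices (all other folds) with the held-out test indices. */
  def trainTestSplits(m: Int, k: Int, rando: Boolean): Seq[(Seq[Int], Seq[Int])] = {
    val fs = folds(m, k, rando)
    fs.indices.map(f => (fs.indices.filter(_ != f).flatMap(g => fs(g)), fs(f)))
  }
}
```

Each row index appears in exactly one test fold, so every instance is used for testing once and for training k - 1 times.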
crossValidate(algor: (MatriD, VectoD) ⇒ PredictorMat, k: Int = 10, rando: Boolean = true): Array[Statistic]
- Definition Classes
- PredictorMat
diagnose(e: VectoD, w: VectoD = null, yp: VectoD = null, y_: VectoD = y): Unit
Given the error/residual vector, compute the quality of fit measures.
- e
the corresponding m-dimensional error vector (y - yp)
- w
the weights on the instances
- yp
the predicted response vector (x * b)
- Definition Classes
- Fit
e: VectoD
- Attributes
- protected
- Definition Classes
- Predictor
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
eval(xx: MatriD, yy: VectoD): Unit
Compute the error and useful diagnostics for the test dataset.
- xx
the test data matrix
- yy
the test response vector
- Definition Classes
- PredictorMat → Predictor
eval(): Unit
Compute the error and useful diagnostics for the entire dataset.
- Definition Classes
- PredictorMat → Predictor
finalize(): Unit
- Attributes
- protected[java.lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
fit: VectoD
Return the quality of fit including 'rSq', 'sst', 'sse', 'mse0', 'rmse', 'mae', 'df._2', 'rBarSq', 'fStat', 'aic', 'bic'. Note, if 'sse > sst', the model introduces errors and the 'rSq' may be negative, otherwise, R^2 ('rSq') ranges from 0 (weak) to 1 (strong). Note that 'rSq' is the number 5 measure. Override to add more quality of fit measures.
- Definition Classes
- Fit
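A few of these measures can be computed directly from the actual and predicted responses, as in the minimal sketch below. `FitSketch` and `measures` are hypothetical names, and taking 'mse0' as sse / m and 'rmse' as its square root is an assumption about the definitions, not something this page states.

```scala
object FitSketch {
  /** Compute a subset of the quality of fit measures from y and the predictions yp. */
  def measures(y: Array[Double], yp: Array[Double]): Map[String, Double] = {
    val m    = y.length
    val e    = y.zip(yp).map { case (a, p) => a - p }      // residuals y - yp
    val mean = y.sum / m
    val sst  = y.map(v => (v - mean) * (v - mean)).sum     // total sum of squares
    val sse  = e.map(v => v * v).sum                       // sum of squared errors
    Map("rSq"  -> (1.0 - sse / sst),
        "sst"  -> sst,
        "sse"  -> sse,
        "mse0" -> sse / m,                                 // assumed definition
        "rmse" -> math.sqrt(sse / m),                      // assumed definition
        "mae"  -> e.map(v => math.abs(v)).sum / m)
  }
}
```

With y = (1, 2, 3), predictions (1, 2, 4) give sse = 1, sst = 2 and rSq = 0.5, while predictions (3, 2, 1) give sse = 8 > sst and rSq = -3, the negative case mentioned above.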
fitLabel: Seq[String]
Return the labels for the quality of fit measures. Override to add more quality of fit measures.
- Definition Classes
- Fit
fitMap: Map[String, String]
Build a map of quality of fit measures (use of LinkedHashMap makes it ordered). Override to add more quality of fit measures.
- Definition Classes
- Fit
flaw(method: String, message: String): Unit
- Definition Classes
- Error
fname: Strings
- Attributes
- protected
- Definition Classes
- PredictorMat
forwardSel(cols: Set[Int]): (Int, VectoD, VectoD)
Perform forward selection to add the most predictive variable to the existing model, returning the variable to add, the new parameter vector and the new quality of fit. May be called repeatedly.
- cols
the columns of matrix x included in the existing model
gcv(yy: VectoD): Double
Find an optimal value for the shrinkage parameter 'λ' using Generalized Cross Validation (GCV).
- yy
the response vector
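For reference, the standard textbook form of the GCV criterion for ridge regression is given below, with m the number of instances and H_λ the hat matrix. This is the usual definition and is offered as background only; the exact form minimized by this method is not stated on this page and may differ.

```latex
H_\lambda = X \,(X^\top X + \lambda I)^{-1} X^\top, \qquad
\mathrm{GCV}(\lambda) =
  \frac{\tfrac{1}{m}\,\lVert (I - H_\lambda)\, y \rVert^2}
       {\left[\, \tfrac{1}{m}\,\mathrm{tr}(I - H_\lambda) \,\right]^2}
```

The optimal shrinkage parameter is then the λ that minimizes GCV(λ).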
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
hparameter: HyperParameter
Return the hyper-parameters.
- Definition Classes
- PredictorMat
index_rSq: Int
- Definition Classes
- Fit
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
k: Int
- Attributes
- protected
- Definition Classes
- PredictorMat
m: Int
- Attributes
- protected
- Definition Classes
- PredictorMat
mse_: Double
Return the mean of squares for error (sse / df._2). Must call diagnose first.
- Definition Classes
- Fit
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
parameter: VectoD
Return the vector of parameter/coefficient values.
- Definition Classes
- Predictor
predict(z: MatriD = x): VectoD
Predict the value of 'y = f(z)' by evaluating the formula 'y = b dot z', for each row of matrix 'z'.
- z
the new matrix to predict
- Definition Classes
- PredictorMat
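The formula 'y = b dot z' applied to each row of 'z' amounts to one dot product per row. A minimal sketch on plain arrays follows; `PredictSketch`, `predictRow` and `predictAll` are hypothetical names, not the library's methods.

```scala
object PredictSketch {
  /** Predict a single response as the dot product of b and z. */
  def predictRow(b: Array[Double], z: Array[Double]): Double =
    b.zip(z).map { case (bj, zj) => bj * zj }.sum

  /** Predict one response per row of the matrix z. */
  def predictAll(b: Array[Double], z: Array[Array[Double]]): Array[Double] =
    z.map(row => predictRow(b, row))
}
```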
predict(z: VectoD): Double
Predict the value of 'y = f(z)' by evaluating the formula 'y = b dot z', e.g., '(b_0, b_1, b_2) dot (1, z_1, z_2)'.
- z
the new vector to predict
- Definition Classes
- PredictorMat → Predictor
predict(z: VectoI): Double
Given a new discrete data vector z, predict the y-value of f(z).
- z
the vector to use for prediction
- Definition Classes
- Predictor
resetDF(df_update: (Double, Double)): Unit
Reset the degrees of freedom to the new updated values. For some models, the degrees of freedom is not known until after the model is built.
- df_update
the updated degrees of freedom
- Definition Classes
- Fit
residual: VectoD
Return the vector of residuals/errors.
- Definition Classes
- Predictor
sumCoeff(b: VectoD, stdErr: VectoD = null): String
Produce the summary report portion for the coefficients.
- b
the parameters/coefficients for the model
- Definition Classes
- Fit
summary(): String
Compute and return summary diagnostics for the regression model.
- Definition Classes
- PredictorMat
summary(b: VectoD, stdErr: VectoD = null, show: Boolean = false): String
Produce a summary report with diagnostics for each predictor 'x_j' and the overall quality of fit.
- b
the parameters/coefficients for the model
- show
flag indicating whether to print the summary
- Definition Classes
- Fit
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
toString(): String
- Definition Classes
- AnyRef → Any
train(yy: VectoD = y): RidgeRegression
Train the predictor by fitting the parameter vector (b-vector) in the multiple regression equation
yy = b dot x + e = [b_1, ... b_k] dot [x_1, ... x_k] + e
using the least squares method.
- yy
the response vector
- Definition Classes
- RidgeRegression → PredictorMat → Predictor
train(): PredictorMat
Given a set of data vectors 'x's and their corresponding responses 'y's, passed into the implementing class, train the prediction function 'y = f(x)' by fitting its parameters.
- Definition Classes
- PredictorMat
train2(yy: VectoD = y): PredictorMat
- Definition Classes
- PredictorMat
vif: VectoD
Compute the Variance Inflation Factor 'VIF' for each variable to test for multi-collinearity by regressing 'xj' against the rest of the variables. A VIF over 10 indicates that over 90% of the variance of 'xj' can be predicted from the other variables, so 'xj' is a candidate for removal from the model.
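The link between the threshold of 10 and the 90% figure follows from the standard definition of the VIF, where R_j^2 is the coefficient of determination obtained by regressing 'xj' on the other variables. The worked step below uses that standard definition as background.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}, \qquad
\mathrm{VIF}_j = 10 \;\Longleftrightarrow\; R_j^2 = 1 - \tfrac{1}{10} = 0.9
```

Since VIF_j increases with R_j^2, a VIF above 10 corresponds to R_j^2 above 0.9.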
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @throws( ... )
x: MatriD
- Attributes
- protected
- Definition Classes
- PredictorMat
xtx_λI(λ: Double): Unit
Compute x.t * x + λI.
- λ
the shrinkage parameter
y: VectoD
- Attributes
- protected
- Definition Classes
- PredictorMat