the centered input/design m-by-n matrix NOT augmented with a first column of ones
the centered response vector
the shrinkage parameter (0 => OLS) in the penalty term 'lambda * b dot b'
the technique used to solve for b in x.t*x*b = x.t*y
Perform backward elimination to remove the least predictive variable from the model, returning the variable to eliminate, the new parameter vector, the new R-squared value and the new F statistic.
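As a rough illustration of the backward-elimination step described above, the sketch below drops each variable in turn, refits, and keeps the removal that leaves the highest R-squared. The function names (`solve`, `fit`, `backward_elim`) are hypothetical and not part of the class's API; the data are assumed to be centered already, and the F statistic is left out for brevity.

```python
def solve(a, rhs):
    """Solve a * b = rhs by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [r] for row, r in zip(a, rhs)]      # augmented copy
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]                       # move pivot row up
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    b = [0.0] * n
    for i in reversed(range(n)):                      # back substitution
        s = sum(m[i][k] * b[k] for k in range(i + 1, n))
        b[i] = (m[i][n] - s) / m[i][i]
    return b

def fit(x, y, lam=0.0):
    """Return (b, r_squared) for centered x and y with shrinkage lam."""
    n = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) + (lam if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(n)]
    b = solve(xtx, xty)
    sse = sum((yi - sum(bj * v for bj, v in zip(b, r))) ** 2
              for r, yi in zip(x, y))
    sst = sum(yi * yi for yi in y)                    # y is centered
    return b, 1.0 - sse / sst

def backward_elim(x, y, lam=0.0):
    """Return (j, b, r_squared) for the variable j whose removal hurts least."""
    best = None
    for j in range(len(x[0])):
        reduced = [[v for k, v in enumerate(r) if k != j] for r in x]
        b, r2 = fit(reduced, y, lam)
        if best is None or r2 > best[2]:
            best = (j, b, r2)
    return best

# Toy centered data: the third column contributes very little to y.
x = [[-2.0, 1.0, 1.0], [-1.0, -2.0, 0.0], [0.0, 2.0, -2.0],
     [1.0, -2.0, 0.0], [2.0, 1.0, 1.0]]
y = [3 * r[0] + 2 * r[1] + 0.1 * r[2] for r in x]
j, b, r2 = backward_elim(x, y)
```

Here `j` is the index of the column to eliminate, and `b` and `r2` describe the reduced model.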
Return the fit (parameter vector b, quality of fit including rSquared).
Show the flaw by printing the error message.
the method where the error occurred
the error message
Predict the value of y = f(z) by evaluating the formula y = b dot z for each row of matrix z.
the new matrix to predict
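The row-wise prediction y = b dot z can be sketched as follows. The function name `predict` and the sample values are illustrative assumptions. Because the model is fitted on centered data, the results are on the centered scale, so a caller would typically add the original mean of y back.

```python
def predict(b, z):
    """Evaluate y_i = b dot z_i for each row z_i of matrix z."""
    return [sum(bj * zj for bj, zj in zip(b, row)) for row in z]

b = [2.0, 3.0]                       # hypothetical fitted parameters
z = [[1.0, 0.0], [1.0, 1.0]]         # two new (centered) rows
yp = predict(b, z)
```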
Predict the value of y = f(z) by evaluating the formula y = b dot z.
the new vector to predict
Given a new discrete data vector z, predict the y-value of f(z).
the vector to use for prediction
Retrain the predictor by fitting the parameter vector (b-vector) in the multiple regression equation yy = b dot x + e = [b_1, ... b_k] dot [x_1, ... x_k] + e using the least squares method.
the new response vector
Train the predictor by fitting the parameter vector (b-vector) in the multiple regression equation y = b dot x + e = [b_1, ... b_k] dot [x_1, ... x_k] + e using the least squares method.
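A minimal sketch of the training step, assuming centered inputs and solving the penalized normal equations (x.t*x + lambda*I) * b = x.t*y by plain Gaussian elimination. The names `solve` and `ridge_train` are hypothetical. This is not necessarily how the class solves the system internally.

```python
def solve(a, rhs):
    """Solve a * b = rhs by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [r] for row, r in zip(a, rhs)]      # augmented copy
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    b = [0.0] * n
    for i in reversed(range(n)):                      # back substitution
        s = sum(m[i][k] * b[k] for k in range(i + 1, n))
        b[i] = (m[i][n] - s) / m[i][i]
    return b

def ridge_train(x, y, lam=0.0):
    """Fit b for centered x (m-by-n, no column of ones) and centered y."""
    n = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) + (lam if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(n)]
    return solve(xtx, xty)

x = [[-1.0, -1.0], [0.0, 1.0], [1.0, 0.0]]   # each column has zero mean
y = [-5.0, 3.0, 2.0]                          # y = 2*x_1 + 3*x_2, zero mean
b_ols = ridge_train(x, y)                     # lam = 0 reduces to OLS
b_ridge = ridge_train(x, y, lam=1.0)          # shrunk toward zero
```

With `lam = 0` the fit recovers the coefficients used to build `y`; a positive `lam` gives a parameter vector of smaller norm.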
Compute the Variance Inflation Factor (VIF) for each variable to test for multicollinearity by regressing xj against the rest of the variables. A VIF over 10 indicates that over 90% of the variance of xj can be predicted from the other variables, so xj is a candidate for removal from the model.
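The VIF calculation can be sketched as below, using VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column xj on the remaining columns. The helper names are hypothetical, the columns are assumed to be centered, and ordinary least squares is used for the auxiliary regressions.

```python
def solve(a, rhs):
    """Solve a * b = rhs by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    b = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * b[k] for k in range(i + 1, n))
        b[i] = (m[i][n] - s) / m[i][i]
    return b

def vif(x):
    """Return the VIF of every column of the centered matrix x."""
    n = len(x[0])
    out = []
    for j in range(n):
        xj = [r[j] for r in x]
        rest = [[v for k, v in enumerate(r) if k != j] for r in x]
        q = n - 1
        xtx = [[sum(r[i] * r[k] for r in rest) for k in range(q)]
               for i in range(q)]
        xty = [sum(r[i] * t for r, t in zip(rest, xj)) for i in range(q)]
        b = solve(xtx, xty)
        sse = sum((t - sum(bi * v for bi, v in zip(b, r))) ** 2
                  for r, t in zip(rest, xj))
        sst = sum(t * t for t in xj)                  # xj is centered
        # 1 / (1 - R^2) equals sst / sse; a perfect fit gives infinity.
        out.append(float("inf") if sse == 0.0 else sst / sse)
    return out

orthogonal = [[-1.0, 1.0], [0.0, -2.0], [1.0, 1.0]]   # uncorrelated columns
correlated = [[-1.0, -1.0], [0.0, 1.0], [1.0, 0.0]]   # mildly correlated
```

Uncorrelated columns give a VIF of 1, and the value grows as a column becomes more predictable from the others.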
Compute x.t * x and add lambda to the diagonal
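A small sketch of this step: form x.t * x and add the shrinkage parameter to each diagonal entry. The function name `ridge_gram` and the sample matrix are illustrative assumptions.

```python
def ridge_gram(x, lam):
    """Return x^T x with lam added to every diagonal entry."""
    n = len(x[0])                                     # number of predictors
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(n)]
           for i in range(n)]
    for i in range(n):
        xtx[i][i] += lam                              # ridge penalty term
    return xtx

x = [[-1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]            # centered columns
g = ridge_gram(x, 0.5)
```

For any positive `lam` the resulting matrix is symmetric positive definite, which is what makes a Cholesky factorization applicable.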
The RidgeRegression class supports multiple linear regression. In this case, 'x' is multi-dimensional [x_1, ... x_k]. Both the input matrix 'x' and the response vector 'y' are centered (zero mean). Fit the parameter vector 'b' in the regression equation
y = b dot x + e = b_1 * x_1 + ... + b_k * x_k + e
where 'e' represents the residuals (the part not explained by the model). Use Least-Squares (minimizing the residuals) to fit the parameter vector
b = x_pinv * y [ alternative: b = solve (y) ]
where 'x_pinv' is the pseudo-inverse. Three techniques are provided:
Fac_QR        // QR Factorization: slower, more stable (default)
Fac_Cholesky  // Cholesky Factorization: faster, less stable (reasonable choice)
Inverse       // Inverse/Gaussian Elimination, classical textbook technique (outdated)
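To make one of the listed techniques concrete, here is a hedged sketch of a Cholesky-based solve of a * b = rhs, where `a` stands for x.t*x with lambda on the diagonal and `rhs` for x.t*y. The function names are hypothetical and the code is a textbook version, not the class's own implementation.

```python
import math

def cholesky(a):
    """Return lower-triangular l with l * l^T = a (a symmetric positive definite)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)      # diagonal entry
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]     # below-diagonal entry
    return l

def cholesky_solve(a, rhs):
    """Solve a * b = rhs using forward then backward substitution."""
    l = cholesky(a)
    n = len(a)
    w = [0.0] * n
    for i in range(n):                                # l * w = rhs
        w[i] = (rhs[i] - sum(l[i][k] * w[k] for k in range(i))) / l[i][i]
    b = [0.0] * n
    for i in reversed(range(n)):                      # l^T * b = w
        s = sum(l[k][i] * b[k] for k in range(i + 1, n))
        b[i] = (w[i] - s) / l[i][i]
    return b

a = [[3.0, 1.0], [1.0, 3.0]]      # e.g. x^T x = [[2, 1], [1, 2]] plus lambda = 1
rhs = [7.0, 8.0]                  # e.g. x^T y
b = cholesky_solve(a, rhs)
```

The factorization requires a symmetric positive definite matrix, which holds for x.t*x plus a positive lambda on the diagonal.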
This version uses parallel processing to speed up execution.
See http://statweb.stanford.edu/~tibs/ElemStatLearn/