class Regression[MatT <: MatriD, VecT <: VectoD] extends Predictor with Error

The Regression class supports multiple linear regression. In this case, 'x' is multi-dimensional [1, x_1, ..., x_k]. Fit the parameter vector 'b' in the regression equation

y = b dot x + e = b_0 + b_1 * x_1 + ... + b_k * x_k + e

where 'e' represents the residuals (the part not explained by the model). Use Least-Squares (minimizing the sum of squared residuals) to fit the parameter vector

b = fac.solve (.)
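For reference, the least-squares criterion and the normal equations it leads to can be written out as follows. This is the standard textbook derivation, with X standing for the data matrix 'x' and X^T for 'x.t'. Which system 'fac.solve' actually receives depends on the chosen technique and is not stated here.

```latex
\min_{b} \; \lVert e \rVert^2 = \lVert y - X b \rVert^2
\quad\Longrightarrow\quad
\nabla_b \lVert y - X b \rVert^2 = -2 X^{\top} (y - X b) = 0
\quad\Longrightarrow\quad
X^{\top} X \, b = X^{\top} y
```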

Five solution techniques are provided (four matrix factorizations plus direct inversion):

'QR'       // QR Factorization: slower, more stable (default)
'Cholesky' // Cholesky Factorization: faster, less stable (reasonable choice)
'SVD'      // Singular Value Decomposition: slowest, most robust
'LU'       // LU Factorization: better than Inverse
'Inverse'  // Inverse/Gaussian Elimination, classical textbook technique
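To make the 'Cholesky' option concrete, the following is a minimal standalone sketch of solving the normal equations x.t*x*b = x.t*y with a Cholesky factorization. It uses plain Scala arrays rather than MatriD/VectoD, and the object and method names (NormalEquationsSketch, gram, xty, cholesky, solve, fit) are hypothetical. It is not the library's implementation.

```scala
// Illustrative sketch only: plain arrays, hypothetical names, not the library's code.
object NormalEquationsSketch {

  type Matrix = Array[Array[Double]]
  type Vec    = Array[Double]

  /** Compute x.t * x (n-by-n) for an m-by-n matrix x. */
  def gram(x: Matrix): Matrix = {
    val n = x(0).length
    Array.tabulate(n, n)((i, j) => x.map(row => row(i) * row(j)).sum)
  }

  /** Compute x.t * y (an n-vector). */
  def xty(x: Matrix, y: Vec): Vec = {
    val n = x(0).length
    Array.tabulate(n)(j => x.indices.map(i => x(i)(j) * y(i)).sum)
  }

  /** Lower-triangular l with a = l * l.t; a must be symmetric positive definite. */
  def cholesky(a: Matrix): Matrix = {
    val n = a.length
    val l = Array.ofDim[Double](n, n)
    for (i <- 0 until n; j <- 0 to i) {
      val s = (0 until j).map(k => l(i)(k) * l(j)(k)).sum
      l(i)(j) = if (i == j) math.sqrt(a(i)(i) - s) else (a(i)(j) - s) / l(j)(j)
    }
    l
  }

  /** Solve l * l.t * b = c by forward substitution, then backward substitution. */
  def solve(l: Matrix, c: Vec): Vec = {
    val n = c.length
    val w = new Array[Double](n)
    for (i <- 0 until n)
      w(i) = (c(i) - (0 until i).map(k => l(i)(k) * w(k)).sum) / l(i)(i)
    val b = new Array[Double](n)
    for (i <- n - 1 to 0 by -1)
      b(i) = (w(i) - (i + 1 until n).map(k => l(k)(i) * b(k)).sum) / l(i)(i)
    b
  }

  /** Fit b in y = x * b + e via the normal equations. */
  def fit(x: Matrix, y: Vec): Vec = solve(cholesky(gram(x)), xty(x, y))
}
```

Forming x.t*x squares the condition number of the problem, which is consistent with the list above describing Cholesky as faster but less stable than QR.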

See also

en.wikipedia.org/wiki/Degrees_of_freedom_(statistics)

see.stanford.edu/materials/lsoeldsee263/05-ls.pdf

Note, not intended for use when the number of degrees of freedom 'df' is negative.

Linear Supertypes
Error, Predictor, AnyRef, Any

Instance Constructors

  1. new Regression(x: MatT, y: VecT, technique: RegTechnique = QR)

    x

    the input/data m-by-n matrix (augment with a first column of ones to include intercept in model)

    y

    the response m-vector (one entry per row of x)

    technique

    the technique used to solve for b in x.t*x*b = x.t*y
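The note on 'x' about adding a first column of ones can be illustrated with a small standalone sketch. It uses plain Scala arrays instead of MatT/VecT, and the names InterceptSketch, withIntercept and compatible are hypothetical helpers, not part of the class.

```scala
// Illustrative sketch only: hypothetical helpers on plain arrays.
object InterceptSketch {

  /** Prepend a 1.0 to every row so that b_0 plays the role of the intercept. */
  def withIntercept(data: Array[Array[Double]]): Array[Array[Double]] =
    data.map(row => 1.0 +: row)

  /** An m-by-n matrix x needs a response vector y with m entries. */
  def compatible(x: Array[Array[Double]], y: Array[Double]): Boolean =
    x.length == y.length
}
```

With the extra column in place, each row has the form [1, x_1, ..., x_k] used in the regression equation.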

Type Members

  1. type Fac_QR = Fac_QR_H[MatT]

Value Members

  1. def backElim(): (Int, VectoD, VectorD)

    Perform backward elimination to remove the least predictive variable from the model, returning the variable to eliminate, the new parameter vector and the new quality of fit.

  2. def coefficient: VectoD

    Return the vector of coefficient/parameter values.

    Definition Classes
    Predictor
  3. def diagnose(yy: VectoD): Unit

    Compute diagnostics for the regression model.

    yy

    the response vector

    Definition Classes
    Regression → Predictor
  4. def fit: VectorD

    Return the quality of fit.

    Definition Classes
    Regression → Predictor
  5. def fitLabels: Seq[String]

    Return the labels for the fit.

    Definition Classes
    Regression → Predictor
  6. final def flaw(method: String, message: String): Unit
    Definition Classes
    Error
  7. def predict(z: MatT): VectoD

    Predict the value of 'y = f(z)' by evaluating the formula 'y = b dot z', for each row of matrix 'z'.

    z

    the new matrix to predict

  8. def predict(z: VectoD): Double

    Predict the value of 'y = f(z)' by evaluating the formula 'y = b dot z', e.g., '(b_0, b_1, b_2) dot (1, z_1, z_2)'.

    z

    the new vector to predict

    Definition Classes
    Regression → Predictor
  9. def predict(z: VectoI): Double

    Given a new discrete data vector z, predict the y-value of f(z).

    z

    the vector to use for prediction

    Definition Classes
    Predictor
  10. def report(): Unit

    Print results and diagnostics for each predictor 'x_j' and the overall quality of fit.

  11. def residual: VectoD

    Return the vector of residuals/errors.

    Definition Classes
    Predictor
  12. def train(): Unit

    Train the predictor by fitting the parameter vector (b-vector) in the multiple regression equation for the response vector 'y' passed into the class.

    Definition Classes
    Regression → Predictor
  13. def train(yy: VectoD): Unit

    Train the predictor by fitting the parameter vector (b-vector) in the multiple regression equation

    yy = b dot x + e = [b_0, ..., b_k] dot [1, x_1, ..., x_k] + e

    using the ordinary least squares 'OLS' method.

    yy

    the response vector

    Definition Classes
    Regression → Predictor
  14. def vif: VectorD

    Compute the Variance Inflation Factor 'VIF' for each variable to test for multi-collinearity by regressing 'xj' against the rest of the variables. A VIF over 10 indicates that over 90% of the variance of 'xj' can be predicted from the other variables, so 'xj' is a candidate for removal from the model.
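The VIF description can be sketched in standalone Scala as follows. The sketch assumes the usual definition VIF_j = 1 / (1 - R^2_j), where R^2_j is the coefficient of determination from regressing column 'xj' on the remaining columns, and it assumes column 0 of 'x' is the column of ones, so it is skipped. All names (VifSketch, solve, ols, rSquared, vif) are hypothetical and the code works on plain arrays; it is not the library's implementation.

```scala
// Illustrative sketch only: assumes VIF_j = 1 / (1 - R^2_j) and an intercept in column 0.
object VifSketch {

  type Matrix = Array[Array[Double]]
  type Vec    = Array[Double]

  /** Solve the square system a * b = c by Gaussian elimination with partial pivoting. */
  def solve(a0: Matrix, c0: Vec): Vec = {
    val n = c0.length
    val a = a0.map(_.clone)                 // work on copies, leave inputs untouched
    val c = c0.clone
    for (p <- 0 until n) {
      val r = (p until n).maxBy(i => math.abs(a(i)(p)))
      val ta = a(p); a(p) = a(r); a(r) = ta // swap pivot row into place
      val tc = c(p); c(p) = c(r); c(r) = tc
      for (i <- p + 1 until n) {
        val f = a(i)(p) / a(p)(p)
        for (j <- p until n) a(i)(j) -= f * a(p)(j)
        c(i) -= f * c(p)
      }
    }
    val b = new Array[Double](n)
    for (i <- n - 1 to 0 by -1)
      b(i) = (c(i) - (i + 1 until n).map(j => a(i)(j) * b(j)).sum) / a(i)(i)
    b
  }

  /** Ordinary least squares via the normal equations x.t * x * b = x.t * y. */
  def ols(x: Matrix, y: Vec): Vec = {
    val n   = x(0).length
    val xtx = Array.tabulate(n, n)((i, j) => x.map(r => r(i) * r(j)).sum)
    val xty = Array.tabulate(n)(j => x.indices.map(i => x(i)(j) * y(i)).sum)
    solve(xtx, xty)
  }

  /** R^2 = 1 - sse / sst for the least-squares fit of y on x. */
  def rSquared(x: Matrix, y: Vec): Double = {
    val b    = ols(x, y)
    val yHat = x.map(row => row.zip(b).map { case (xi, bi) => xi * bi }.sum)
    val mean = y.sum / y.length
    val sse  = y.zip(yHat).map { case (yi, pi) => (yi - pi) * (yi - pi) }.sum
    val sst  = y.map(yi => (yi - mean) * (yi - mean)).sum
    1.0 - sse / sst
  }

  /** VIF for each non-intercept column j, regressing column j on all other columns. */
  def vif(x: Matrix): Vec = {
    val n = x(0).length
    (1 until n).map { j =>
      val xj   = x.map(_(j))
      val rest = x.map(row => row.indices.filter(_ != j).map(row(_)).toArray)
      1.0 / (1.0 - rSquared(rest, xj))
    }.toArray
  }
}
```

Under this definition the threshold in the text follows directly: a VIF above 10 means 1 - R^2_j < 0.1, that is, R^2_j > 0.9.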