scalation.analytics

TranRegression

class TranRegression extends Predictor with Error

The TranRegression class supports transformed multiple linear regression. In this case, 'x' is multi-dimensional [1, x_1, ... x_k]. Fit the parameter vector 'b' in the transformed regression equation

transform (y) = b dot x + e = b_0 + b_1 * x_1 + b_2 * x_2 + ... + b_k * x_k + e

where 'e' represents the residuals (the part not explained by the model) and 'transform' is the function (defaults to log) used to transform the response vector 'y'. Common transforms: log (y) and sqrt (y) when y > 0. More generally, a Box-Cox Transformation may be applied.

See also

www.ams.sunysb.edu/~zhu/ams57213/Team3.pptx

citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.469.7176&rep=rep1&type=pdf

Use Least-Squares (minimizing the residuals) to fit the parameter vector b = x_pinv * y where 'x_pinv' is the pseudo-inverse.

Caveat: this class does not provide transformations on columns of matrix 'x'.
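As a rough illustration of the fitting step described above, the following self-contained Scala sketch transforms 'y' and then solves the normal equations on plain arrays. It does not use ScalaTion: the object name `TranRegressionSketch` and its `solve`, `fit`, and `predict` helpers are hypothetical, and Gaussian elimination stands in here for whichever technique the class actually applies.

```scala
object TranRegressionSketch {

  /** Solve the square system a * b = c by Gaussian elimination with partial pivoting. */
  def solve(a: Array[Array[Double]], c: Array[Double]): Array[Double] = {
    val n = c.length
    // Augmented matrix [a | c], copied so the inputs are not modified.
    val m = Array.tabulate(n, n + 1)((i, j) => if (j < n) a(i)(j) else c(i))
    for (p <- 0 until n) {
      // Swap in the row with the largest pivot magnitude for numerical stability.
      val best = (p until n).maxBy(i => math.abs(m(i)(p)))
      val tmp = m(p); m(p) = m(best); m(best) = tmp
      for (i <- p + 1 until n) {
        val f = m(i)(p) / m(p)(p)
        for (j <- p to n) m(i)(j) -= f * m(p)(j)
      }
    }
    // Back substitution on the upper-triangular system.
    val b = new Array[Double](n)
    for (i <- n - 1 to 0 by -1) {
      val s = (i + 1 until n).map(j => m(i)(j) * b(j)).sum
      b(i) = (m(i)(n) - s) / m(i)(i)
    }
    b
  }

  /** Fit b in transform (y) = b dot x + e via the normal equations x.t*x*b = x.t*yt. */
  def fit(x: Array[Array[Double]], y: Array[Double],
          transform: Double => Double = (v: Double) => math.log(v)): Array[Double] = {
    val yt = y.map(transform)                     // transformed response
    val k = x(0).length                           // number of columns, including the 1s column
    val xtx = Array.tabulate(k, k)((i, j) => x.indices.map(r => x(r)(i) * x(r)(j)).sum)
    val xty = Array.tabulate(k)(i => x.indices.map(r => x(r)(i) * yt(r)).sum)
    solve(xtx, xty)
  }

  /** Evaluate b dot z; in this sketch the result is on the transformed scale. */
  def predict(b: Array[Double], z: Array[Double]): Double =
    b.zip(z).map { case (bi, zi) => bi * zi }.sum

  def main(args: Array[String]): Unit = {
    // Synthetic data generated exactly from y = exp (1 + 0.5 * x_1).
    val x = Array.tabulate(4)(i => Array(1.0, i.toDouble))
    val y = x.map(r => math.exp(1.0 + 0.5 * r(1)))
    val b = fit(x, y)
    println(b.mkString("b = (", ", ", ")"))
  }
}
```

Because the synthetic data follow the log-linear model exactly, the fitted parameters should come out close to (1, 0.5), up to floating-point error.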

Linear Supertypes
Error, Predictor, AnyRef, Any

Instance Constructors

  1. new TranRegression(x: MatrixD, y: VectorD, transform: FunctionS2S = log, technique: RegTechnique = QR)

    x

    the design/data matrix

    y

    the response vector

    transform

    the transformation function (defaults to log)

    technique

    the technique used to solve for b in x.t*x*b = x.t*y
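For reference, the equation named in the 'technique' parameter is the normal-equation form of the least-squares problem. The derivation below is a sketch using assumed symbols that do not appear on this page: X for the design matrix 'x', \tilde{y} for the response being fitted, and Q, R for a thin QR factorization of X (with X of full column rank).

```latex
\begin{aligned}
\hat{b} &= \arg\min_{b} \, \lVert \tilde{y} - X b \rVert_2^2 \\
0 &= -2\, X^{\top} (\tilde{y} - X \hat{b})
  \;\Longrightarrow\; X^{\top} X \, \hat{b} = X^{\top} \tilde{y} \\
X = Q R,\; Q^{\top} Q = I
  &\;\Longrightarrow\; R^{\top} R \, \hat{b} = R^{\top} Q^{\top} \tilde{y}
  \;\Longrightarrow\; R \, \hat{b} = Q^{\top} \tilde{y}
\end{aligned}
```

Solving the triangular system in the last line avoids forming X^T X explicitly, whose condition number is the square of that of X.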

Value Members

  1. def backElim(): (Int, VectoD, VectorD)

    Perform backward elimination to remove the least predictive variable from the model, returning the index of the variable to eliminate, the new parameter vector, and the new quality-of-fit vector (which holds the new R-squared value and the new F statistic).

  2. def coefficient: VectoD

    Return the vector of coefficient/parameter values.

    Definition Classes
    Predictor
  3. def diagnose(yy: VectoD): Unit

    Compute diagnostics for the predictor. Override to add more diagnostics. Note, for 'rmse', 'sse' is divided by the number of instances 'm' rather than degrees of freedom.

    yy

    the response vector

    Definition Classes
    Predictor
    See also

    en.wikipedia.org/wiki/Mean_squared_error
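To make the note on 'rmse' concrete, here is a minimal Scala sketch contrasting the two divisors. The object `RmseSketch` and its helpers are hypothetical and independent of ScalaTion, and the degrees-of-freedom count m - k - 1 assumes a model with an intercept and k predictors.

```scala
object RmseSketch {

  /** Sum of squared errors between actual responses y and predictions yp. */
  def sse(y: Array[Double], yp: Array[Double]): Double =
    y.zip(yp).map { case (a, p) => (a - p) * (a - p) }.sum

  /** Root mean squared error: 'sse' divided by the number of instances m. */
  def rmse(y: Array[Double], yp: Array[Double]): Double =
    math.sqrt(sse(y, yp) / y.length)

  /** Residual standard error: 'sse' divided by m - k - 1 (assumed degrees of freedom). */
  def rse(y: Array[Double], yp: Array[Double], k: Int): Double =
    math.sqrt(sse(y, yp) / (y.length - k - 1))
}
```

For the same residuals, dividing by 'm' always yields the smaller of the two values, and the gap narrows as 'm' grows relative to the number of parameters.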

  4. def fit: VectorD

    Return the quality of fit.

    Definition Classes
    TranRegression → Predictor
  5. def fitLabels: Seq[String]

    Return the labels for the fit.

    Definition Classes
    TranRegression → Predictor
  6. final def flaw(method: String, message: String): Unit
    Definition Classes
    Error
  7. def predict(z: VectoD): Double

    Predict the value of y = f(z) by evaluating the formula y = b dot z, e.g., (b_0, b_1, b_2) dot (1, z_1, z_2).

    z

    the new vector to predict

    Definition Classes
    TranRegression → Predictor
  8. def predict(z: VectoI): Double

    Given a new discrete data vector z, predict the y-value of f(z).

    z

    the vector to use for prediction

    Definition Classes
    Predictor
  9. def residual: VectoD

    Return the vector of residuals/errors.

    Definition Classes
    TranRegression → Predictor
  10. val rg: Regression[MatrixD, VectorD]
  11. def train(): Unit

    Train the predictor by fitting the parameter vector (b-vector) in the regression equation on 'yt'.

    Definition Classes
    TranRegression → Predictor
  12. def train(yy: VectoD): Unit

    Retrain the predictor by fitting the parameter vector (b-vector) in the multiple regression equation

    yy = b dot x + e = [b_0, ... b_k] dot [1, x_1, x_2 ... x_k] + e

    using the least squares method.

    yy

    the response vector

    Definition Classes
    TranRegression → Predictor
  13. def vif: VectorD

    Compute the Variance Inflation Factor 'VIF' for each variable to test for multi-collinearity by regressing 'xj' against the rest of the variables. A VIF over 10 indicates that over 90% of the variance of 'xj' can be predicted from the other variables, so 'xj' is a candidate for removal from the model.

  14. val yt: VectorD
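
The threshold quoted for 'vif' above follows from the standard definition VIF_j = 1 / (1 - R_j^2), where R_j^2 is the R-squared obtained by regressing 'xj' on the other variables. The Scala sketch below works through that arithmetic. The object `VifSketch` and its helpers are hypothetical and do not reproduce how the class computes its values.

```scala
object VifSketch {

  /** VIF for a variable whose regression on the other variables has R-squared rSq. */
  def vif(rSq: Double): Double = 1.0 / (1.0 - rSq)

  /** Invert the definition: the R-squared implied by a given VIF. */
  def rSqFromVif(v: Double): Double = 1.0 - 1.0 / v

  /** Indices of variables whose VIF exceeds the threshold (10 by the rule of thumb). */
  def candidates(vifs: Seq[Double], threshold: Double = 10.0): Seq[Int] =
    vifs.zipWithIndex.collect { case (v, j) if v > threshold => j }
}
```

A VIF of exactly 10 corresponds to R_j^2 = 0.9, so any VIF above 10 means more than 90% of the variance of 'xj' is explained by the other variables.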