the data/design matrix of continuous variables
the treatment/categorical variable vector
the response vector
the number of treatment levels (1, ..., levels)
the technique used to solve for b in x.t*x*b = x.t*y
Assign values for the dummy variables based on the treatment vector 't'.
Assign values for the continuous variables from the 'x' matrix.
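As a rough illustration of the two assignment steps above, the following Python sketch builds one augmented row per observation. The function name `build_design_matrix`, the leading intercept column, and the coding scheme (one 0/1 dummy per level except the last, which acts as the baseline) are assumptions made for this example; the documentation does not say which coding the class actually uses.

```python
def build_design_matrix(x, t, levels):
    """Return rows [1, x_1..x_k, d_1..d_(levels-1)], one per observation.

    Assumption: treatment levels are coded 1..levels and the last level is
    the baseline, so all of its dummy variables are 0.
    """
    rows = []
    for xi, ti in zip(x, t):
        if not 1 <= ti <= levels:
            raise ValueError(f"treatment level {ti} outside 1..{levels}")
        # Dummy variables: d_j is 1 only when the observation has level j.
        dummies = [1.0 if ti == j else 0.0 for j in range(1, levels)]
        # Continuous variables are copied from the corresponding row of x.
        rows.append([1.0] + [float(v) for v in xi] + dummies)
    return rows


x = [[2.0, 5.0], [3.0, 1.0], [4.0, 7.0]]   # continuous variables
t = [1, 3, 2]                              # treatment level per observation
design = build_design_matrix(x, t, levels=3)
```

With three levels, each row gains two dummy columns; the observation at level 3 gets zeros in both.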
Perform backward elimination to remove the least predictive variable from the model, returning the variable to eliminate, the new parameter vector, the new R-squared value and the new F statistic.
Return the fit (parameter vector b, quality of fit rSquared).
Show the flaw by printing the error message.
the method where the error occurred
the error message
Predict the value of y = f(z) by evaluating the formula y = b dot zi for each row zi of matrix z.
Predict the value of y = f(z) by evaluating the formula y = b dot z, e.g., (b0, b1, b2) dot (1, z1, z2).
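The two prediction forms above reduce to a dot product, applied once for a vector and once per row for a matrix. A minimal Python sketch follows; the names `predict` and `predict_all` are chosen for illustration and `z` is assumed to already include the leading 1 for the intercept.

```python
def predict(b, z):
    """Evaluate y = b dot z for one augmented vector z = (1, z1, z2, ...)."""
    if len(b) != len(z):
        raise ValueError("b and z must have the same length")
    return sum(bi * zi for bi, zi in zip(b, z))


def predict_all(b, zs):
    """Evaluate y = b dot zi for each row zi of the matrix zs."""
    return [predict(b, zi) for zi in zs]


b = [1.0, 2.0, 3.0]                        # (b0, b1, b2)
y_single = predict(b, [1.0, 4.0, 5.0])     # 1*1 + 2*4 + 3*5
y_rows = predict_all(b, [[1.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
```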
Given a new discrete data vector z, predict the y-value of f(z).
the vector to use for prediction
Retrain the predictor by fitting the parameter vector (b-vector) in the multiple regression equation yy = b dot x + e = [b_0, ... b_k+l] dot [1, x_1, ..., d_1, ...] + e using the least squares method.
the new response vector
Train the predictor by fitting the parameter vector (b-vector) in the regression equation y = b dot x + e = [b_0, ...
Compute the Variance Inflation Factor (VIF) for each variable to test for multicollinearity by regressing xj against the rest of the variables. A VIF over 10 indicates that over 90% of the variance of xj can be predicted from the other variables, so xj is a candidate for removal from the model.
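The link between the VIF threshold and the explained variance follows from the standard definition VIF_j = 1 / (1 - R_j^2), where R_j^2 is the R-squared obtained when xj is regressed on the other variables. The Python sketch below takes R_j^2 as given rather than performing that regression; the function name `vif` is an assumption for illustration.

```python
def vif(r_squared):
    """Return the Variance Inflation Factor 1 / (1 - R^2) for one variable."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    return 1.0 / (1.0 - r_squared)


no_collinearity = vif(0.0)   # xj is not predictable from the others
moderate = vif(0.5)
at_threshold = vif(0.9)      # roughly 10, up to floating-point rounding
```

An R_j^2 of 0.9 corresponds to a VIF of about 10, which is why a VIF above 10 means more than 90% of the variance of xj is explained by the other variables.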
The ANCOVA class supports ANalysis of COVAriance (ANCOVA). It allows the addition of a categorical treatment variable 't' into a multiple linear regression. This is done by introducing dummy variables 'd_j' to distinguish the treatment level. The problem is again to fit the parameter vector 'b' in the augmented regression equation
y = b dot x + e = b_0 + b_1 * x_1 + b_2 * x_2 + ... b_k * x_k + b_k+1 * d_1 + b_k+2 * d_2 + ... b_k+l * d_l + e
where 'e' represents the residuals (the part not explained by the model). Use Least-Squares (minimizing the sum of squared residuals) to fit the parameter vector
b = x_pinv * y
where 'x_pinv' is the pseudo-inverse.
see.stanford.edu/materials/lsoeldsee263/05-ls.pdf
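When the augmented matrix x has full column rank, b = x_pinv * y is the same vector that solves the normal equations x.t*x*b = x.t*y. The Python sketch below solves those equations by Gaussian elimination with partial pivoting. It is only one possible technique, not necessarily the one the class uses, and the function name `solve_normal_equations` and the sample data are assumptions for illustration.

```python
def solve_normal_equations(x, y):
    """Solve (x.t * x) b = x.t * y for b; x is a list of augmented rows."""
    n = len(x[0])
    # Build the augmented system [x.t*x | x.t*y].
    a = [[sum(row[i] * row[j] for row in x) for j in range(n)]
         + [sum(row[i] * yi for row, yi in zip(x, y))]
         for i in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("x.t * x is singular; columns are dependent")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    # Back substitution.
    b = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * b[j] for j in range(i + 1, n))
        b[i] = (a[i][n] - s) / a[i][i]
    return b


# Rows are [1, x_1, d_1]; y was generated from b = (1, 2, 3) with e = 0.
x = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [1.0, 2.0, 1.0], [1.0, 3.0, 1.0]]
y = [1.0, 3.0, 8.0, 10.0]
b = solve_normal_equations(x, y)
```

Because the sample responses contain no residual, the fitted vector recovers the generating parameters up to floating-point rounding.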