The RegressionMV class supports multi-variate multiple linear regression. In this case, x is multi-dimensional [1, x_1, ... x_k] and y is multi-dimensional [y_0, ... y_l]. Fit the parameter vector b for each regression equation
y = b dot x + e = b_0 + b_1 * x_1 + ... b_k * x_k + e
where e represents the residuals (the part not explained by the model). Use Least-Squares (minimizing the residuals) to solve for the parameter vector b using the Normal Equations:
x.t * x * b = x.t * y
b = fac.solve (.)
Five factorization algorithms are provided:
- Fac_QR: QR Factorization: slower, more stable (default)
- Fac_SVD: Singular Value Decomposition: slowest, most robust
- Fac_Cholesky: Cholesky Factorization: faster, less stable (reasonable choice)
- Fac_LU: LU Factorization: better than Inverse
- Fac_Inverse: Inverse Factorization: textbook approach
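The Normal Equations above can be illustrated with a minimal sketch in plain Scala 3. It does not use ScalaTion's matrix or factorization classes: the names `Mat`, `multiply`, `solve` and `fitNormalEq` are hypothetical, and Gaussian elimination with partial pivoting stands in for the factorizations listed above. The result has one row per parameter and one column per response variable.

```scala
// Illustrative only: plain arrays stand in for the library's matrix types.
type Mat = Array[Array[Double]]

// Matrix product a * b.
def multiply(a: Mat, b: Mat): Mat =
  val bt = b.transpose
  a.map(row => bt.map(col => row.zip(col).map { case (p, q) => p * q }.sum))

// Solve a * x = b for x (a is n-by-n, b is n-by-ny) by Gaussian elimination.
def solve(a: Mat, b: Mat): Mat =
  val n  = a.length
  val ny = b(0).length
  // augmented matrix [a | b], copied so the inputs are not modified
  val m = Array.tabulate(n)(i => a(i) ++ b(i))
  for k <- 0 until n do
    // partial pivoting: move the largest remaining entry of column k to row k
    val p = (k until n).maxBy(i => math.abs(m(i)(k)))
    val tmp = m(k)
    m(k) = m(p)
    m(p) = tmp
    for i <- k + 1 until n do
      val f = m(i)(k) / m(k)(k)
      for j <- k until n + ny do m(i)(j) -= f * m(k)(j)
  // back substitution, one response column at a time
  val x = Array.ofDim[Double](n, ny)
  for c <- 0 until ny do
    for i <- n - 1 to 0 by -1 do
      val s = (i + 1 until n).map(j => m(i)(j) * x(j)(c)).sum
      x(i)(c) = (m(i)(n + c) - s) / m(i)(i)
  x

// Fit b from x.t * x * b = x.t * y.
def fitNormalEq(x: Mat, y: Mat): Mat =
  val xt = x.transpose
  solve(multiply(xt, x), multiply(xt, y))
```

For example, with a first column of ones in x and two response columns generated by y_0 = 1 + 2 x_1 and y_1 = -x_1, `fitNormalEq` recovers the parameter columns (1, 2) and (0, -1) up to rounding.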
Value parameters
- fname_
-
the feature/variable names (defaults to null)
- hparam
-
the hyper-parameters (defaults to Regression.hp)
- x
-
the data/input m-by-n matrix (augment with a first column of ones to include intercept in model)
- y
-
the response/output m-by-ny matrix
Attributes
- See also
-
see.stanford.edu/materials/lsoeldsee263/05-ls.pdf
Note: not intended for use when the number of degrees of freedom 'df' is negative.
- Companion
- object
Members list
Type members
Inherited classlikes
The BestStep is used to record the best improvement step found so far. Only considers the first response variable y(0) => qof(?, 0).
Value parameters
- col
-
the column/variable to ADD/REMOVE for this step
- mod
-
the model including selected features/variables for this step
- qof
-
the Quality of Fit (QoF) for this step
Attributes
- Inherited from:
- PredictorMV
- Supertypes
-
trait Serializable, trait Product, trait Equals, class Object, trait Matchable, class Any
Value members
Concrete methods
Build a sub-model that is restricted to the given columns of the data matrix.
Value parameters
- x_cols
-
the columns that the new model is restricted to
Predict the value of y = f(z) by evaluating the formula y = b dot z, e.g., (b_0, b_1, b_2) dot (1, z_1, z_2).
Value parameters
- z
-
the new vector to predict
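As a minimal sketch of the formula y = b dot z (plain Scala 3, not the library's own method; `predictOne` and `predictAll` are hypothetical names), the following evaluates the dot product for one parameter vector and then repeats it for several response variables, assuming one parameter vector per response.

```scala
// Dot product of a parameter vector b with an input vector z = (1, z_1, ..., z_k).
def predictOne(b: Array[Double], z: Array[Double]): Double =
  require(b.length == z.length, "b and z must have the same length")
  b.zip(z).map { case (bj, zj) => bj * zj }.sum

// One prediction per response variable, given one parameter vector per response.
def predictAll(bs: Seq[Array[Double]], z: Array[Double]): Seq[Double] =
  bs.map(b => predictOne(b, z))
```

With b = (1, 2, 3) and z = (1, 4, 5), `predictOne` gives 1 + 8 + 15 = 24.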
Predict the value of matrix y = f(x_, b). It is overridden for speed.
Value parameters
- x_
-
the matrix to use for making predictions, one for each row
Produce a QoF summary for a model with diagnostics for each predictor 'x_j' and the overall Quality of Fit (QoF).
Value parameters
- b_
-
the parameters/coefficients for the model
- fname_
-
the array of feature/variable names
- vifs
-
the Variance Inflation Factors (VIFs)
- x_
-
the testing/full data/input matrix
Test a predictive model y_ = f(x_) + e and return its QoF vector. Testing may be in-sample (on the training set) or out-of-sample (on the testing set) as determined by the parameters passed in. Note: must call train before test.
Value parameters
- x_
-
the testing/full data/input matrix (defaults to full x)
- y_
-
the testing/full response/output matrix (defaults to full y)
Train the predictor by fitting the parameter vector (b-vector) in the multiple regression equation y = b dot x + e = [b_0, ... b_k] dot [1, x_1 , ... x_k] + e using the ordinary least squares 'OLS' method.
Value parameters
- x_
-
the training/full data/input matrix
- y_
-
the training/full response/output matrix
Inherited methods
Perform backward elimination to find the least predictive variable to remove from the existing model, returning the variable to eliminate, the new parameter matrix and the new Quality of Fit (QoF). May be called repeatedly.
Value parameters
- cols
-
the columns of matrix x currently included in the existing model
- first
-
the first variable to consider for elimination (defaults to 1, assuming the intercept x_0 will be in any model)
- idx_q
-
index of Quality of Fit (QoF) to use for comparing quality
Attributes
- See also
-
Fit for index of QoF measures.
- Inherited from:
- PredictorMV
Perform backward elimination to find the least predictive variables to remove from the full model, returning the variables left and the new Quality of Fit (QoF) measures for all steps.
Value parameters
- cross
-
whether to include the cross-validation QoF measure
- first
-
first variable to consider for elimination
- idx_q
-
index of Quality of Fit (QoF) to use for comparing quality
Attributes
- See also
-
Fit for index of QoF measures.
- Inherited from:
- PredictorMV
Attributes
- Inherited from:
- PredictorMV
Diagnose the health of the model by computing the Quality of Fit (QoF) measures, from the error/residual vector and the predicted & actual responses. For some models the instances may be weighted.
Value parameters
- w
-
the weights on the instances (defaults to null)
- y
-
the actual response/output vector to use (test/full)
- yp
-
the predicted response/output vector (test/full)
Attributes
- See also
-
Regression_WLS
- Definition Classes
- Inherited from:
- Fit
Return the Quality of Fit (QoF) measures corresponding to the labels given. Note: if sse > sst, the model introduces errors and rSq will be negative; otherwise, R^2 (rSq) ranges from 0 (weak) to 1 (strong). Override to add more quality of fit measures.
Attributes
- Inherited from:
- Fit
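The note about rSq follows from the usual definition R^2 = 1 - sse / sst. A minimal sketch in plain Scala 3 (the function name `rSquared` is hypothetical and this is not the Fit trait's implementation) shows how a model whose errors exceed the total variation yields a negative value.

```scala
// R^2 = 1 - sse / sst, computed from actual (y) and predicted (yp) responses.
def rSquared(y: Array[Double], yp: Array[Double]): Double =
  val mean = y.sum / y.length
  val sst  = y.map(v => (v - mean) * (v - mean)).sum                    // total sum of squares
  val sse  = y.zip(yp).map { case (a, p) => (a - p) * (a - p) }.sum     // sum of squared errors
  1.0 - sse / sst
```

With y = (1, 2, 3), a perfect prediction gives 1, while the reversed prediction (3, 2, 1) has sse = 8 and sst = 2, giving 1 - 4 = -3.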
Perform forward selection to find the most predictive variable to add to the existing model, returning the variable to add and the new model. May be called repeatedly.
Value parameters
- cols
-
the columns of matrix x currently included in the existing model
- idx_q
-
index of Quality of Fit (QoF) to use for comparing quality
Attributes
- See also
-
Fit for index of QoF measures.
- Inherited from:
- PredictorMV
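One step of forward selection can be sketched generically in plain Scala 3. This is not the PredictorMV method: `forwardStep` is a hypothetical name, `qof` is a stand-in function that scores a set of columns (in the library this would come from fitting a sub-model), and the sketch assumes that a larger QoF value is better.

```scala
// Try each column not yet in the model and return the one that gives the best QoF.
// Returns None when every column is already included.
def forwardStep(cols: Set[Int], nCols: Int, qof: Set[Int] => Double): Option[(Int, Double)] =
  val candidates = (0 until nCols).filterNot(cols.contains)
  if candidates.isEmpty then None
  else
    val scored = candidates.map(j => (j, qof(cols + j)))   // QoF of the model with column j added
    Some(scored.maxBy(_._2))
```

Calling the function repeatedly, each time adding the returned column to `cols`, builds up the model one variable at a time.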
Perform forward selection to find the most predictive variables to have in the model, returning the variables added and the new Quality of Fit (QoF) measures for all steps.
Value parameters
- cross
-
whether to include the cross-validation QoF measure
- idx_q
-
index of Quality of Fit (QoF) to use for comparing quality
Attributes
- See also
-
Fit for index of QoF measures.
- Inherited from:
- PredictorMV
Return the best model found from feature selection.
Return the feature/variable names.
Return the used data matrix x. Mainly for derived classes where x is expanded from the given columns in x_.
Attributes
- Inherited from:
- PredictorMV
Return the used response matrix y. Mainly for derived classes where y is transformed.
Attributes
- Inherited from:
- PredictorMV
Return the help string that describes the Quality of Fit (QoF) measures provided by the Fit trait. Override to correspond to fitLabel.
Attributes
- Inherited from:
- Fit
Return the hyper-parameters.
The log-likelihood function times -2. Override as needed.
Value parameters
- ms
-
raw Mean Squared Error
- s2
-
MLE estimate of the population variance of the residuals
Attributes
- See also
- Inherited from:
- Fit
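As a hedged illustration of what "log-likelihood times -2" looks like, the sketch below assumes Gaussian residuals, for which -2 ll = m (ln 2π + ln s2 + ms / s2). The function name `neg2LogLik` and the instance-count parameter `m` are assumptions made for this example; the library's own method may differ in signature and detail.

```scala
// -2 * log-likelihood under a Gaussian residual assumption.
// ms: raw Mean Squared Error (sse / m), s2: variance estimate, m: number of instances.
def neg2LogLik(ms: Double, s2: Double, m: Int): Double =
  m * (math.log(2.0 * math.Pi) + math.log(s2) + ms / s2)
```

When s2 equals ms, the last term is 1, so the value reduces to m (ln 2π + ln s2 + 1).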
Make plots for each output/response variable (column of matrix y). Must override if the response matrix is transformed or rescaled.
Value parameters
- yp
-
the testing/full predicted response/output matrix (defaults to full y)
- yy
-
the testing/full actual response/output matrix (defaults to full y)
Attributes
- Inherited from:
- PredictorMV
Return the mean of the squares for error (sse / df). Must call diagnose first.
Attributes
- Inherited from:
- Fit
Return the number of terms/parameters in the model, e.g., b_0 + b_1 x_1 + b_2 x_2 has three terms.
Attributes
- Inherited from:
- PredictorMV
Order vectors y_ and yp_ according to the ascending order of y_.
Value parameters
- y_
-
the vector to order by (e.g., true response values)
- yp_
-
the vector to be ordered by y_ (e.g., predicted response values)
Attributes
- Inherited from:
- PredictorMV
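The ordering described above can be sketched in plain Scala 3 by sorting the index positions of y_ and applying the same permutation to both vectors. The name `orderByActual` is hypothetical and arrays stand in for the library's vector type.

```scala
// Reorder y and yp together so that y is ascending and each yp stays paired with its y.
def orderByActual(y: Array[Double], yp: Array[Double]): (Array[Double], Array[Double]) =
  require(y.length == yp.length, "y and yp must have the same length")
  val rank = y.indices.sortBy(i => y(i)).toArray   // positions of y in ascending order
  (rank.map(i => y(i)), rank.map(i => yp(i)))
```

Ordering both vectors this way keeps the actual and predicted values aligned, which is convenient when plotting them against each other.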
Return only the first matrix of parameter/coefficient values.
Attributes
- Inherited from:
- PredictorMV
Return the array of network parameters (weight matrix, bias vector) bb.
Attributes
- Inherited from:
- PredictorMV
Return the coefficient of determination (R^2). Must call diagnose first.
Attributes
- Inherited from:
- FitM
Return a basic report on a trained and tested multi-variate model.
Value parameters
- ftMat
-
the matrix of QoF values produced by the Fit trait
Attributes
- Definition Classes
-
PredictorMV -> Model
- Inherited from:
- PredictorMV
Return a basic report on a trained and tested model.
Value parameters
- ftVec
-
the vector of QoF values produced by the Fit trait
Attributes
- Inherited from:
- Model
Reset the best-step to its default.
Reset the degrees of freedom to the new updated values. For some models, the degrees of freedom are not known until after the model is built.
Value parameters
- df_update
-
the updated degrees of freedom (model, error)
Attributes
- Inherited from:
- Fit
Return the matrix of residuals/errors.
Perform feature selection to find the most predictive variables to have in the model, returning the variables added and the new Quality of Fit (QoF) measures for all steps.
Value parameters
- cross
-
whether to include the cross-validation QoF measure
- idx_q
-
index of Quality of Fit (QoF) to use for comparing quality
- tech
-
the feature selection technique to apply
Attributes
- See also
-
Fit for index of QoF measures.
- Inherited from:
- PredictorMV
Return the sum of the squares for error (sse). Must call diagnose first.
Attributes
- Inherited from:
- FitM
Perform stepwise regression to find the most predictive variables to have in the model, returning the variables left and the new Quality of Fit (QoF) measures for all steps. At each step it calls 'forwardSel' and 'backwardElim' and takes the best of the two actions. Stops when neither action yields improvement.
Value parameters
- cross
-
whether to include the cross-validation QoF measure
- idx_q
-
index of Quality of Fit (QoF) to use for comparing quality
Attributes
- See also
-
Fit for index of QoF measures.
- Inherited from:
- PredictorMV
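The stepwise procedure described above (try adding a variable, try removing one, take the better action, stop when neither improves) can be sketched generically in plain Scala 3. This is not the library's implementation: `stepwise` is a hypothetical name, `qof` is a stand-in scoring function for a set of columns, a larger QoF is assumed to be better, and column 0 is assumed to be the intercept and is never removed.

```scala
// Greedy stepwise selection over column sets, starting from the intercept only.
def stepwise(nCols: Int, qof: Set[Int] => Double, start: Set[Int] = Set(0)): Set[Int] =
  var cols = start
  var best = qof(cols)
  var improved = true
  while improved do
    improved = false
    // forward moves: add one column not yet in the model
    val adds = (0 until nCols).filterNot(cols.contains).map(j => cols + j)
    // backward moves: drop one column, keeping the intercept (column 0)
    val drops = cols.filter(_ != 0).toSeq.map(j => cols - j)
    val moves = adds ++ drops
    if moves.nonEmpty then
      val cand = moves.maxBy(qof)
      val q = qof(cand)
      if q > best then            // accept only a strict improvement, so the loop terminates
        cols = cand
        best = q
        improved = true
  cols
```

Because each accepted move strictly increases the QoF and there are finitely many column sets, the loop always stops.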
Test/evaluate the model's Quality of Fit (QoF) and return the predictions and QoF vectors. This may include the importance of its parameters (e.g., if 0 is in a parameter's confidence interval, it is a candidate for removal from the model). Extending traits and classes should implement various diagnostics for the test and full (training + test) datasets.
Value parameters
- x_
-
the testing/full data/input matrix (impl. classes may default to x)
- y_
-
the testing/full response/output vector (impl. classes may default to y)
Attributes
- Inherited from:
- PredictorMV
Return the indices for the test-set.
Value parameters
- n_test
-
the size of test-set
- rando
-
whether to select indices randomly or in blocks
Attributes
- See also
-
scalation.mathstat.TnT_Split
- Inherited from:
- PredictorMV
Train the model 'y_ = f(x_) + e' on a given dataset, by optimizing the model parameters in order to minimize error '||e||' or maximize log-likelihood 'll'.
Value parameters
- x_
-
the training/full data/input matrix (impl. classes may default to x)
- y_
-
the training/full response/output vector (impl. classes may default to y)
Attributes
- Inherited from:
- PredictorMV
The train2 method should work like the train method, but should also optimize hyper-parameters (e.g., shrinkage or learning rate). Only implementing classes needing this capability should override this method.
Value parameters
- x_
-
the training/full data/input matrix (defaults to full x)
- y_
-
the training/full response/output matrix (defaults to full y)
Attributes
- Inherited from:
- PredictorMV
Train and test the predictive model y_ = f(x_) + e and report its QoF and plot its predictions. FIX: currently must be overridden if y is transformed (see TranRegression).
Value parameters
- x_
-
the training/full data/input matrix (defaults to full x)
- xx
-
the testing/full data/input matrix (defaults to full x)
- y_
-
the training/full response/output matrix (defaults to full y)
- yy
-
the testing/full response/output matrix (defaults to full y)
Attributes
- Inherited from:
- PredictorMV
Train and test the predictive model y_ = f(x_) + e and report its QoF and plot its predictions. This version does auto-tuning. FIX: currently must be overridden if y is transformed (see TranRegression).
Value parameters
- x_
-
the training/full data/input matrix (defaults to full x)
- xx
-
the testing/full data/input matrix (defaults to full x)
- y_
-
the training/full response/output matrix (defaults to full y)
- yy
-
the testing/full response/output matrix (defaults to full y)
Attributes
- Inherited from:
- PredictorMV
Attributes
- Inherited from:
- PredictorMV
Compute the Variance Inflation Factor (VIF) for each variable to test for multi-collinearity by regressing x_j against the rest of the variables. A VIF over 50 indicates that over 98% of the variance of x_j can be predicted from the other variables, so x_j may be a candidate for removal from the model. Note: override this method to use a superior regression technique.
Value parameters
- skip
-
the number of columns of x at the beginning to skip in computing VIF
Attributes
- Inherited from:
- PredictorMV
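The 98% figure follows from the standard definition VIF_j = 1 / (1 - R_j^2), where R_j^2 is the R^2 obtained by regressing x_j on the other variables. A minimal sketch in plain Scala 3 (the names `vifFromRSq` and `rSqFromVif` are hypothetical, not library methods) makes the conversion explicit.

```scala
// VIF for variable x_j, given the R^2 of regressing x_j on the other variables.
def vifFromRSq(rSqJ: Double): Double = 1.0 / (1.0 - rSqJ)

// Inverse relation: the R^2 implied by a given VIF.
def rSqFromVif(vif: Double): Double = 1.0 - 1.0 / vif
```

A VIF of 50 corresponds to R_j^2 = 1 - 1/50 = 0.98, so any larger VIF means more than 98% of the variance of x_j is explained by the other variables.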
Inherited fields
The optional reference to an ontological concept