Regression/classification using LPLS and cross-validation, with optional jackknife variable selection and optional refitting of the model to the selected variables.
lplsReg.cv(X1, X2, X3, npc.sel = 1:5, alphavek = seq(0, 1, by = 0.2), npc.ref = NULL, testlevel = 0.05, dreduce = F, colcent = c(T, T), rowcent = c(F, F), grandcent = c(F, F), folds, err.eval.type = "rate", cvreport = TRUE)
X1 | A response vector or matrix for regression. For classification this should be either a factor or a dummy coded 0/1 matrix with one column per group. |
---|---|
X2 | Predictor matrix of size (n x p). |
X3 | Background information matrix of size (m x p). |
npc.sel | A vector of component numbers to be tested in the initial LPLS model based on all variables in the inner CV loop. Default is 1:5. |
alphavek | A vector of alpha values to be tested in the initial LPLS model based on all variables in the inner CV loop. Default is seq(0, 1, by = 0.2), as given in the usage. See |
npc.ref | A vector of component numbers to be tested in the refitted LPLS model based on the selected variables in the inner CV loop. Default is NULL, which gives no refitting. |
testlevel | Significance level for the jackknife testing of the variables. Default is 0.05. |
dreduce | Logical. Should the variable selection applied to the columns of X3 (parallel to X2) also be applied to the rows of X3? This makes sense only if X3 is a (p x p) matrix expressing some dependency or similarity between the variables in X2, that is, when both the rows and the columns of X3 relate to the variables of X2. |
colcent | Logical vector of length 2 referring to X2 and X3. Should column centering be performed? |
rowcent | Logical vector of length 2 referring to X2 and X3. Should row centering be performed? |
grandcent | Logical vector of length 2 referring to X2 and X3. Should overall centering be performed? |
folds | A list of length |
err.eval.type | The evaluation criterion for prediction/classification performance. Either "rate" (total error rate), "rmsep" (root mean square error of prediction), or "rmsep2", a modified rmsep in which only predictions between 0 and 1 contribute to the error; predictions outside this range are treated as perfect predictions. |
cvreport | Logical. Should an iteration report be printed on screen during the computations? |
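
The three `err.eval.type` criteria described in the table can be sketched in plain R as follows. This is only an illustration, not the package's code: the helper name `eval_error` is hypothetical, and the sketch assumes that for "rmsep2" any prediction outside [0, 1] contributes zero error while the mean is still taken over all samples, a normalisation the description does not spell out.

```r
# Hypothetical helper (not part of the package) illustrating the three criteria.
eval_error <- function(y, pred, type = c("rate", "rmsep", "rmsep2")) {
  type <- match.arg(type)
  if (type == "rate") {
    # y and pred are class labels: proportion of misclassified samples
    return(mean(as.character(y) != as.character(pred)))
  }
  # y and pred are numeric (e.g. 0/1 dummy coding and continuous predictions)
  sq <- (as.numeric(y) - as.numeric(pred))^2
  if (type == "rmsep2") {
    # Assumption: predictions outside [0, 1] are treated as perfect (zero error)
    sq[pred < 0 | pred > 1] <- 0
  }
  sqrt(mean(sq))
}

y    <- c(0, 1, 1, 0)
pred <- c(0.2, 0.8, 1.3, -0.1)
eval_error(y, pred, "rmsep")   # all four predictions contribute
eval_error(y, pred, "rmsep2")  # only the first two contribute
eval_error(c("a", "b", "b"), c("a", "b", "a"), "rate")
```

With a 0/1 dummy-coded response, "rmsep2" does not penalise predictions that overshoot the coded range, which is the practical difference from "rmsep".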
An array holding predicted X1 values for each number of components (initial model and refitted) and each alpha value.
The CV segments used.
An array holding all estimated regression coefficients for all components (initial model) and alpha values.
The standard deviations of the regression coefficients.
For classification: the true class of each sample.
The p-values from jackknife testing of each regression coefficient for all levels of components and alpha.
For classification: the posterior probability of each sample belonging to each class.
For classification: the predicted class of each sample for all levels of components and alpha.
The total error (as defined by the argument err.eval.type) for all levels of components and alpha.
An array of logicals defining whether a variable is found to be significant or not. Significance is given for all levels of components and alpha.
data(BCdata)
segs <- balanced.folds(BCdata$Y, 5)
fit.cv <- lplsReg.cv(factor(BCdata$Y), BCdata$X, BCdata$Z, folds = segs)
#> Segment 1 of 5 completed
#> Segment 2 of 5 completed
#> Segment 3 of 5 completed
#> Segment 4 of 5 completed
#> Segment 5 of 5 completed