This is a convenience wrapper around pense_cv() and regmest_cv(), for the common use case of computing
a highly robust S-estimate followed by a more efficient M-estimate using the scale of the residuals from the
S-estimate.
pensem_cv(x, ...)

# S3 method for default
pensem_cv(
  x,
  y,
  alpha = 0.5,
  nlambda = 50,
  lambda_min_ratio,
  lambda_m,
  lambda_s,
  standardize = TRUE,
  penalty_loadings,
  intercept = TRUE,
  bdp = 0.25,
  ncores = 1,
  sparse = FALSE,
  eps = 1e-06,
  cc = 4.7,
  cv_k = 5,
  cv_repl = 1,
  cl = NULL,
  cv_metric = c("tau_size", "mape", "rmspe"),
  add_zero_based = TRUE,
  explore_solutions = 10,
  explore_tol = 0.1,
  max_solutions = 10,
  fit_all = TRUE,
  comparison_tol = sqrt(eps),
  algorithm_opts = mm_algorithm_options(),
  mscale_opts = mscale_algorithm_options(),
  nlambda_enpy = 10,
  enpy_opts = enpy_options(),
  ...
)

# S3 method for pense_cvfit
pensem_cv(
  x,
  scale,
  alpha,
  nlambda = 50,
  lambda_min_ratio,
  lambda_m,
  standardize = TRUE,
  penalty_loadings,
  intercept = TRUE,
  bdp = 0.25,
  ncores = 1,
  sparse = FALSE,
  eps = 1e-06,
  cc = 4.7,
  cv_k = 5,
  cv_repl = 1,
  cl = NULL,
  cv_metric = c("tau_size", "mape", "rmspe"),
  add_zero_based = TRUE,
  explore_solutions = 10,
  explore_tol = 0.1,
  max_solutions = 10,
  fit_all = TRUE,
  comparison_tol = sqrt(eps),
  algorithm_opts = mm_algorithm_options(),
  mscale_opts = mscale_algorithm_options(),
  x_train,
  y_train,
  ...
)
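A minimal sketch of the default method on simulated data (assumes the pense package is installed; the data and tuning choices here are illustrative only):

```r
# Sketch only: simulated data, assuming the 'pense' package is available.
library(pense)

set.seed(1)
n <- 50; p <- 10
x <- matrix(rnorm(n * p), ncol = p)
y <- x[, 1] + 0.5 * x[, 2] + rnorm(n)

# S-estimate via PENSE followed by a more efficient M-step, with
# 5-fold cross-validation to select the penalization level.
fit <- pensem_cv(x, y, alpha = 0.5, nlambda = 20, cv_k = 5)
coef(fit)  # coefficients at the CV-selected penalization level
```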
either a numeric matrix of predictor values, or a cross-validated PENSE fit as returned from pense_cv().
ignored. See the section on deprecated parameters below.
vector of response values of length n, matching the number of observations in x.
elastic net penalty mixing parameter with \(0 \le \alpha \le 1\).
number of penalization levels.
Smallest value of the penalization level as a fraction of the largest level (i.e., the
smallest value for which all coefficients are zero). The default depends on the sample
size relative to the number of variables and alpha.
optional user-supplied sequence of penalization levels for the S- and M-estimates.
If given and not NULL, nlambda and lambda_min_ratio are ignored.
logical flag to standardize the x variables prior to fitting the estimates.
a vector of positive penalty loadings (a.k.a. weights) for different penalization of each
coefficient. Only allowed for alpha > 0.
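One common (hypothetical) use of penalty loadings is adaptive weighting based on a preliminary fit; the scheme below is a sketch, not a recommendation from this page, and assumes x and y are already defined:

```r
# Sketch: adaptive penalty loadings from a preliminary PENSE fit.
# Assumes x, y are defined and the 'pense' package is loaded.
prelim <- pense_cv(x, y, alpha = 0, cv_k = 5)  # preliminary (ridge-like) fit
w <- 1 / pmax(abs(coef(prelim)[-1]), 1e-6)     # inverse absolute slopes, guarded against 0
fit <- pensem_cv(x, y, alpha = 0.75, penalty_loadings = w)
```

Coefficients with small preliminary estimates receive large loadings and are penalized more heavily.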
include an intercept in the model.
desired breakdown point of the estimator, between 0 and 0.5.
number of CPU cores to use in parallel. By default, only one CPU core is used. May not be supported on your platform, in which case a warning is given.
use sparse coefficient vectors.
cutoff constant for Tukey's bisquare \(\rho\) function in the M-estimation objective function.
number of folds per cross-validation.
number of cross-validation replications.
a parallel cluster. Can only be used in combination with ncores = 1.
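A cluster created with the base R parallel package can be passed here; this is a sketch assuming x and y are defined:

```r
# Sketch: run the cross-validation on a 2-worker cluster.
# Assumes x, y are defined and the 'pense' package is loaded.
cl <- parallel::makeCluster(2)
fit <- pensem_cv(x, y, cv_k = 5, cl = cl)
parallel::stopCluster(cl)
```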
either a string specifying the performance metric to use, or a function to evaluate prediction errors in a single CV replication. If a function, the number of arguments determines the data the function receives: a function taking a single argument is called with a numeric vector of prediction errors; a function taking two or more arguments is called with the predicted values as first argument and the true values as second argument. The function must always return a single numeric value quantifying the prediction performance. The order of the given values corresponds to the order in the input data.
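The two calling conventions described above can be sketched as follows (both function names are hypothetical):

```r
# One-argument form: receives the prediction errors of one CV replication.
trimmed_rmspe <- function(pred_err) {
  keep <- sort(pred_err^2)[seq_len(floor(0.9 * length(pred_err)))]
  sqrt(mean(keep))  # RMSPE after trimming the largest 10% of squared errors
}

# Two-argument form: predicted values first, true values second.
manual_mape <- function(predicted, observed) {
  median(abs(observed - predicted))
}

# fit <- pensem_cv(x, y, cv_metric = trimmed_rmspe)
```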
also consider the 0-based regularization path. See details for a description.
number of solutions to compute up to the desired precision eps.
numerical tolerance and maximum number of iterations for exploring possible solutions.
The tolerance should be (much) looser than eps.
only retain up to max_solutions solutions per penalization level.
numeric tolerance to determine if two solutions are equal. The comparison is first done
on the absolute difference in the value of the objective function at the solution.
If this is less than comparison_tol, two solutions are deemed equal if the squared difference
of the intercepts is less than comparison_tol and the squared \(L_2\) norm of the difference
between the coefficient vectors is also less than comparison_tol.
options for the MM algorithm to compute the estimates. See mm_algorithm_options() for details.
options for the M-scale estimation. See mscale_algorithm_options() for details.
number of penalization levels where the EN-PY initial estimate is computed.
initial scale estimate to use in the M-estimation. By default the S-scale from the PENSE fit is used.
an object of cross-validated regularized M-estimates as returned from regmest_cv().
The built-in CV metrics are:

"tau_size": \(\tau\)-size of the prediction error, computed by tau_size() (default).

"mape": median absolute prediction error.

"rmspe": root mean squared prediction error.

"auroc": area under the receiver operator characteristic curve (actually 1 - AUROC). Only sensible for binary responses.