R/regmest_regression.R
regmest_cv.Rd
Perform (repeated) K-fold cross-validation for regmest().
adamest_cv() is a convenience wrapper to compute adaptive elastic-net M-estimates.
regmest_cv(
  x,
  y,
  standardize = TRUE,
  lambda,
  cv_k,
  cv_repl = 1,
  cv_metric = c("tau_size", "mape", "rmspe", "auroc"),
  fit_all = TRUE,
  cl = NULL,
  ...
)
adamest_cv(x, y, alpha, alpha_preliminary = 0, exponent = 1, ...)
x
n by p matrix of numeric predictors.
y
vector of response values of length n.
For binary classification, y should be a factor with 2 levels.
standardize
whether to standardize the x variables prior to fitting the estimates.
Can also be set to "cv_only", in which case the input data is not standardized,
but the training data in the CV folds is scaled to match the scaling of the input data.
Coefficients are always returned on the original scale.
Standardization can fail for variables with a large proportion of a single value
(e.g., zero-inflated data).
In this case, either compute with standardize = FALSE or standardize the data manually.
lambda
optional user-supplied sequence of penalization levels.
If given and not NULL, nlambda and lambda_min_ratio are ignored.
cv_k
number of folds per cross-validation.
cv_repl
number of cross-validation replications.
cv_metric
either a string specifying the performance metric to use, or a function to evaluate
prediction errors in a single CV replication.
If a function, the number of arguments defines the data the function receives.
If the function takes a single argument, it is called with a single numeric vector of
prediction errors.
If the function takes two or more arguments, it is called with the predicted values as
first argument and the true values as second argument.
The function must always return a single numeric value quantifying the prediction
performance.
The order of the given values corresponds to the order in the input data.
fit_all
If TRUE, fit the model for all penalization levels.
Can also be any combination of "min" and "{x}-se", in which case models are fit only at
the penalization level with the smallest average CV prediction error, or at levels
within {x} standard errors of it, respectively (see the illustrative call after the
argument descriptions).
Setting fit_all to FALSE is equivalent to "min".
Applies to all alpha values.
cl
a parallel cluster. Can only be used in combination with ncores = 1.
...
Arguments passed on to regmest:
scale
fixed scale of the residuals.
nlambda
number of penalization levels.
lambda_min_ratio
Smallest value of the penalization level as a fraction of the largest level
(i.e., the smallest value for which all coefficients are zero).
The default depends on the sample size relative to the number of variables and alpha.
If more observations than variables are available, the default is 1e-3 * alpha,
otherwise 1e-2 * alpha.
penalty_loadings
a vector of positive penalty loadings (a.k.a. weights) for different penalization of
each coefficient. Only allowed for alpha > 0.
starting_points
a list of starting points, created by starting_point().
The starting points are shared among all penalization levels.
intercept
include an intercept in the model.
add_zero_based
also consider the 0-based regularization path in addition to the given starting points.
cc
cutoff constant for Tukey's bisquare \(\rho\) function.
eps
numerical tolerance.
explore_solutions
number of solutions to compute up to the desired precision eps.
explore_tol
numerical tolerance for exploring possible solutions.
Should be (much) looser than eps to be useful.
max_solutions
only retain up to max_solutions unique solutions per penalization level.
comparison_tol
numeric tolerance to determine if two solutions are equal.
The comparison is first done on the absolute difference in the value of the objective
function at the solution.
If this is less than comparison_tol, two solutions are deemed equal if the squared
difference of the intercepts is less than comparison_tol and the squared \(L_2\) norm
of the difference vector is less than comparison_tol.
sparse
use sparse coefficient vectors.
ncores
number of CPU cores to use in parallel.
By default, only one CPU core is used.
Not supported on all platforms, in which case a warning is given.
algorithm_opts
options for the MM algorithm to compute estimates.
See mm_algorithm_options() for details.
mscale_bdp, mscale_opts
options for the M-scale estimate used to standardize the predictors
(if standardize = TRUE).
alpha
elastic net penalty mixing parameter with \(0 \le \alpha \le 1\).
alpha = 1 is the LASSO penalty, and alpha = 0 the Ridge penalty.
alpha_preliminary
alpha parameter for the preliminary estimate.
exponent
the exponent for computing the penalty loadings based on the preliminary estimate.
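The fit_all argument can restrict which penalization levels are fit. A minimal,
illustrative sketch (the fixed residual scale passed via scale is a crude placeholder
here; in practice it would come from a preliminary robust fit):
data(freeny)
x_mat <- as.matrix(freeny[, 2:5])
# Only fit models at the penalization level with the best CV performance
# and at levels within 1 standard error of it:
cvfit <- regmest_cv(x_mat, freeny$y, alpha = 0.5,
                    scale = mad(freeny$y),  # placeholder residual scale
                    cv_k = 4, fit_all = c("min", "1-se"))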
regmest_cv() returns a list-like object as returned by regmest(), plus the following
components:
cvres
data frame of average cross-validated performance.
adamest_cv() returns a list-like object as returned by regmest_cv(), plus the following
components:
exponent
value of the exponent.
preliminary
CV results for the preliminary estimate.
penalty_loadings
penalty loadings used for the adaptive elastic net M-estimate.
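For example, the components can be inspected directly on a fitted object (the object
name and the call below are purely illustrative):
# cvfit <- adamest_cv(x, y, alpha = 0.5, scale = resid_scale, cv_k = 5)
# cvfit$cvres             # average CV performance per penalization level
# cvfit$exponent          # exponent used for the penalty loadings
# cvfit$preliminary       # CV results for the preliminary estimate
# cvfit$penalty_loadings  # loadings used for the adaptive estimate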
The built-in CV metrics are:
"tau_size"
\(\tau\)-size of the prediction error, computed by tau_size() (default).
"mape"
Median absolute prediction error.
"rmspe"
Root mean squared prediction error.
"auroc"
Area under the receiver operating characteristic curve (actually 1 - AUROC).
Only sensible for binary responses.
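Besides these strings, cv_metric also accepts a user-supplied function. A minimal
sketch (the function names are illustrative; they mimic the built-in "mape" and
"rmspe" metrics):
# A single-argument function receives the numeric vector of prediction errors:
mape_metric <- function(pred_errors) median(abs(pred_errors))
# A function with two (or more) arguments receives the predicted values first
# and the true values second:
rmspe_metric <- function(predicted, true) sqrt(mean((predicted - true)^2))
# Either can be passed to regmest_cv() via cv_metric, e.g.
#   regmest_cv(x, y, ..., cv_metric = mape_metric)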
adamest_cv() is a convenience wrapper which performs 3 steps:
1. compute preliminary estimates via regmest_cv(..., alpha = alpha_preliminary),
2. compute the penalty loadings from the estimate beta with the best prediction
   performance as adamest_loadings = 1 / abs(beta)^exponent, and
3. compute the adaptive elastic net M-estimates via
   regmest_cv(..., penalty_loadings = adamest_loadings).
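A compact sketch of these three steps done manually with regmest_cv() (settings are
illustrative only; the fixed residual scale is a crude placeholder and would normally
come from a preliminary robust fit):
data(freeny)
x_mat <- as.matrix(freeny[, 2:5])
y_vec <- freeny$y
resid_scale <- mad(y_vec)  # placeholder for the fixed scale of the residuals
# Step 1: preliminary estimate with a Ridge-type penalty
prelim <- regmest_cv(x_mat, y_vec, alpha = 0, scale = resid_scale,
                     cv_k = 4, cv_repl = 2)
# Step 2: penalty loadings from the coefficients with the best CV performance
beta <- coef(prelim, lambda = "min")[-1]  # drop the intercept
adamest_loadings <- 1 / abs(beta)^1       # exponent = 1
# Step 3: adaptive elastic net M-estimate with these loadings
ada_manual_m <- regmest_cv(x_mat, y_vec, alpha = 0.5, scale = resid_scale,
                           cv_k = 4, cv_repl = 2,
                           penalty_loadings = adamest_loadings)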
regmest() for computing regularized M-estimates without cross-validation.
coef.pense_cvfit() for extracting coefficient estimates.
plot.pense_cvfit() for plotting the CV performance or the regularization path.
Other functions to compute robust estimates with CV:
pense_cv(), pensem_cv()
# Compute the adaptive PENSE regularization path for Freeny's
# revenue data (see ?freeny)
data(freeny)
x <- as.matrix(freeny[ , 2:5])
## Either use the convenience function directly ...
set.seed(123)
ada_convenience <- adapense_cv(x, freeny$y, alpha = 0.5,
                               cv_repl = 2, cv_k = 4)
## ... or compute the steps manually:
# Step 1: Compute preliminary estimates with CV
set.seed(123)
preliminary_estimate <- pense_cv(x, freeny$y, alpha = 0,
                                 cv_repl = 2, cv_k = 4)
plot(preliminary_estimate, se_mult = 1)
# Step 2: Use the coefficients with best prediction performance
# to define the penalty loadings:
prelim_coefs <- coef(preliminary_estimate, lambda = 'min')
pen_loadings <- 1 / abs(prelim_coefs[-1])
# Step 3: Compute the adaptive PENSE estimates and estimate
# their prediction performance.
set.seed(123)
ada_manual <- pense_cv(x, freeny$y, alpha = 0.5,
                       cv_repl = 2, cv_k = 4,
                       penalty_loadings = pen_loadings)
# Visualize the prediction performance and coefficient path of
# the adaptive PENSE estimates (manual vs. automatic)
def.par <- par(no.readonly = TRUE)
layout(matrix(1:4, ncol = 2, byrow = TRUE))
plot(ada_convenience$preliminary)
plot(preliminary_estimate)
plot(ada_convenience)
plot(ada_manual)
par(def.par)