Implements lasso and ridge regression for dichotomised outcomes. Such outcomes are artificially rather than naturally binary: they indicate whether an underlying measurement is greater than a threshold.
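A minimal sketch of what dichotomisation means here, using base R only. The measurements and the threshold are made-up illustrative values, not data from the package.

```r
# Hypothetical continuous measurements (illustrative values only)
y <- c(4.2, 5.8, 6.1, 3.9, 7.4, 5.0)
cutoff <- 5.5  # assumed threshold

# Artificially binary outcome: 1 if the measurement is greater than the threshold
z <- as.integer(y > cutoff)
print(z)
```

The continuous vector y carries more information than the binary vector z, which is the motivation for modelling both.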
cornet(
  y,
  cutoff,
  X,
  alpha = 1,
  npi = 101,
  pi = NULL,
  nsigma = 99,
  sigma = NULL,
  nfolds = 10,
  foldid = NULL,
  type.measure = "deviance",
  ...
)
y: continuous outcome, vector of length \(n\)
cutoff: cut-off point for dichotomising outcome into classes, meaningful value between min(y) and max(y)
X: features, numeric matrix with \(n\) rows (samples) and \(p\) columns (variables)
alpha: elastic net mixing parameter, numeric between \(0\) (ridge) and \(1\) (lasso)
npi: number of pi values (weighting)
pi: pi sequence, vector of increasing values in the unit interval; or NULL (default sequence)
nsigma: number of sigma values (scaling)
sigma: sigma sequence, vector of increasing positive values; or NULL (default sequence)
nfolds: number of folds, integer between \(3\) and \(n\)
foldid: fold identifiers, vector with entries between \(1\) and nfolds; or NULL (balanced folds)
type.measure: loss function for binary classification, character "deviance", "mse", "mae", or "class" (see cv.glmnet)
...: further arguments passed to glmnet
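To supply foldid manually, one option is to assign folds separately within each class of the dichotomised outcome. The sketch below uses base R only and assumes that "balance" refers to the two classes; it is not necessarily how the default folds are constructed internally. The simulated data and the variable names are illustrative.

```r
set.seed(1)
n <- 40
nfolds <- 5
y <- stats::rnorm(n)          # hypothetical continuous outcome
cutoff <- 0                   # assumed threshold
z <- as.integer(y > cutoff)   # dichotomised outcome

# Assign fold identifiers within each class, so that every fold
# receives a similar number of samples from both classes.
foldid <- integer(n)
for (k in c(0L, 1L)) {
  index <- which(z == k)
  ids <- rep(seq_len(nfolds), length.out = length(index))
  foldid[index] <- ids[sample.int(length(ids))]  # shuffle within the class
}

print(table(class = z, fold = foldid))
```

The resulting vector has entries between 1 and nfolds and could be passed as the foldid argument.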
Returns an object of class cornet, a list with multiple slots:
gaussian: fitted linear model, class glmnet
binomial: fitted logistic model, class glmnet
sigma: scaling parameters sigma, vector of length nsigma
pi: weighting parameters pi, vector of length npi
cvm: evaluation loss, matrix with nsigma rows and npi columns
sigma.min: optimal scaling parameter, positive scalar
pi.min: optimal weighting parameter, scalar in unit interval
cutoff: threshold for dichotomisation
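A hedged usage sketch on simulated data, showing how the slots listed above might be inspected. The data are random noise and purely illustrative; the call is skipped if the cornet package is not installed.

```r
if (requireNamespace("cornet", quietly = TRUE)) {
  set.seed(1)
  n <- 100; p <- 200
  y <- stats::rnorm(n)                                  # simulated continuous outcome
  X <- matrix(stats::rnorm(n * p), nrow = n, ncol = p)  # simulated features

  # Dichotomise at zero and fit with default settings (lasso, 10 folds)
  fit <- cornet::cornet(y = y, cutoff = 0, X = X)

  print(fit$sigma.min)  # selected scaling parameter
  print(fit$pi.min)     # selected weighting parameter
  print(dim(fit$cvm))   # loss matrix: rows follow sigma, columns follow pi
}
```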
The argument family is unavailable, because this function fits a gaussian model for the numeric response, and a binomial model for the binary response.
Linear regression uses the loss function "deviance" (or "mse"), but the loss is incomparable between linear and logistic regression.
The loss function "auc" is unavailable for internal cross-validation. If at all, use "auc" for external cross-validation only.
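For external cross-validation, the AUC can be computed from held-out class labels and predicted probabilities. The sketch below uses base R only; the helper auc is defined here for illustration and is not part of the package, and the labels and probabilities are made-up values standing in for predictions on an external fold.

```r
# Rank-based AUC (equivalent to the Mann-Whitney statistic);
# 'auc' is a hypothetical helper, not a cornet function.
auc <- function(z, prob) {
  r <- rank(prob)  # average ranks, so ties count as one half
  n1 <- sum(z == 1)
  n0 <- sum(z == 0)
  (sum(r[z == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Hypothetical held-out labels and predicted probabilities
z_test <- c(0, 0, 1, 1, 0, 1)
prob_test <- c(0.10, 0.40, 0.35, 0.80, 0.20, 0.70)

print(auc(z_test, prob_test))
```

Both classes must be present in the held-out fold, otherwise the denominator is zero.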
Armin Rauschenberger and Enrico Glaab (2024). "Predicting dichotomised outcomes from high-dimensional data in biomedicine". Journal of Applied Statistics 51(9):1756-1771. doi:10.1080/02664763.2023.2233057. (Contact: armin.rauschenberger@uni.lu.)