Implements lasso and ridge regression for dichotomised outcomes. Such outcomes are not naturally binary but artificially so: they indicate whether an underlying continuous measurement is greater than a threshold.
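As a minimal sketch of what dichotomisation means here, the base R snippet below derives a binary outcome from a simulated continuous one. The name z and the threshold value 0 are illustrative assumptions, not part of the package interface.

```r
set.seed(1)
y <- rnorm(100)              # continuous outcome (simulated)
cutoff <- 0                  # illustrative threshold
z <- as.integer(y > cutoff)  # 1 if the measurement exceeds the threshold, 0 otherwise
table(z)                     # class frequencies
```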

cornet(
  y,
  cutoff,
  X,
  alpha = 1,
  npi = 101,
  pi = NULL,
  nsigma = 99,
  sigma = NULL,
  nfolds = 10,
  foldid = NULL,
  type.measure = "deviance",
  ...
)

Arguments

y

continuous outcome: vector of length \(n\)

cutoff

cut-off point for dichotomising outcome into classes: meaningful value between min(y) and max(y)

X

features: numeric matrix with \(n\) rows (samples) and \(p\) columns (variables)

alpha

elastic net mixing parameter: numeric between \(0\) (ridge) and \(1\) (lasso)

npi

number of pi values (weighting)

pi

pi sequence: vector of increasing values in the unit interval; or NULL (default sequence)

nsigma

number of sigma values (scaling)

sigma

sigma sequence: vector of increasing positive values; or NULL (default sequence)

nfolds

number of folds: integer between \(3\) and \(n\)

foldid

fold identifiers: vector with entries between \(1\) and nfolds; or NULL (balanced folds)

type.measure

loss function for binary classification: character "deviance", "mse", "mae", or "class" (see cv.glmnet)

...

further arguments passed to glmnet
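The arguments pi, sigma and foldid accept user-supplied vectors instead of NULL. The sketch below builds such vectors in base R under the constraints listed above. The names pi_seq and sigma_seq, the grid sizes and the log-spaced scale grid are illustrative assumptions, not the package defaults.

```r
set.seed(1)
n <- 100       # number of samples
nfolds <- 10   # number of folds

# weights: increasing values in the unit interval
# (named pi_seq to avoid masking R's built-in constant pi)
pi_seq <- seq(from = 0, to = 1, length.out = 11)

# scales: increasing positive values (log-spaced, an illustrative choice)
sigma_seq <- exp(seq(from = log(0.1), to = log(10), length.out = 9))

# fold identifiers: entries between 1 and nfolds, folds of equal size here
foldid <- sample(rep(seq_len(nfolds), length.out = n))
```

These vectors could then be passed as cornet(y=y, cutoff=0, X=X, pi=pi_seq, sigma=sigma_seq, foldid=foldid). Note that this random assignment equalises fold sizes only; it does not balance the two classes across folds.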

Value

Returns an object of class cornet, a list with multiple slots:

  • gaussian: fitted linear model, class glmnet

  • binomial: fitted logistic model, class glmnet

  • sigma: scaling parameters sigma, vector of length nsigma

  • pi: weighting parameters pi, vector of length npi

  • cvm: evaluation loss, matrix with nsigma rows and npi columns

  • sigma.min: optimal scaling parameter, positive scalar

  • pi.min: optimal weighting parameter, scalar in unit interval

  • cutoff: threshold for dichotomisation
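To illustrate how the slots relate to each other, the sketch below locates the smallest entry of a loss matrix with nsigma rows and npi columns. It assumes that sigma.min and pi.min correspond to the minimum of cvm; the matrix and both grids are hypothetical values, not package output.

```r
sigma <- c(0.5, 1, 2)     # hypothetical scaling grid (nsigma = 3)
pi_seq <- c(0, 0.5, 1)    # hypothetical weighting grid (npi = 3)

# hypothetical cross-validated loss: rows index sigma, columns index pi
cvm <- matrix(c(1.40, 1.35, 1.38,
                1.36, 1.30, 1.37,
                1.39, 1.33, 1.41),
              nrow = 3, ncol = 3, byrow = TRUE)

# row and column index of the smallest loss (first one in case of ties)
id <- which(cvm == min(cvm), arr.ind = TRUE)[1, ]
sigma_min <- sigma[id[1]]
pi_min <- pi_seq[id[2]]
```

For a fitted object, the corresponding values would be read directly from the slots sigma.min and pi.min.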

Details

The argument family is unavailable, because this function fits a gaussian model for the numeric response, and a binomial model for the binary response.
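The two models can be pictured as two separate glmnet fits on the same features, one per response type. The sketch below is an illustration under that assumption, not the internal code of cornet; the object names fit_gaussian and fit_binomial are hypothetical.

```r
library(glmnet)

set.seed(1)
n <- 100; p <- 200
y <- rnorm(n)                                   # numeric response
X <- matrix(rnorm(n * p), nrow = n, ncol = p)   # features
z <- as.integer(y > 0)                          # binary response (cut-off 0)

# lasso-penalised linear regression on the numeric response
fit_gaussian <- glmnet(x = X, y = y, family = "gaussian", alpha = 1)

# lasso-penalised logistic regression on the binary response
fit_binomial <- glmnet(x = X, y = z, family = "binomial", alpha = 1)

class(fit_gaussian)
class(fit_binomial)
```

Because each fit needs its own family, a single family argument would be ambiguous, which is consistent with the argument being unavailable.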

Linear regression uses the loss function "deviance" (or "mse"), but the loss is incomparable between linear and logistic regression.
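One way to put both models on a common scale is to convert the linear predictions into probabilities and then average them with the logistic probabilities. The sketch below assumes a Gaussian transformation with scale sigma and a convex combination with weight pi; this form, the direction of the weight and all numbers are assumptions made for illustration, not statements taken from this page.

```r
y_hat <- c(-1.2, -0.3, 0.4, 1.5)      # hypothetical linear predictions
p_hat <- c(0.20, 0.45, 0.55, 0.90)    # hypothetical logistic probabilities
z <- c(0, 0, 1, 1)                    # observed binary outcome
cutoff <- 0                           # threshold for dichotomisation
sigma <- 1                            # scaling parameter (assumed value)
pi_w <- 0.5                           # weighting parameter (assumed value)

# probability that a normal variable with mean y_hat and sd sigma exceeds the cut-off
p_lin <- pnorm(q = y_hat, mean = cutoff, sd = sigma)

# weighted average of the two probability vectors
p_comb <- pi_w * p_lin + (1 - pi_w) * p_hat

# mean binomial deviance of the combined probabilities
dev <- mean(-2 * (z * log(p_comb) + (1 - z) * log(1 - p_comb)))
```

Evaluating such a loss over a grid of sigma and pi values would give a matrix of the same shape as the cvm slot.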

The loss function "auc" is unavailable for internal cross-validation. Use "auc", if at all, only for external cross-validation.
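For external validation, the AUC can be computed from held-out outcomes and predicted probabilities without any package. The sketch below uses the rank-based (Mann-Whitney) formula; the helper auc and the test vectors are hypothetical and not part of cornet.

```r
# area under the ROC curve via the rank-sum formula
auc <- function(z, prob) {
  r <- rank(prob)        # midranks, so tied probabilities count as half
  n1 <- sum(z == 1)      # number of positives
  n0 <- sum(z == 0)      # number of negatives
  (sum(r[z == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

z_test <- c(0, 0, 1, 1, 0, 1)                    # held-out binary outcomes
prob_test <- c(0.1, 0.4, 0.35, 0.8, 0.2, 0.9)    # predicted probabilities
auc(z_test, prob_test)
```

In this toy example, 8 of the 9 positive-negative pairs are ordered correctly, so the function returns 8/9.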

References

Armin Rauschenberger and Enrico Glaab (2024). "Predicting dichotomised outcomes from high-dimensional data in biomedicine". Journal of Applied Statistics 51(9):1756-1771. doi:10.1080/02664763.2023.2233057. (Contact: armin.rauschenberger@uni.lu.)

See also

Methods for objects of class cornet include coef and predict.
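A minimal usage sketch of the two methods follows. It assumes that predict takes new features through an argument named newx, following the glmnet convention; this argument name and the object names beta and y_hat are assumptions, and the structure of the returned objects is not shown here.

```r
library(cornet)

set.seed(1)
n <- 100; p <- 200
y <- rnorm(n)                                   # continuous outcome
X <- matrix(rnorm(n * p), nrow = n, ncol = p)   # features
net <- cornet(y = y, cutoff = 0, X = X)         # fit with default settings

beta <- coef(net)                 # estimated coefficients
y_hat <- predict(net, newx = X)   # predictions for the samples in X (newx is assumed)
```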

Examples

library(cornet)
n <- 100; p <- 200                        # samples and features
y <- rnorm(n)                             # continuous outcome
X <- matrix(rnorm(n*p),nrow=n,ncol=p)     # feature matrix
net <- cornet(y=y,cutoff=0,X=X)           # dichotomise at zero
net
#> cornet object:
#> n = 100, p = 200 
#> z = I(y > 0): 46+ vs 54- 
#> sigma.min = 0.8 
#> pi.min = 0.69 
#> deviance = 1.3