Combined regression

Implements lasso and ridge regression for dichotomised outcomes. Such outcomes are not naturally binary but artificially so: they indicate whether an underlying continuous measurement exceeds a threshold.
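As a minimal illustration of what "dichotomised" means here, the sketch below thresholds a simulated continuous measurement. The data, the seed, and the name z are assumptions made for illustration; only y and cutoff correspond to arguments of cornet.

```r
# Hypothetical data: a continuous measurement for 100 samples
set.seed(1)
y <- rnorm(100)
cutoff <- 0

# Artificially binary outcome: 1 if the measurement exceeds the threshold, else 0
z <- as.integer(y > cutoff)

# Class sizes after dichotomisation
table(z)
```

The continuous y still carries more information than z, which is why both are used for model fitting.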

cornet(
  y,
  cutoff,
  X,
  alpha = 1,
  npi = 101,
  pi = NULL,
  nsigma = 99,
  sigma = NULL,
  nfolds = 10,
  foldid = NULL,
  type.measure = "deviance",
  ...
)

Arguments

y: continuous outcome: vector of length \(n\)
cutoff: cut-off point for dichotomising outcome into classes: meaningful value between min(y) and max(y)
X: features: numeric matrix with \(n\) rows (samples) and \(p\) columns (variables)
alpha: elastic net mixing parameter: numeric between \(0\) (ridge) and \(1\) (lasso)
npi: number of pi values (weighting)
pi: pi sequence: vector of increasing values in the unit interval; or NULL (default sequence)
nsigma: number of sigma values (scaling)
sigma: sigma sequence: vector of increasing positive values; or NULL (default sequence)
nfolds: number of folds: integer between \(3\) and \(n\)
foldid: fold identifiers: vector of length \(n\) with entries between \(1\) and nfolds; or NULL (balanced folds)
type.measure: loss function for binary classification: character "deviance", "mse", "mae", or "class" (see cv.glmnet)
...: further arguments passed to glmnet
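The constraints on pi, sigma and foldid can be met as in the following sketch. The grid sizes, the log-spaced range around sd(y), and the class-wise fold assignment are arbitrary choices made for illustration, not the defaults of cornet; the names pi_seq, sigma_seq and z are hypothetical.

```r
set.seed(1)
n <- 100
y <- rnorm(n)
cutoff <- 0
z <- as.integer(y > cutoff)

# Weights: increasing values in the unit interval
pi_seq <- seq(from = 0, to = 1, length.out = 11)

# Scales: increasing positive values (log-spaced around sd(y); arbitrary range)
sigma_seq <- exp(seq(from = log(0.1 * sd(y)), to = log(10 * sd(y)), length.out = 20))

# Fold identifiers between 1 and nfolds, assigned separately within each class
# so that every fold contains a similar number of samples from both classes
nfolds <- 10
foldid <- integer(n)
for (k in c(0L, 1L)) {
  idx <- which(z == k)
  foldid[idx] <- sample(rep(seq_len(nfolds), length.out = length(idx)))
}

# With a feature matrix X of n rows, these could then be passed on:
# net <- cornet::cornet(y = y, cutoff = cutoff, X = X,
#                       pi = pi_seq, sigma = sigma_seq, foldid = foldid)
```

Supplying foldid explicitly makes the cross-validation reproducible across calls.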

Value

Returns an object of class cornet, a list with multiple slots:

gaussian: fitted linear model, class glmnet
binomial: fitted logistic model, class glmnet
sigma: scaling parameters sigma, vector of length nsigma
pi: weighting parameters pi, vector of length npi
cvm: evaluation loss, matrix with nsigma rows and npi columns
sigma.min: optimal scaling parameter, positive scalar
pi.min: optimal weighting parameter, scalar in unit interval
cutoff: threshold for dichotomisation
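The relation between cvm, sigma.min and pi.min can be pictured with a toy loss matrix. The sketch assumes that "optimal" means the entry of cvm with the smallest loss; the numbers and the names sigma_seq, pi_seq, sigma_min and pi_min are invented for illustration and do not come from a fitted object.

```r
# Hypothetical grids: 3 scaling values (rows) and 3 weighting values (columns)
sigma_seq <- c(0.5, 1, 2)
pi_seq <- c(0, 0.5, 1)

# Hypothetical cross-validated loss, one row per sigma and one column per pi
cvm <- matrix(c(1.40, 1.35, 1.38,
                1.36, 1.30, 1.37,
                1.39, 1.34, 1.41),
              nrow = 3, ncol = 3, byrow = TRUE)

# Position of the smallest loss (first one in case of ties)
idx <- which(cvm == min(cvm), arr.ind = TRUE)[1, ]

sigma_min <- sigma_seq[idx["row"]]
pi_min <- pi_seq[idx["col"]]
c(sigma_min = sigma_min, pi_min = pi_min)
```

For a fitted object net, the corresponding inputs would be net$cvm, net$sigma and net$pi.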

Details

The argument family is unavailable, because this function fits a gaussian model for the numeric response, and a binomial model for the binary response.
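The roles of sigma (scaling) and pi (weighting) can be sketched as follows. This assumes the combination scheme described in the cited paper: the linear prediction is turned into a probability with a Gaussian distribution function centred at the cutoff with standard deviation sigma, and is then averaged with the logistic probability using weight pi. The function combine and its argument names are hypothetical and not part of the package.

```r
# Hypothetical helper, not part of cornet: combines the two predictions
combine <- function(y_hat, p_hat, cutoff, sigma, pi_weight) {
  # Probability that the outcome exceeds the cutoff, implied by the linear fit
  p_linear <- pnorm(q = y_hat, mean = cutoff, sd = sigma)
  # Weighted average of the linear-based and logistic probabilities
  pi_weight * p_linear + (1 - pi_weight) * p_hat
}

# Invented predictions for three samples
y_hat <- c(-1, 0, 1)        # from the gaussian model
p_hat <- c(0.2, 0.5, 0.8)   # from the binomial model

combine(y_hat, p_hat, cutoff = 0, sigma = 1, pi_weight = 0.5)
```

Under this assumption, pi_weight = 0 reduces to plain logistic regression and pi_weight = 1 uses the linear model only.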

Linear regression uses the loss function "deviance" (which equals "mse" for gaussian models), but this loss is not comparable between linear and logistic regression.

The loss function "auc" is unavailable for internal cross-validation. If at all, use "auc" for external cross-validation only.

References

Armin Rauschenberger and Enrico Glaab (2024). "Predicting dichotomised outcomes from high-dimensional data in biomedicine". Journal of Applied Statistics 51(9):1756-1771. doi:10.1080/02664763.2023.2233057. (Contact: armin.rauschenberger@uni.lu.)

Examples

n <- 100; p <- 200
y <- rnorm(n)
X <- matrix(rnorm(n*p),nrow=n,ncol=p)
net <- cornet(y=y,cutoff=0,X=X)
net
#> cornet object:
#> n = 100, p = 200 
#> z = I(y > 0): 46+ vs 54- 
#> sigma.min = 0.8 
#> pi.min = 0.69 
#> deviance = 1.3
