Estimates sparse regression models (i.e., performs feature selection) in multi-task learning or transfer learning. Multi-task learning involves multiple targets for one set of samples, and transfer learning involves multiple datasets with one target each.
Usage
sparselink(
  x,
  y,
  family,
  alpha.init = 0.95,
  alpha = 1,
  type = "exp",
  nfolds = 10,
  cands = NULL
)
Arguments
- x
\(n \times p\) matrix (multi-task learning) or list of \(n_k \times p\) matrices (transfer learning)
- y
\(n \times q\) matrix (multi-task learning) or list of \(n_k\)-dimensional vectors (transfer learning)
- family
character, "gaussian" or "binomial"
- alpha.init
elastic net mixing parameter for initial regressions, default: 0.95 (lasso-like elastic net)
- alpha
elastic net mixing parameter for final regressions, default: 1 (lasso)
- type
character, default: "exp", which scales weights with \(w_{ext}^{v_{ext}}+w_{int}^{v_{int}}\) (see internal function construct_penfacs for details; a minimal sketch follows this argument list)
- nfolds
number of internal cross-validation folds, default: 10 (10-fold cross-validation)
- cands
candidate values for both scaling parameters, default: NULL (i.e., {0, 0.2, 0.4, 0.6, 0.8, 1})
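The default type = "exp" combines the stage 1 weights as sketched below. This is a minimal illustration under stated assumptions: the weight vectors w_ext and w_int are hypothetical, and the internal construct_penfacs may convert combined weights to penalty factors differently (the adaptive-lasso-style 1/weight hinted at in the comments is an assumption, not the package's documented rule).

# Minimal sketch of the "exp" weighting (not the package's exact construct_penfacs):
# raise external and internal weights to candidate exponents and sum them.
w_ext <- c(0.0, 0.3, 1.2)  # hypothetical external (source) weights, one per feature
w_int <- c(0.5, 0.0, 0.8)  # hypothetical internal (target) weights
cands <- seq(from = 0, to = 1, by = 0.2)           # default candidate exponents
grid <- expand.grid(v.ext = cands, v.int = cands)  # all exponent combinations
weights <- sapply(seq_len(nrow(grid)), function(i) {
  w_ext^grid$v.ext[i] + w_int^grid$v.int[i]
})
dim(weights)  # 3 features x 36 exponent combinations
# Larger combined weights plausibly map to smaller penalty factors in the
# stage 2 regressions (e.g., 1/weight, adaptive-lasso style; an assumption).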
Value
Returns an object of class sparselink, a list with multiple slots:
- Stage 1 regressions (before sharing information): slot glm.one contains \(q\) objects of type cv.glmnet (one for each problem).
- Candidate scaling parameters (exponents): slot weight contains a data frame with \(n\) combinations of exponents for the external (source) and internal (target) weights.
- Stage 2 regressions (after sharing information): slot glm.two contains \(q\) lists (one for each problem) of \(n\) objects of type cv.glmnet (one for each combination of exponents).
- Optimal regularisation parameters: slot lambda.min contains the cross-validated regularisation parameters for the stage 2 regressions.
- Optimal scaling parameters: slots weight.ind and weight.min indicate or contain the cross-validated scaling parameters.
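For orientation, a minimal sketch of fitting on simulated data and reading these slots (slot names as listed above; their exact contents may vary across package versions):

set.seed(1)
x <- matrix(data = rnorm(100 * 200), nrow = 100, ncol = 200)  # n = 100, p = 200
y <- matrix(data = rnorm(100 * 3), nrow = 100, ncol = 3)      # q = 3 targets
fit <- sparselink(x = x, y = y, family = "gaussian")
length(fit$glm.one)  # q stage 1 fits of class cv.glmnet
head(fit$weight)     # candidate exponent combinations (external/internal)
fit$lambda.min       # cross-validated lambdas for the stage 2 regressions
fit$weight.min       # selected scaling parameters (exponents)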
References
Armin Rauschenberger, Petr N. Nazarov, and Enrico Glaab (2025). "Estimating sparse regression models in multi-task learning and transfer learning through adaptive penalisation". Under revision. https://hdl.handle.net/10993/63425
Examples
#--- multi-task learning ---
n <- 100
p <- 200
q <- 3
family <- "gaussian"
x <- matrix(data=rnorm(n=n*p),nrow=n,ncol=p)
y <- matrix(data=rnorm(n=n*q),nrow=n,ncol=q)
object <- sparselink(x=x,y=y,family=family)
#> mode: multi-target learning, alpha.init=0.95 (elastic net), alpha=1 (lasso)
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#--- transfer learning ---
n <- c(100,50)
p <- 200
x <- lapply(X=n,function(x) matrix(data=stats::rnorm(n=x*p),nrow=x,ncol=p))
y <- lapply(X=n,function(x) stats::rnorm(x))
family <- "gaussian"
object <- sparselink(x=x,y=y,family=family)
#> mode: transfer learning, alpha.init=0.95 (elastic net), alpha=1 (lasso)
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
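Prediction and coefficient extraction are not shown above. Assuming the package provides glmnet-style predict() and coef() methods for sparselink objects (an assumption; check the package index, and note that the newx argument name is borrowed from glmnet conventions), downstream use might look like this, continuing from the transfer learning example:

x_new <- lapply(X=n,function(x) matrix(data=stats::rnorm(n=x*p),nrow=x,ncol=p))
y_hat <- predict(object,newx=x_new)  # assumed method and argument name
beta <- coef(object)                 # assumed coefficient extractor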