Simulates data for multi-task learning and transfer learning.
Usage
sim_data_multi(
prob.common = 0.05,
prob.separate = 0.05,
q = 3,
n0 = 100,
n1 = 10000,
p = 200,
rho = 0.5,
family = "gaussian"
)
sim_data_trans(
prob.common = 0.05,
prob.separate = 0.05,
q = 3,
n0 = c(50, 100, 200),
n1 = 10000,
p = 200,
rho = 0.5,
family = "gaussian"
)
Arguments
- prob.common
probability of common effect (number between 0 and 1)
- prob.separate
probability of separate effect (number between 0 and 1)
- q
number of datasets: integer
- n0
number of training samples: integer for sim_data_multi (shared by all datasets), integer vector of length \(q\) for sim_data_trans (one entry per dataset)
- n1
number of testing samples for all datasets: integer
- p
number of features: integer
- rho
correlation (for decreasing structure)
- family
character: "gaussian" or "binomial"
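To make the roles of these arguments concrete, here is a minimal base-R sketch of one data-generating process that is consistent with the argument descriptions above. It is not the package's implementation. The helper name sketch_sim, the correlation form rho^|j - k| for the "decreasing structure", the standard normal effect sizes, and the unit noise variance are all assumptions made for illustration.

```r
# Hypothetical helper for illustration only; NOT the package's code.
sketch_sim <- function(prob.common = 0.05, prob.separate = 0.05, q = 3,
                       n = 100, p = 200, rho = 0.5,
                       family = c("gaussian", "binomial")) {
  family <- match.arg(family)

  # Assumption: "decreasing structure" means cor(X_j, X_k) = rho^|j - k|.
  sigma <- rho^abs(outer(seq_len(p), seq_len(p), "-"))

  # Correlated features: independent normals times the Cholesky factor.
  X <- matrix(rnorm(n * p), nrow = n, ncol = p) %*% chol(sigma)

  # Common effects are shared by all q datasets (one indicator per feature);
  # separate effects are drawn independently for each feature and dataset.
  common <- rbinom(p, size = 1, prob = prob.common)
  separate <- matrix(rbinom(p * q, size = 1, prob = prob.separate),
                     nrow = p, ncol = q)
  active <- (common + separate) > 0  # p x q logical matrix

  # Assumption: non-zero effects are standard normal.
  beta <- active * matrix(rnorm(p * q), nrow = p, ncol = q)

  # Linear predictor, one column per dataset.
  eta <- X %*% beta
  if (family == "gaussian") {
    # Assumption: unit-variance Gaussian noise.
    y <- eta + matrix(rnorm(n * q), nrow = n, ncol = q)
  } else {
    # Bernoulli outcomes with success probability plogis(eta).
    y <- matrix(rbinom(n * q, size = 1, prob = plogis(eta)),
                nrow = n, ncol = q)
  }
  list(y = y, X = X, beta = beta)
}

set.seed(1)
sim <- sketch_sim()
dim(sim$X)
mean(sim$beta != 0)  # share of non-zero effects
```

Under these assumptions, raising prob.common makes the columns of beta share more of their support, while raising prob.separate makes them differ more.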
Value
Multi-task learning: Returns a list with slots y_train (\(n_0 \times q\) matrix), X_train (\(n_0 \times p\) matrix), y_test (\(n_1 \times q\) matrix), X_test (\(n_1 \times p\) matrix), and beta (\(p \times q\) matrix).
Transfer learning: Returns a list with slots y_train (\(q\) vectors) and X_train (\(q\) matrices with \(p\) columns) for training data, y_test (\(q\) vectors) and X_test (\(q\) matrices with \(p\) columns) for testing data, and beta for effects (\(p \times q\) matrix).
Examples
#--- multi-task learning ---
data <- sim_data_multi()
sapply(X=data,FUN=dim)
#>      y_train X_train y_test X_test beta
#> [1,]     100     100  10000  10000  200
#> [2,]       3     200      3    200    3
#--- transfer learning ---
data <- sim_data_trans()
sapply(X=data$y_train,FUN=length)
#> [1] 50 100 200
sapply(X=data$X_train,FUN=dim)
#>      [,1] [,2] [,3]
#> [1,]   50  100  200
#> [2,]  200  200  200
sapply(X=data$y_test,FUN=length)
#> [1] 10000 10000 10000
sapply(X=data$X_test,FUN=dim)
#>       [,1]  [,2]  [,3]
#> [1,] 10000 10000 10000
#> [2,]   200   200   200
dim(data$beta)
#> [1] 200 3