Installation
Install the current release from CRAN:
install.packages("joinet")
Or install the latest development version from GitHub:
#install.packages("devtools")
devtools::install_github("rauschenberger/joinet")
Initialisation
Load and attach the package:
And access the documentation:
?joinet
help(joinet)
browseVignettes("joinet")
Simulation
For n
samples, we simulate p
inputs
(features, covariates) and q
outputs (outcomes, responses).
We assume high-dimensional inputs (p
n
) and low-dimensional outputs (q
n
).
n <- 100
q <- 2
p <- 500
We simulate the p
inputs from a multivariate normal
distribution. For the mean, we use the p
-dimensional vector
mu
, where all elements equal zero. For the covariance, we
use the p
p
matrix Sigma
, where the entry in row
and column
equals
rho
.
The parameter rho
determines the strength of the
correlation among the inputs, with small rho
leading weak
correlations, and large rho
leading to strong correlations
(0 < rho
< 1). The input matrix X
has
n
rows and p
columns.
mu <- rep(0,times=p)
rho <- 0.90
Sigma <- rho^abs(col(diag(p))-row(diag(p)))
X <- MASS::mvrnorm(n=n,mu=mu,Sigma=Sigma)
We simulate the input-output effects from independent Bernoulli
distributions. The parameter pi
determines the number of
effects, with small pi
leading to few effects, and large
pi
leading to many effects (0 < pi
< 1).
The scalar alpha
represents the intercept, and the
p
-dimensional vector beta
represents the
slopes.
pi <- 0.01
alpha <- 0
beta <- rbinom(n=p,size=1,prob=pi)
From the intercept alpha
, the slopes beta
and the inputs X
, we calculate the linear predictor, the
n
-dimensional vector eta
. Rescale the linear
predictor to make the effects weaker or stronger. Set the argument
family
to "gaussian"
, "binomial"
,
or "poisson"
to define the distribution. The n
times p
matrix Y
represents the outputs. We
assume the outcomes are positively correlated.
eta <- alpha + X %*% beta
eta <- 1.5*scale(eta)
family <- "gaussian"
if(family=="gaussian"){
mean <- eta
Y <- replicate(n=q,expr=rnorm(n=n,mean=mean))
}
if(family=="binomial"){
prob <- 1/(1+exp(-eta))
Y <- replicate(n=q,expr=rbinom(n=n,size=1,prob=prob))
}
if(family=="poisson"){
lambda <- exp(eta)
Y <- replicate(n=q,expr=rpois(n=n,lambda=lambda))
}
cor(Y)
Application
The function joinet
fits univariate and multivariate
regression. Set the argument alpha.base
to 0 (ridge) or 1
(lasso).
object <- joinet(Y=Y,X=X,family=family)
Standard methods are available. The function predict
returns the predicted values from the univariate (base
) and
multivariate (meta
) models. The function coef
returns the estimated intercepts (alpha
) and slopes
(beta
) from the multivariate model (input-output effects).
And the function weights
returns the weights from stacking
(output-output effects).
The function cv.joinet
compares the predictive
performance of univariate (base
) and multivariate
(meta
) regression by nested cross-validation. The argument
type.measure
determines the loss function.
cv.joinet(Y=Y,X=X,family=family)
## [,1] [,2]
## base 1.196388 1.606624
## meta 1.170356 1.262356
## none 3.249320 3.496644
Reference
Armin Rauschenberger and Enrico Glaab (2021). “Predicting correlated outcomes from molecular data”. Bioinformatics 37(21):3889–3895. doi: 10.1093/bioinformatics/btab576. (Click here to access PDF.)