This function fits a semi-supervised mixture model. It simultaneously estimates two mixture components and assigns the unlabelled observations to them.
mixtura(y, z, dist = "norm", phi = NULL, pi = NULL, gamma = NULL, test = NULL, iter = 100, kind = 0.05, debug = TRUE, ...)
Argument | Description
---|---
y | observations: numeric vector of length
z | class labels: integer vector of length
dist | distributional assumption: character
phi | dispersion parameters: numeric vector of length
pi | zero-inflation parameter(s): numeric vector of length
gamma | offset: numeric vector of length
test | resampling procedure: character
iter | (maximum) number of resampling iterations: positive integer, or
kind | resampling accuracy: numeric between
debug | verification of arguments:
... | settings
This function fits and compares a one-component (H0) and a two-component (H1) mixture model.
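To make the two-component fit more concrete, here is a minimal base R sketch of a semi-supervised EM algorithm for a normal mixture. It is an illustration under stated assumptions, not the package's implementation: the helper `em_sketch` and its `tol` argument are hypothetical, known labels are held fixed, and unlabelled points start from a median split.

```r
# Hypothetical helper for illustration only; not the package's implementation.
em_sketch <- function(y, z, iter = 100, tol = 1e-6) {
  lab <- !is.na(z)  # TRUE where the class label is known
  # Starting weights: known labels are kept, unlabelled points split at the median
  w <- ifelse(lab, z, as.numeric(y > median(y)))
  loglik <- numeric(0)
  for (i in seq_len(iter)) {
    # M-step: weighted maximum likelihood estimates
    p1  <- mean(w)
    mu0 <- sum((1 - w) * y) / sum(1 - w)
    mu1 <- sum(w * y) / sum(w)
    sd0 <- sqrt(sum((1 - w) * (y - mu0)^2) / sum(1 - w))
    sd1 <- sqrt(sum(w * (y - mu1)^2) / sum(w))
    # Weighted component densities at the current estimates
    d0 <- (1 - p1) * dnorm(y, mean = mu0, sd = sd0)
    d1 <- p1 * dnorm(y, mean = mu1, sd = sd1)
    # Log-likelihood: labelled points use their own component, unlabelled the mixture
    loglik[i] <- sum(log(ifelse(lab, ifelse(z == 1, d1, d0), d0 + d1)))
    # E-step: update the weights of unlabelled points only
    w <- ifelse(lab, z, d1 / (d0 + d1))
    if (i > 1 && abs(loglik[i] - loglik[i - 1]) < tol) break
  }
  list(posterior = w, converge = loglik,
       estim1 = data.frame(p0 = 1 - p1, mean0 = mu0, sd0 = sd0,
                           p1 = p1, mean1 = mu1, sd1 = sd1))
}

# Simulated data with two well-separated classes and mostly missing labels
set.seed(1)
y_demo <- c(rnorm(50, mean = 0, sd = 1), rnorm(50, mean = 4, sd = 1))
z_demo <- rep(0:1, each = 50)
z_demo[c(11:50, 61:100)] <- NA
fit_demo <- em_sketch(y_demo, z_demo)
fit_demo$estim1
```

The returned list reuses the element names `posterior`, `converge` and `estim1` only to ease comparison with the output of `mixtura`; the sketch covers the normal case and performs no resampling test.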
Value | Description
---|---
posterior | probability of belonging to class 1: numeric vector of length n
converge | path of the log-likelihood: numeric vector with maximum length it.em
estim0 | parameter estimates under H0: data frame
estim1 | parameter estimates under H1: data frame
loglik0 | log-likelihood under H0: numeric
loglik1 | log-likelihood under H1: numeric
lrts | likelihood-ratio test statistic: positive numeric
p.value | p-value for testing H0 versus H1: numeric between 0 and 1, or NULL
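The returned log-likelihoods and the test statistic appear to be linked by the usual likelihood-ratio formula, twice the difference between the H1 and H0 log-likelihoods. The short check below assumes that definition and uses the rounded values printed in the example output on this page; the variable names are chosen for illustration.

```r
# Values copied from the example output ($loglik0, $loglik1, $lrts)
loglik0 <- -134.9813
loglik1 <- -134.5664
lrts_reported <- 0.8297895

# Assumed definition: twice the log-likelihood difference
lrts <- 2 * (loglik1 - loglik0)

# Agreement up to the rounding of the printed log-likelihoods
all.equal(lrts, lrts_reported, tolerance = 1e-3)
```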
By default, phi and pi are estimated by the maximum likelihood method, and gamma is replaced by a vector of ones.
A Rauschenberger, RX Menezes, MA van de Wiel, NM van Schoor, and MA Jonker (2020). "Semi-supervised mixture test for detecting markers associated with a quantitative trait", Manuscript in preparation.
# data simulation
n <- 100
z <- rep(0:1, each = n/2)
y <- rnorm(n = n, mean = 2, sd = 1)
z[(n/4):n] <- NA

# model fitting
mixtura(y, z, dist = "norm", test = "perm")
#> $posterior
#>   [1] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
#>   [8] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
#>  [15] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
#>  [22] 0.0000000 0.0000000 0.0000000 0.5847103 0.5124022 0.6710150 0.7145550
#>  [29] 0.6353941 0.5111437 0.5245252 0.6564112 0.5513211 0.6156385 0.7259136
#>  [36] 0.5135151 0.6892393 0.5938403 0.3489121 0.5357646 0.5921529 0.5082185
#>  [43] 0.3033514 0.4582151 0.4585708 0.5627931 0.5788645 0.3543564 0.5564960
#>  [50] 0.6498360 0.5270052 0.4829803 0.5348777 0.5795999 0.5069320 0.6916956
#>  [57] 0.7726151 0.4988517 0.5904737 0.5509167 0.6037588 0.5191042 0.5314533
#>  [64] 0.5233184 0.6672887 0.6109655 0.5894274 0.6037971 0.5958933 0.5769878
#>  [71] 0.4481745 0.4886986 0.4535777 0.6025104 0.6098765 0.4845336 0.6961400
#>  [78] 0.4827533 0.4660534 0.4627152 0.5541518 0.6367138 0.5137884 0.3828986
#>  [85] 0.4508566 0.5926182 0.6911656 0.7089736 0.5254296 0.4207828 0.5654328
#>  [92] 0.6932749 0.5066595 0.4976020 0.6301265 0.4756116 0.6041247 0.5584574
#>  [99] 0.5494746 0.5086801
#>
#> $converge
#>  [1] -135.2916 -134.9897 -134.8180 -134.7161 -134.6553 -134.6191 -134.5977
#>  [8] -134.5850 -134.5776 -134.5731 -134.5704 -134.5688 -134.5678 -134.5672
#> [15] -134.5669 -134.5666 -134.5665 -134.5664
#>
#> $estim0
#>   p0    mean0      sd0 p1 mean1 sd1
#> 1  1 1.918567 0.933193  0   NaN NaN
#>
#> $estim1
#>          p0    mean0       sd0        p1    mean1       sd1
#> 1 0.4445477 2.064563 0.9185471 0.5554523 1.718718 0.9173225
#>
#> $loglik0
#> [1] -134.9813
#>
#> $loglik1
#> [1] -134.5664
#>
#> $lrts
#> [1] 0.8297895
#>
#> $p.value
#> [1] 0.6666667
#>
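If hard class assignments are needed, one option is to threshold the posterior probabilities. The sketch below assumes a cut-off of 0.5, which is an illustrative choice and not something the function prescribes, and uses a few values copied from `$posterior` above.

```r
# A few posterior probabilities copied from the example output
posterior <- c(0.0000000, 0.5847103, 0.3489121, 0.7726151)

# Assumed rule: assign class 1 when the posterior probability exceeds 0.5
class_hat <- as.integer(posterior > 0.5)
class_hat
#> [1] 0 1 0 1
```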