newintern - 1 month ago
R Question

R: mix() in mixdist package returning error

I have installed the mixdist package in R to combine distributions. Specifically, I'm using the mix() function. See documentation. I'm getting the following error:

Error in nlm(mixlike, lmixdat = mixdat, lmixpar = fitpar, ldist = dist, :
missing value in parameter


I googled the error message, but no useful results popped up.

My first argument to mix() is a data frame called data.df. It is formatted exactly like the built-in data set pike65. I also did data.df <- as.mixdata(data.df).

My second argument has two rows. It is a data frame called datapar, formatted exactly like pikepar. My pi values are 0.5 and 0.5. My mu values are 250 and 463 (based on my data set). My sigma values are 0.5 and 1.

My call to mix() looks like:

fitdata <- mix(data.df, datapar, "norm", constr = mixconstr(consigma="CCV"), emsteps = 3, print.level = 2)


The printed output shows that my pi values go from 0.5 to NaN after the first iteration, and that my gradient is becoming 0.

I would appreciate any help in sorting out this error.




Thanks,
n.i.

Answer

Using the test data you linked to

library(mixdist)
time <- seq(673, 723)
counts <- c(3,12,8,12,18,24,39,48,64,88,101,132,198,253,331,
   419,563,781,1134,1423,1842,2505,374,6099,9343,13009,
   15097,13712,9969,6785,4742,3626,3794,4737,5494,5656,4806,
   3474,2165,1290,799,431,213,137,66,57,41,35,27,27,27)
data.df <- data.frame(time=time, counts=counts)

We can see that

data.mix <- as.mixdata(data.df)  # convert to the mixdata class that mix() expects
startparam <- mixparam(c(699,707), 1)
data.fit <- mix(data.mix, startparam, "norm")

gives the same error. The error appears to be closely tied to the data, so the reason this data set fails may differ from the reason yours fails, but this is the only example you offered.

The problem with this data is that, at some point, the probabilities under the two groups become indistinguishable. When that happens, the "E" step of the algorithm cannot estimate the pi variable properly. Here

pnorm(717,707,1)
# [1] 1
pnorm(717,699,1)
# [1] 1

both are exactly 1, and this seems to be causing the error. When mix() takes 1 minus these values and uses their ratio to estimate group membership, it gets NaN values, which propagate to the estimated proportions. When these NaN values are passed internally to nlm() to do the estimation, you get the error message

Error in nlm(mixlike, lmixdat = mixdat, lmixpar = fitpar, ldist = dist,  : 
  missing value in parameter
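
The arithmetic behind those NaN values can be reproduced in base R. The sketch below only illustrates the failure mode described above; it is not mixdist's actual internal code, and the variable names and the equal 0.5 proportions are assumptions made for the example.

```r
# Upper-tail mass beyond 717 for each component (means 699 and 707, sd 1).
# pnorm() rounds to exactly 1 in double precision, so both tails become 0.
tail_naive <- 1 - pnorm(717, mean = c(699, 707), sd = 1)

# Assumed equal mixing proportions, for illustration only.
prop <- c(0.5, 0.5)

# Membership probabilities for that tail: 0 / 0, which is NaN.
post_naive <- prop * tail_naive / sum(prop * tail_naive)
print(post_naive)

# Requesting the upper tail directly avoids the cancellation:
# the values are tiny but strictly positive, so the ratio is well defined.
tail_direct <- pnorm(717, mean = c(699, 707), sd = 1, lower.tail = FALSE)
post_direct <- prop * tail_direct / sum(prop * tail_direct)
print(post_direct)
```

The point is that the NaN comes from subtracting two numbers that are both rounded to 1, not from the data being invalid.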

The same error message can be replicated with

f <- function(x) sum((x-1:length(x))^2)
nlm(f, c(10,10))
nlm(f, c(10,NaN)) #error

So it appears the mixdist package will not work in this scenario. You may wish to contact the package maintainer to see if they are aware of the problem. In the meantime, you will need to find another way to estimate the parameters of your mixture model.
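
One possible workaround, sketched here with base R only, is to maximize a mixture log-likelihood yourself with optim(). This is not what mixdist does: it assumes each time value can be treated as a point observation repeated counts times (rather than a grouped interval), and the helper names log_sum_exp and neg_loglik are hypothetical. Working on the log scale keeps the tails from collapsing to 0.

```r
time <- seq(673, 723)
counts <- c(3,12,8,12,18,24,39,48,64,88,101,132,198,253,331,
   419,563,781,1134,1423,1842,2505,374,6099,9343,13009,
   15097,13712,9969,6785,4742,3626,3794,4737,5494,5656,4806,
   3474,2165,1290,799,431,213,137,66,57,41,35,27,27,27)

# log(exp(a) + exp(b)) computed without underflow (hypothetical helper).
log_sum_exp <- function(a, b) {
  m <- pmax(a, b)
  m + log(exp(a - m) + exp(b - m))
}

# Average negative log-likelihood of a two-component normal mixture.
# theta = (logit(pi1), mu1, mu2, log(sigma1), log(sigma2)), so the
# optimizer works on an unconstrained scale.
neg_loglik <- function(theta, x, w) {
  p1 <- plogis(theta[1])
  l1 <- log(p1) + dnorm(x, theta[2], exp(theta[4]), log = TRUE)
  l2 <- log1p(-p1) + dnorm(x, theta[3], exp(theta[5]), log = TRUE)
  val <- -sum(w * log_sum_exp(l1, l2)) / sum(w)
  # Return a large finite penalty in degenerate regions instead of NaN/Inf.
  if (is.finite(val)) val else 1e10
}

# Same starting means and sigma as the mixparam() call above, pi = 0.5.
start <- c(qlogis(0.5), 699, 707, log(1), log(1))
fit <- optim(start, neg_loglik, x = time, w = counts, method = "BFGS")

# Transform back to the natural parameter scale.
est <- c(pi1 = plogis(fit$par[1]), mu1 = fit$par[2], mu2 = fit$par[3],
         sigma1 = exp(fit$par[4]), sigma2 = exp(fit$par[5]))
print(est)
```

Check fit$convergence and try a few different starting values before trusting the estimates; the result is only as good as the point-observation approximation.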