user3702510 - 1 year ago 117
R Question

# ggplot mixture model R

I have a dataset with a numeric variable and a categorical variable. The distribution of the numeric variable differs between categories. I want to plot a density curve for each category so that each one lies visually below the density curve of the entire dataset.

This is similar to plotting the components of a mixture model, but without fitting the mixture model (I already know the categorical variable that splits the data).

If I let ggplot group by the categorical variable, each of the four curves (the overall one and one per species) is a proper density and integrates to one.

```r
library(ggplot2)
# overall density plus one density per species; each curve integrates to one
ggplot(iris, aes(x = Sepal.Width)) +
  geom_density() +
  geom_density(aes(x = Sepal.Width, group = Species, colour = Species))
```

What I want instead is for the density of each category to be a sub-density (not integrating to 1), similar to the following code (which I implemented for only two of the three iris species):

```r
library(data.table)
library(ggplot2)

myIris <- as.data.table(iris)

# density of the entire dataset
dens_entire <- density(myIris[, Sepal.Width], cut = 0)
dens_e <- data.table(x = dens_entire$x, y = dens_entire$y)

# density of the setosa subset
dens_setosa <- density(myIris[Species == 'setosa', Sepal.Width], cut = 0)
dens_sa <- data.table(x = dens_setosa$x, y = dens_setosa$y)

# density of the versicolor subset
dens_versicolor <- density(myIris[Species == 'versicolor', Sepal.Width], cut = 0)
dens_v <- data.table(x = dens_versicolor$x, y = dens_versicolor$y)

# plot the densities like mixture components; the divisors 2.5 and 1.65 are hard-coded
ggplot(dens_e, aes(x = x, y = y)) +
  geom_line() +
  geom_line(data = dens_sa, aes(x = x, y = y / 2.5, colour = 'setosa')) +
  geom_line(data = dens_v, aes(x = x, y = y / 1.65, colour = 'versicolor'))
```

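resulting in a plot with the two scaled species curves drawn under the overall density curve.

In the code above I hard-coded the divisors that shrink the y values. Is there a way to do this directly in ggplot, or to calculate the scaling factors?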

Do you mean something like this? Note that the y-axis then shows counts (density multiplied by the number of observations) rather than density.

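```r
library(ggplot2)
# after_stat(count) replaces the older ..count.. notation (ggplot2 >= 3.3.0)
ggplot(iris, aes(x = Sepal.Width)) +
  geom_density(aes(y = after_stat(count))) +
  geom_density(aes(x = Sepal.Width, y = after_stat(count),
                   group = Species, colour = Species))
```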
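To stay on the density scale while still shrinking each group, one possibility is to divide the count by the total number of rows, which weights each group's density by its share of the data. This is a minimal sketch, assuming ggplot2 3.3.0 or later for `after_stat()`; the names `n_total` and `p` are only for illustration.

```r
library(ggplot2)

# total number of observations, used to turn counts back into a density scale
n_total <- nrow(iris)

p <- ggplot(iris, aes(x = Sepal.Width)) +
  # overall density (integrates to one)
  geom_density() +
  # per-species density times n_group / n_total (integrates to the group share)
  geom_density(aes(y = after_stat(count) / n_total, colour = Species))

p
```

Each group still gets its own bandwidth here, so the coloured curves add up to the overall curve only approximately.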

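Another option, which works here because the three species are equally sized (each is one third of the data), is to divide each group density by 3: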

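```r
library(ggplot2)
# after_stat(density) replaces the older ..density.. notation (ggplot2 >= 3.3.0)
ggplot(iris, aes(x = Sepal.Width)) +
  geom_density(aes(y = after_stat(density))) +
  geom_density(aes(x = Sepal.Width, y = after_stat(density) / 3,
                   group = Species, colour = Species))
```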
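For the "calculate it" part of the question, the scaling factor for each category can be taken as its proportion of the data (n_group / n_total), which is 1/3 for every iris species. The sketch below does this by hand with base R `density()`. It assumes one shared bandwidth and one shared grid for all curves, which differs from ggplot's default of a separate bandwidth per group; the helper `scaled_density` and the other object names are hypothetical.

```r
library(ggplot2)

x <- iris$Sepal.Width
bw <- bw.nrd0(x)   # assumption: one bandwidth, from the full data, for every curve
rng <- range(x)

# density of `values` on the common grid, multiplied by `weight`
scaled_density <- function(values, weight, label) {
  d <- density(values, bw = bw, from = rng[1], to = rng[2], n = 512)
  data.frame(x = d$x, y = d$y * weight, component = label)
}

overall <- scaled_density(x, 1, "all")

# one sub-density per species, weighted by the species' share of the rows
groups <- split(x, iris$Species)
parts <- do.call(rbind, lapply(names(groups), function(g) {
  scaled_density(groups[[g]], length(groups[[g]]) / length(x), g)
}))

ggplot(parts, aes(x = x, y = y, colour = component)) +
  geom_line() +
  geom_line(data = overall, colour = "black")
```

With the shared bandwidth and grid, the three weighted curves sum to the black overall curve, which is the behaviour of mixture components. The same weights also apply to unequal group sizes, where dividing by a fixed 3 would not.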