jalapic jalapic - 1 month ago 7
R Question

controlling color of factor group in ggvis - r

I have a question about controlling the color of datapoints in ggvis.

I have a data frame that I filter in several ways (inside a shiny app, in case it matters). After filtering, one or more levels of the group I color the data points by are often missing from the result. The same group then gets different colors in different plots, which is confusing.

This is a pretty close example:

library(ggvis)
library(dplyr)  # for filter(), used further down

set.seed(101)
dfvis <- data.frame(x = runif(20), y = runif(20), mygroup = LETTERS[1:5])
dfvis


dfvis %>%
ggvis(x= ~x, y= ~y) %>%
layer_points(fill = ~factor(mygroup))



Let's filter one group out:

dfvis <- dfvis %>% filter(mygroup!="A")

dfvis %>%
ggvis(x= ~x, y= ~y) %>%
layer_points(fill = ~factor(mygroup))



Here, "B" is now blue and all other groups shift up one in terms of the color order.
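
To see where the shift comes from, here is a minimal base-R sketch. It assumes the fill scale hands out colors by level position, which matches the behavior described above; the vectors `all_groups` and `filtered_groups` are hypothetical stand-ins for the `mygroup` column before and after filtering.

```r
# Hypothetical stand-ins for the mygroup column before and after filtering.
all_groups <- rep(LETTERS[1:5], times = 4)
filtered_groups <- all_groups[all_groups != "A"]

# factor() rebuilds the level set from the values that are actually present.
levels_before <- levels(factor(all_groups))
levels_after <- levels(factor(filtered_groups))

# Position of "B" among the levels, before and after the filter.
b_before <- match("B", levels_before)
b_after <- match("B", levels_after)

print(levels_before)
print(levels_after)
print(c(before = b_before, after = b_after))
```

Because `factor()` is called inside `layer_points()`, the levels are recomputed on every plot, so any group that sorts after a missing one moves to an earlier position.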

Is there a way, when applying multiple filters to the same data frame, to ensure that each level of the factor/group always gets the same color?

One trick that has worked in ggplot before is to append one NA observation to the data frame for each missing factor level. At first glance this works, as the colors are back in the right order, but notice the rogue data point in the top left!

dfvis1 <- rbind(dfvis, data.frame(x=NA, y=NA, mygroup="A"))

dfvis1 %>%
ggvis(x= ~x, y= ~y) %>%
layer_points(fill = ~factor(mygroup))



All help appreciated.

Answer

Solution 1

It seems as if I overlooked a very easy solution:

Just redefine the levels of the factor explicitly, and drop the factor() call from fill =, so that the levels are no longer recomputed from the filtered data:

I'll leave this up as it might help someone else.

dfvis$mygroup<-factor(dfvis$mygroup, levels=c("A", "B", "C", "D", "E"))

dfvis %>% 
  ggvis(x= ~x, y= ~y)  %>% 
  layer_points(fill = ~mygroup)

Solution 2

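This may generalize better for ggvis users. We can take advantage of = versus :=. First, make a new variable holding the color for each group. Looking the color up by group name, rather than relying on vector recycling, means the assignment still works after rows have been filtered out:

group_colors <- c(A = "blue", B = "orange", C = "green", D = "red", E = "purple")
dfvis$color <- unname(group_colors[as.character(dfvis$mygroup)])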

Then we can use the unscaled raw value of colors with fill:= inside the ggvis function...

# := sets an unscaled raw value; = would map the value through a scale
dfvis %>% 
  ggvis(x= ~x, y= ~y, fill:= ~color)  %>% 
  layer_points()

This will ensure color consistency even after filtering out other groups.
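
As a quick sanity check of that claim, the sketch below uses base R only and compares coloring before filtering with coloring after filtering. The names `group_colors`, `add_colors`, `color_then_filter` and `filter_then_color` are hypothetical and introduced here for illustration; the lookup-by-name step is an assumption about how the color column is built, not part of ggvis.

```r
# Hypothetical palette keyed by group name (same colors as in Solution 2).
group_colors <- c(A = "blue", B = "orange", C = "green", D = "red", E = "purple")

set.seed(101)
df <- data.frame(x = runif(20), y = runif(20), mygroup = LETTERS[1:5],
                 stringsAsFactors = FALSE)

# Hypothetical helper: attach each row's color by looking up its group name.
add_colors <- function(d) {
  d$color <- unname(group_colors[as.character(d$mygroup)])
  d
}

# Order 1: assign colors first, then filter.
color_then_filter <- add_colors(df)
color_then_filter <- color_then_filter[color_then_filter$mygroup != "A", ]

# Order 2: filter first, then assign colors (closer to a reactive shiny filter).
filter_then_color <- add_colors(df[df$mygroup != "A", ])

# Both orders should give every remaining row the same color.
print(identical(color_then_filter$color, filter_then_color$color))
print(unique(filter_then_color[, c("mygroup", "color")]))
```

Either data frame can then be passed to ggvis with fill := ~color as above; since the colors are raw values, the set of groups present in the data no longer affects them.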
