Petrroll Petrroll - 1 month ago 19
R Question

Boxplot sees values that aren't there

I have an IMDB dataset and am trying to make a boxplot of a film's ratings.

I've successfully loaded the dataset and tried to make the boxplot but it produced a really weird result.

It looked as if it tried to make a boxplot for all the films and not just the one selected.

boxplot(rating ~ title, data=imdb[imdb$title == "Top Gun (1986)", ])


The graph produced:
[Image: the resulting boxplot]

As you can see, the y axis looks as if it contains films that aren't in the filtered dataset at all (I selected the rows by title).

Answer

Factors retain their levels even after subsetting, and boxplot reserves a slot for every level of title, including the ones that no longer have any rows. You can drop the unused levels with droplevels:

boxplot(rating ~ title, data=droplevels(imdb[imdb$title == "Top Gun (1986)", ]))
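
To see the effect without the full dataset, here is a minimal sketch on a toy data frame. The name imdb_toy and its values are made up for illustration and are not the asker's data; title is created explicitly as a factor because recent versions of R no longer convert strings to factors by default.

```r
# Toy stand-in for the imdb data frame (hypothetical values, not the real dataset)
imdb_toy <- data.frame(
  title  = factor(c("Top Gun (1986)", "Top Gun (1986)", "Alien (1979)", "Heat (1995)")),
  rating = c(7, 6, 8, 8)
)

# Subsetting rows keeps every level of the factor, used or not
top_gun <- imdb_toy[imdb_toy$title == "Top Gun (1986)", ]

nlevels(top_gun$title)              # 3: the unused levels survive the subset
nlevels(droplevels(top_gun)$title)  # 1: only the level that is actually present

# With the unused levels dropped, only one box is drawn
boxplot(rating ~ title, data = droplevels(top_gun))
```

If only one film is being plotted, the grouping formula is not strictly needed; something like boxplot(top_gun$rating) should sidestep the factor levels entirely.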