Jesse001 - 18 days ago
R Question

How do I automate finding the coefficient of variation for multiple categories?

In my data I have 1000 measures for each spatial unit and would like to plot the coefficient of variation of each of these units. I know how to calculate the coefficient of variation for the entire data set, but how would I:

1) Create a function that will grab all category names (unique values in a column).

2) Apply the CV function to only the data in each category.

3) Output the results so they can be plotted as x=category and y=CV

The iris data set can be used as an example. Let's say I'd like to know the coefficient of variation of petal length for each species. The CV itself is simple enough, but I'm at a loss for the rest of it.

data(iris)
CV<-function(mean,sd){
(sd/mean)*100
}
IrisCV<-CV(mean=mean(iris$Petal.Length), sd=sd(iris$Petal.Length))
IrisCV


Any help is much appreciated!

Answer

First, change your function so that it takes a single vector and computes the mean and standard deviation itself:

CV <- function(x){
        (sd(x)/mean(x))*100
}
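
As a quick sanity check (a sketch, not part of the original answer), the one-argument version should give the same whole-data-set value as the two-argument call from the question. `CV_old`, `whole_new` and `whole_old` are hypothetical names used only for this comparison.

```r
# One-argument version: takes the raw numeric vector.
CV <- function(x) {
  (sd(x) / mean(x)) * 100
}

# Two-argument version from the question, renamed (hypothetical name)
# so that both definitions can coexist.
CV_old <- function(mean, sd) {
  (sd / mean) * 100
}

whole_new <- CV(iris$Petal.Length)
whole_old <- CV_old(mean = mean(iris$Petal.Length),
                    sd = sd(iris$Petal.Length))

# Both compute the same ratio, so they should agree.
all.equal(whole_new, whole_old)
```

The single-vector form matters because functions such as aggregate() pass each group's values to FUN as one argument.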

Then you can use aggregate():

aggregate(Petal.Length ~ Species, 
          data = iris,
          FUN = CV)
#     Species Petal.Length
#1     setosa    11.878522
#2 versicolor    11.030774
#3  virginica     9.940466
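
For the plotting step (x = category, y = CV), one option is to store the aggregate() result and hand it to base R's barplot(). This is only a minimal sketch: the name `cv_by_species` and the axis labels are illustrative choices, not something from the question.

```r
CV <- function(x) {
  (sd(x) / mean(x)) * 100
}

# One row per species; the second column holds the CV of Petal.Length.
cv_by_species <- aggregate(Petal.Length ~ Species, data = iris, FUN = CV)

# Bar chart with categories on the x-axis and CV (%) on the y-axis.
barplot(cv_by_species$Petal.Length,
        names.arg = as.character(cv_by_species$Species),
        xlab = "Species",
        ylab = "CV of petal length (%)")
```

aggregate() finds the unique values of Species on its own, so no separate step is needed to collect the category names.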