Lloyd Christmas - 1 month ago
R Question

# Perform operations on a data frame based on a factor

I'm having a hard time describing this, so it's best explained with an example (as can probably be seen from the poor question title).

Using dplyr, I have the result of a `group_by` and `summarize`: a data frame that I want to manipulate further by factor.

As an example, here's a data frame that looks like the result of my dplyr operations:

```
> df <- data.frame(run=as.factor(c(rep(1,3), rep(2,3))),
+                  group=as.factor(rep(c("a","b","c"),2)),
+                  sum=c(1,8,34,2,7,33))
> df
  run group sum
1   1     a   1
2   1     b   8
3   1     c  34
4   2     a   2
5   2     b   7
6   2     c  33
```

I want to divide `sum` by a value that depends on `run`. For example, if I have:

```
> total <- data.frame(run=as.factor(c(1,2)),
+                     total=c(45,47))
> total
  run total
1   1    45
2   2    47
```

Then my final data frame will look like this:

```
> df
  run group sum percent
1   1     a   1    1/45
2   1     b   8    8/45
3   1     c  34   34/45
4   2     a   2    2/47
5   2     b   7    7/47
6   2     c  33   33/47
```

I wrote the fractions in the `percent` column by hand to show the operation I want to perform.

I know there is probably some dplyr way to do this with `mutate`, but I can't seem to figure it out right now. How would this be accomplished?

(In base R)

You can use `total` as a look-up table to get the total for each run in `df`:

```
total[df$run, 'total']
# [1] 45 45 45 47 47 47
```
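One caveat: indexing by a factor uses its integer codes rather than its printed labels, so the lookup above relies on the rows of `total` being in the same order as the levels of `df$run`. If that is not guaranteed, a minimal sketch of a variant that matches on the run labels with base R's `match()` could look like this (the names `idx` and `lookup` are only illustrative, and the rows of `total` are reversed on purpose):

```r
# Same example data as in the question
df <- data.frame(run = as.factor(c(rep(1, 3), rep(2, 3))),
                 group = as.factor(rep(c("a", "b", "c"), 2)),
                 sum = c(1, 8, 34, 2, 7, 33))

# Rows deliberately reversed to show that row order no longer matters
total <- data.frame(run = as.factor(c(2, 1)),
                    total = c(47, 45))

# Compare the run labels as character strings, not as factor codes
idx <- match(as.character(df$run), as.character(total$run))

# One total per row of df
lookup <- total$total[idx]
lookup
```

The resulting vector can be used in the division in the same way as `total[df$run, 'total']`.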

And you simply use it to divide the sum and assign the result to a new column:

```
df$percent <- df$sum / total[df$run, 'total']
df
#   run group sum    percent
# 1   1     a   1 0.02222222
# 2   1     b   8 0.17777778
# 3   1     c  34 0.75555556
# 4   2     a   2 0.04255319
# 5   2     b   7 0.14893617
# 6   2     c  33 0.70212766
```
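
Since the question asked about a dplyr way with `mutate`, here is a minimal sketch of one possible approach, assuming dplyr is installed: join `total` onto `df` by `run`, then divide. The name `result` is only illustrative, and this is one option rather than the only way to do it.

```r
library(dplyr)

# Same example data as in the question
df <- data.frame(run = as.factor(c(rep(1, 3), rep(2, 3))),
                 group = as.factor(rep(c("a", "b", "c"), 2)),
                 sum = c(1, 8, 34, 2, 7, 33))
total <- data.frame(run = as.factor(c(1, 2)),
                    total = c(45, 47))

result <- df %>%
  left_join(total, by = "run") %>%   # attach the matching total to each row
  mutate(percent = sum / total) %>%  # here `sum` and `total` are columns
  select(-total)                     # drop the helper column again
result
```

Because the join matches on the values of `run`, this does not depend on the order of the rows in `total`.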