Lloyd Christmas - 8 months ago 35

R Question

I'm having a hard time describing this, so it's best explained with an example (as can probably be seen from the poor question title).

Using dplyr, I have the result of a `group_by` followed by a `summarize`.

As an example, here's a data frame that looks like the result of my dplyr operations:

```
> df <- data.frame(run=as.factor(c(rep(1,3), rep(2,3))),
                   group=as.factor(rep(c("a","b","c"),2)),
                   sum=c(1,8,34,2,7,33))
> df
  run group sum
1   1     a   1
2   1     b   8
3   1     c  34
4   2     a   2
5   2     b   7
6   2     c  33
```

I want to divide each `sum` by the total for its `run`, where the totals are stored in a second data frame:

```
> total <- data.frame(run=as.factor(c(1,2)),
                      total=c(45,47))
> total
  run total
1   1    45
2   2    47
```

Then my final data frame will look like this:

```
> df
  run group sum percent
1   1     a   1    1/45
2   1     b   8    8/45
3   1     c  34   34/45
4   2     a   2    2/47
5   2     b   7    7/47
6   2     c  33   33/47
```

Where I manually inserted the fraction in the `percent` column.

I know there is probably some dplyr way to do this with `mutate`.

Answer

(In base R)

You can use `total` as a look-up table to get the total for the run of each row in `df`:

```
total[df$run,'total']
[1] 45 45 45 47 47 47
```
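One caveat, as far as I understand R's indexing rules: subsetting with a factor uses the factor's integer codes, not its labels, so the look-up above works because the levels of `df$run` happen to line up with the row order of `total`. If that is not guaranteed, a more defensive sketch is to match on the run values explicitly. The names `idx` and `per_row_total` are only illustrative:

```r
# Same data as in the question
df <- data.frame(run = as.factor(c(rep(1, 3), rep(2, 3))),
                 group = as.factor(rep(c("a", "b", "c"), 2)),
                 sum = c(1, 8, 34, 2, 7, 33))
total <- data.frame(run = as.factor(c(1, 2)),
                    total = c(45, 47))

# For each row of df, find the row of total with the same run label.
# Comparing as character avoids relying on factor codes or row order.
idx <- match(as.character(df$run), as.character(total$run))

# Pull the matching total for every row of df
per_row_total <- total$total[idx]
per_row_total
```

With the data above this should give the same vector as `total[df$run,'total']`, and it keeps working if the rows of `total` are reordered.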

And you simply use it to divide the sum and assign the result to a new column:

```
df$percent <- df$sum / total[df$run,'total']
df
  run group sum    percent
1   1     a   1 0.02222222
2   1     b   8 0.17777778
3   1     c  34 0.75555556
4   2     a   2 0.04255319
5   2     b   7 0.14893617
6   2     c  33 0.70212766
```
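Since the question asks about a dplyr way with `mutate`, here is a minimal sketch of that route, assuming the dplyr package is installed. It joins `total` onto `df` by `run`, computes the ratio, and drops the helper column again. The name `result` is only illustrative:

```r
library(dplyr)

# Same data as in the question
df <- data.frame(run = as.factor(c(rep(1, 3), rep(2, 3))),
                 group = as.factor(rep(c("a", "b", "c"), 2)),
                 sum = c(1, 8, 34, 2, 7, 33))
total <- data.frame(run = as.factor(c(1, 2)),
                    total = c(45, 47))

result <- df %>%
  left_join(total, by = "run") %>%   # adds a total column matched on run
  mutate(percent = sum / total) %>%  # sum and total are columns here
  select(-total)                     # drop the helper column

result
```

The join matches on the values of `run` rather than on row positions, so it does not depend on how `total` is ordered. Note that `percent` holds a fraction between 0 and 1; multiply by 100 if an actual percentage is wanted.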