 NBC -3 years ago 82
R Question

# Complicated table calculations [R]

I have a dataframe df_workingFile:

``````group  | id  | flag | value |
a     |  a  |  0   |  5    |
a     |  c  |  0   |  2    |
b     |  b  |  0   |  8    |
b     |  e  |  1   |  1    |
``````

I want to sum the value column based on group, but only on the row where group = id (there can only be one of these instances by group). I also don't want to include any values in the sum that have a flag:

``````group  | id  | flag | value | ValueSum
a     |  a  |  0   |  5    |   7
a     |  c  |  0   |  2    |   0
b     |  b  |  0   |  8    |   8
b     |  e  |  1   |  1    |   0
``````

Input data:

``````df_workingFile <- structure(list(group = structure(c(1L, 1L, 2L, 2L), .Label = c("a",
"b"), class = "factor"), id = structure(c(1L, 3L, 2L, 4L), .Label = c("a",
"b", "c", "e"), class = "factor"), flag = c(0, 0, 0, 1), value = c(5,
2, 8, 1)), .Names = c("group", "id", "flag", "value"), row.names = c(NA,
4L), class = "data.frame")
`````` be_green

This is quite simple with the `data.table` package. Here is my solution:

``````library(data.table)

temp <- data.table(group = c("a","a","b","b"),
id = c("a","c","b","d"),
flag = c(0,0,0,1),
value = c(5,2,8,1))

temp[flag != 1,ValueSum := ifelse(group == id,sum(value), NA), by = group]
``````

Before the first comma, you are able to specify a logical condition. In the middle you can do column assignment (I'm sure there's a cleaner solution in this case, but it works). After the second comma, you can specify a group-by statement. I would highly recommend getting familiar with the package, it has made my life far easier.

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download