MFR - 4 months ago 26

R Question

I have the following data

`path value`

1 b,b,a,c 3

2 c,b 2

3 a 10

4 b,c,a,b 0

5 e,f 0

6 a,f 1

`df <- data.frame(path = c("b,b,a,c", "c,b", "a", "b,c,a,b", "e,f", "a,f"), value = c(3, 2, 10, 0, 0, 1))`

I wish to compute, for each item, the number of rows that have a non-zero value and whose path does not contain that item.

`# desired output`

path value

1: b 2

2: a 1

3: c 2

4: e 4

5: f 3

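For instance, for `a`: the rows whose path does not contain `a` are rows 2 and 5, and only row 2 has a non-zero value, so the count for `a` is 1.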

I tried the following code, but the output for `b` is not correct:

`total <- sum(df$value != 0)`

library(splitstackshape)

#total number of non-zero values minus the number of non-zero values in rows containing this item

output <- cSplit(df, "path", ",", "long")[, .(value = total - sum(value != 0)), .(path)]

output

This code results in the following output, which is not correct for `b` (it gives 1 instead of 2):

`path value`

1: b 1

2: a 1

3: c 2

4: e 4

5: f 3

Answer

Read the unique items into `facs`, then use `grepl` to test each path for each item and count the rows where the item is absent and the value is non-zero:

```
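# unique items across all comma-separated paths
facs <- unique(scan(textConnection(as.character(df$path)), what = "", sep = ","))
# for each item, count the rows where it is absent and the value is non-zero
data.frame(path = facs,
           value = colSums(!sapply(facs, grepl, as.character(df$path)) & df$value != 0))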
```

giving:

```
  path value
b    b     2
a    a     1
c    c     2
e    e     4
f    f     3
```
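The attempt in the question undercounts `b` because `b` appears twice in row 1, so that row's non-zero value is subtracted twice after the split to long format. Separately, `grepl` matches substrings, which is fine for single-letter items but could give false matches if one item name were contained in another (for example `a` and `ab`). As a rough sketch of an alternative, assuming items are compared by exact match and repeats within a row count once, the same count can be done in base R with `strsplit` and `%in%`. The names `items_per_row`, `items`, `counts` and `result` are illustrative and not part of the original answer.

```r
df <- data.frame(path = c("b,b,a,c", "c,b", "a", "b,c,a,b", "e,f", "a,f"),
                 value = c(3, 2, 10, 0, 0, 1),
                 stringsAsFactors = FALSE)

# split each path into its items; unique() drops repeats within a row
items_per_row <- lapply(strsplit(df$path, ",", fixed = TRUE), unique)

# all distinct items, in order of first appearance
items <- unique(unlist(items_per_row))

# for each item, count the rows with a non-zero value that do not contain it
counts <- vapply(items, function(it) {
  absent <- !vapply(items_per_row, function(x) it %in% x, logical(1))
  sum(absent & df$value != 0)
}, integer(1))

result <- data.frame(path = items, value = counts, row.names = NULL)
result
```

On the example data this should reproduce the desired counts (`b` 2, `a` 1, `c` 2, `e` 4, `f` 3), with default row numbers instead of item names as row names.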