Mehdi Farhangian - 1 year ago

R Question

I want to count how often an event occurred and, when it did occur, whether it had a consequence. Let's assume this is my data:

`#mydata`

    a b c d consequence
    0 0 1 1 0
    1 0 1 1 1
    1 1 1 0 0
    0 0 0 1 0

So, for each variable, I calculate how many times it occurred and how many times it caused a consequence. An example for "a":

`numberofa = nrow(subset(mydata, a == 1))`

`numberofaeffective = nrow(subset(mydata, a == 1 & consequence == 1))`

How can I write a program to calculate these two metrics for each variable?

`#expected output`

    variable count count-with-effect
    a        2     1
    b        1     0
    c        3     1
    d        3     1

Answer Source

We can do this by taking the `sum` of a logical vector (the data is assumed to be stored as `dts`)

```
sum(dts$a==1)
#[1] 2
```

and

```
with(dts, sum(a==1 & consequence == 1))
#[1] 1
```
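To try these snippets, the question's data can be rebuilt as a data frame. The sketch below is only an assumed reconstruction: it stores the data as `dts` (the question calls it `mydata`) and uses a helper name, `is_a`, that is not in the original. It also shows why `sum` works here: the comparison returns a logical vector, and each `TRUE` counts as 1.

```r
# Assumed reconstruction of the question's data, named `dts` as in this answer
dts <- data.frame(
  a = c(0L, 1L, 1L, 0L),
  b = c(0L, 0L, 1L, 0L),
  c = c(1L, 1L, 1L, 0L),
  d = c(1L, 1L, 0L, 1L),
  consequence = c(0L, 1L, 0L, 0L)
)

# The comparison yields a logical vector, one element per row
is_a <- dts$a == 1
is_a

# sum() treats TRUE as 1 and FALSE as 0, so it counts the matching rows
sum(is_a)
```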

If we need it for each of the variables (i.e. 'a' to 'd')

```
colSums(dts[1:4] == 1)
# a b c d
# 2 1 3 3
```

and for the second count, where 'consequence' is also 1

```
colSums(dts[1:4] == 1 & (dts[5] == 1)[row(dts[1:4])])
#a b c d
#1 0 1 1
```
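The two `colSums` results can also be combined into a single table in base R. This is a minimal sketch rather than part of the original answer: it rebuilds the data as `dts`, and the names `vars`, `occurred`, `with_effect` and `result` are made up for illustration. It relies on R recycling the `consequence` vector down each column of the logical matrix, which works because the vector length equals the number of rows.

```r
# Assumed reconstruction of the question's data
dts <- data.frame(
  a = c(0L, 1L, 1L, 0L),
  b = c(0L, 0L, 1L, 0L),
  c = c(1L, 1L, 1L, 0L),
  d = c(1L, 1L, 0L, 1L),
  consequence = c(0L, 1L, 0L, 0L)
)

vars <- c("a", "b", "c", "d")

# Logical matrix: TRUE where the variable occurred
occurred <- dts[vars] == 1

# The consequence vector is recycled column by column, so row i of every
# column is combined with consequence[i]
with_effect <- occurred & (dts$consequence == 1)

# One row per variable, with both counts side by side
result <- data.frame(
  variable = vars,
  count = unname(colSums(occurred)),
  count_with_effect = unname(colSums(with_effect))
)
result
```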

If we need it in a specific format, we can `gather` the dataset into 'long' format, then group by 'variable' and `summarise` by `sum`ming the 'value' column

```
library(dplyr)
library(tidyr)
gather(dts, variable, value, -consequence) %>%
  group_by(variable) %>%
  summarise(count = sum(value), count_with_effect = sum(value & consequence))
# variable count count_with_effect
# <chr> <int> <int>
#1 a 2 1
#2 b 1 0
#3 c 3 1
#4 d 3 1
```
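In tidyr 1.0.0 and later, `gather` is superseded by `pivot_longer`. The same reshaping could be written as below. This is a sketch under the assumption that the data is rebuilt as `dts` and that a recent tidyr is installed. The name `result` is made up for illustration.

```r
library(dplyr)
library(tidyr)

# Assumed reconstruction of the question's data
dts <- data.frame(
  a = c(0L, 1L, 1L, 0L),
  b = c(0L, 0L, 1L, 0L),
  c = c(1L, 1L, 1L, 0L),
  d = c(1L, 1L, 0L, 1L),
  consequence = c(0L, 1L, 0L, 0L)
)

result <- dts %>%
  # Reshape every column except `consequence` into variable/value pairs
  pivot_longer(-consequence, names_to = "variable", values_to = "value") %>%
  group_by(variable) %>%
  summarise(
    count = sum(value == 1),
    count_with_effect = sum(value == 1 & consequence == 1)
  )
result
```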