DANIEL OTERO ROBLES - 8 months ago

R Question

In R, I have a data frame that includes an ID column. I need to find all the rows that share an ID but differ in the X1 variable.

For example,

`d`

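| ID | X1 | X2 |
|----|----|----|
| a | 19 | F |
| b | 19 | F |
| c | 16 | T |
| a | 16 | T |
| a | 19 | T |
| d | 17 | T |
| b | 15 | F |
| b | 19 | F |
| c | 17 | T |
| c | 17 | T |
| d | 17 | T |
| e | 15 | T |
| f | 14 | T |
| g | 16 | T |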

The result should be:

`df1`

| ID | X1 | X2 |
|----|----|----|
| a | 19 | F |
| b | 19 | F |
| c | 16 | T |
| a | 16 | T |
| b | 15 | F |
| c | 17 | T |

Answer

```
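# Cross-tabulate X1 values (rows) against IDs (columns).
tab <- table(d$X1, d$ID)
# Cap each count at 1 so a cell only records whether the pair occurs.
tab[tab > 1] <- 1
# Column sums now give the number of distinct X1 values per ID.
n_x1 <- apply(tab, 2, sum)
# Keep only IDs that occur with more than one X1 value.
n_x1 <- n_x1[n_x1 > 1]
d1 <- data.frame(ID = names(n_x1))
# Pull in the rows of d for those IDs, then keep unique (ID, X1) pairs.
d1 <- merge(d1, d, by = "ID", all.x = TRUE, all.y = FALSE)
d1 <- unique(d1[, 1:2])
d1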
```

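Output (the first column holds the row names):

|   | ID | X1 |
|---|----|----|
| 1 | a | 19 |
| 2 | a | 16 |
| 4 | b | 15 |
| 5 | b | 19 |
| 7 | c | 16 |
| 8 | c | 17 |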
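The same IDs can also be found more directly by counting the distinct `X1` values per `ID` with `tapply()`. This is only a sketch of an alternative, not the method used above. The data frame `d` is rebuilt by hand from the question's example, and `n_distinct_x1`, `multi_ids` and `id_x1_pairs` are illustrative names.

```r
# Example data from the question, rebuilt by hand (an assumption about
# how d was created; X2 is stored as logical here).
d <- data.frame(
  ID = c("a", "b", "c", "a", "a", "d", "b", "b", "c", "c", "d", "e", "f", "g"),
  X1 = c(19, 19, 16, 16, 19, 17, 15, 19, 17, 17, 17, 15, 14, 16),
  X2 = c(FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE,
         FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Number of distinct X1 values for each ID.
n_distinct_x1 <- tapply(d$X1, d$ID, function(x) length(unique(x)))

# IDs that occur with more than one X1 value.
multi_ids <- names(n_distinct_x1)[n_distinct_x1 > 1]

# Unique (ID, X1) pairs for those IDs, in their original row order.
id_x1_pairs <- unique(d[d$ID %in% multi_ids, c("ID", "X1")])
id_x1_pairs
```

Because this version skips `merge()`, the rows stay in the order in which they appear in `d` instead of being sorted by `ID`.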

We can include the third column as well, but you would need to give some logic for picking which of its values to retain. For instance, there were two rows for `a` where `X1` was 19, one with `X2` T and one where it was F. To choose between the two you could keep the first matching row for `X2`, the last, or prefer T over F, etc.
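As one concrete choice, keeping the first matching row can be sketched with `duplicated()`, which flags every repeat of an (ID, X1) pair after its first occurrence. The sketch assumes `d` holds the question's example data, rebuilt by hand here. The names `counts`, `multi_ids`, `keep` and `df1` are illustrative.

```r
# Example data from the question, rebuilt by hand (an assumption about
# how d was created; X2 is stored as logical here).
d <- data.frame(
  ID = c("a", "b", "c", "a", "a", "d", "b", "b", "c", "c", "d", "e", "f", "g"),
  X1 = c(19, 19, 16, 16, 19, 17, 15, 19, 17, 17, 17, 15, 14, 16),
  X2 = c(FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE,
         FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# IDs that occur with more than one distinct X1 value.
counts <- tapply(d$X1, d$ID, function(x) length(unique(x)))
multi_ids <- names(counts)[counts > 1]

# Keep rows for those IDs, dropping later repeats of each (ID, X1) pair,
# so the first row seen for a pair supplies its X2 value.
keep <- d$ID %in% multi_ids & !duplicated(d[, c("ID", "X1")])
df1 <- d[keep, ]
df1
```

For the example data this keeps rows 1, 2, 3, 4, 7 and 9 of `d`, which are the rows listed in the question's `df1`. To keep the last matching row instead, pass `fromLast = TRUE` to `duplicated()`.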

Source (Stack Overflow)