DANIEL OTERO ROBLES - 2 months ago
R Question

Find rows of a data frame that are identical in one column but different in another

In R, I have a data frame that includes an ID column. I need to find all the rows that have the same ID but differ in the X1 variable.

For example,

d

ID X1 X2
a 19 F
b 19 F
c 16 T
a 16 T
a 19 T
d 17 T
b 15 F
b 19 F
c 17 T
c 17 T
d 17 T
e 15 T
f 14 T
g 16 T


The result should be:

df1

ID X1 X2
a 19 F
b 19 F
c 16 T
a 16 T
b 15 F
c 17 T

Answer
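# Cross-tabulate X1 values (rows) against IDs (columns)
counts <- table(d$X1, d$ID)
# Reduce counts to presence/absence so a repeated (ID, X1) pair counts once
counts[counts > 1] <- 1
# Column sums give the number of distinct X1 values per ID
counts <- apply(counts, 2, sum)
# Keep only the IDs with more than one distinct X1 value
counts <- counts[counts > 1]

# Pull the rows for those IDs from d, then drop repeated (ID, X1) pairs
d1 <- data.frame(ID = names(counts))
d1 <- merge(d1, d, by = "ID", all.x = TRUE, all.y = FALSE)
d1 <- unique(d1[, 1:2])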
d1
  ID X1
1  a 19
2  a 16
4  b 15
5  b 19
7  c 16
8  c 17

The third column can be included as well, but you need a rule for choosing which of its values to retain. For instance, ID a has two rows where X1 is 19, one with X2 = T and one with X2 = F. To choose between them, you could keep the first matching row, keep the last, prefer T over F, and so on.
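
As a rough sketch of the "keep the first matching row" rule, the snippet below rebuilds the example data from the question and keeps X2 alongside ID and X1. It uses base R's ave() and duplicated() rather than the table/merge approach above, and it assumes X2 is stored as a character column; the names n_distinct_x1 and candidates are made up for illustration.

```r
# Example data copied from the question
d <- data.frame(
  ID = c("a", "b", "c", "a", "a", "d", "b", "b", "c", "c", "d", "e", "f", "g"),
  X1 = c(19, 19, 16, 16, 19, 17, 15, 19, 17, 17, 17, 15, 14, 16),
  X2 = c("F", "F", "T", "T", "T", "T", "F", "F", "T", "T", "T", "T", "T", "T"),
  stringsAsFactors = FALSE
)

# Number of distinct X1 values within each ID, repeated for every row
n_distinct_x1 <- ave(d$X1, d$ID, FUN = function(x) length(unique(x)))

# Rows whose ID has more than one distinct X1 value
candidates <- d[n_distinct_x1 > 1, ]

# Assumed rule: keep the first row seen for each (ID, X1) pair
df1 <- candidates[!duplicated(candidates[, c("ID", "X1")]), ]
df1
```

With this rule the rows stay in their original order, so df1 holds the six rows listed in the question. To keep the last matching row instead, duplicated() accepts fromLast = TRUE.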
