DANIEL OTERO ROBLES - 2 months ago 6
R Question

Find observations that are identical in one column of a data frame but different in another column

In R, I have a data frame that includes an ID column. I need to find all the rows that have the same ID but differ in the X1 variable.

For example,

``````
d

ID    X1     X2
a    19      F
b    19      F
c    16      T
a    16      T
a    19      T
d    17      T
b    15      F
b    19      F
c    17      T
c    17      T
d    17      T
e    15      T
f    14      T
g    16      T
``````

The expected result is:

``````
df1

ID    X1     X2
a    19      F
b    19      F
c    16      T
a    16      T
b    15      F
c    17      T
``````

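``````
# Cross-tabulate X1 values (rows) against IDs (columns)
tab <- table(d$X1, d$ID)
tab[tab > 1] <- 1            # collapse counts to presence/absence
tab <- apply(tab, 2, sum)    # number of distinct X1 values per ID
tab <- tab[tab > 1]          # keep IDs with more than one distinct X1

d1 <- data.frame(ID = names(tab))
d1 <- merge(d1, d, by = "ID", all.x = TRUE, all.y = FALSE)
d1 <- unique(d1[, 1:2])      # drop repeated (ID, X1) pairs
d1
``````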
``````
  ID X1
1  a 19
2  a 16
4  b 15
5  b 19
7  c 16
8  c 17
``````
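
For comparison, the same filter can be sketched in base R without the intermediate table, by counting the distinct `X1` values within each `ID` using `ave()`. This is only one possible alternative. The data frame `d` is rebuilt here from the question's example so the snippet runs on its own, and `n_x1` and `res` are names introduced purely for illustration.

``````r
# Rebuild the example data from the question (X2 stored as logical)
d <- data.frame(
  ID = c("a", "b", "c", "a", "a", "d", "b", "b", "c", "c", "d", "e", "f", "g"),
  X1 = c(19, 19, 16, 16, 19, 17, 15, 19, 17, 17, 17, 15, 14, 16),
  X2 = c(FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE,
         FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Number of distinct X1 values within each ID, repeated for every row
n_x1 <- ave(d$X1, d$ID, FUN = function(x) length(unique(x)))

# Keep IDs with more than one distinct X1, then drop repeated (ID, X1) pairs
res <- unique(d[n_x1 > 1, c("ID", "X1")])
res
``````

Unlike the `merge()` version, this keeps the rows in their original order, because nothing is sorted by `ID` along the way.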

The third column can be included as well, but you need a rule for choosing which of its values to keep. For instance, there are two rows with ID `a` where `X1` is 19: one with `X2` T and one with `X2` F. To choose between them you could keep the first matching row, keep the last, prefer T over F, and so on.
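
As a minimal sketch of the "keep the first matching row" rule, `duplicated()` can flag every repeat of an (`ID`, `X1`) pair after its first occurrence. The data frame `d` is rebuilt from the question's example, with `X2` assumed to be logical, and `n_x1`, `cand`, `first_rows` and `last_rows` are illustrative names only.

``````r
# Rebuild the example data from the question (X2 assumed to be logical)
d <- data.frame(
  ID = c("a", "b", "c", "a", "a", "d", "b", "b", "c", "c", "d", "e", "f", "g"),
  X1 = c(19, 19, 16, 16, 19, 17, 15, 19, 17, 17, 17, 15, 14, 16),
  X2 = c(FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE,
         FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Candidate rows: IDs that have more than one distinct X1 value
n_x1 <- ave(d$X1, d$ID, FUN = function(x) length(unique(x)))
cand <- d[n_x1 > 1, ]

# Rule 1: keep the first row of each (ID, X1) pair, with its X2 value
first_rows <- cand[!duplicated(cand[, c("ID", "X1")]), ]
first_rows

# Rule 2: keep the last row of each pair instead
last_rows <- cand[!duplicated(cand[, c("ID", "X1")], fromLast = TRUE), ]
last_rows
``````

On the example data, the first rule yields the six rows listed as `df1` in the question, while the second keeps `X2 = TRUE` for the pair (`a`, 19).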