Ulises - 25 days ago
R Question

How to keep rows with the same values in two variables in R?

I have a dataset with several variables, and I want to keep only the rows whose value in one column always goes with the same value in a second column. Here is an example of what I want to do:

a <- c(rep('A',3), rep('B', 3), rep('C',3))
b <- c(1,1,2,4,4,4,5,5,5)
df <- data.frame(a,b)

a b
1 A 1
2 A 1
3 A 2
4 B 4
5 B 4
6 B 4
7 C 5
8 C 5
9 C 5


I know that if I use the duplicated function I can get:

df[!duplicated(df),]

a b
1 A 1
3 A 2
4 B 4
7 C 5


But since the level 'A' in column a does not have a unique value in b, I want to drop both observations to get a new data.frame like this:

a b
4 B 4
7 C 5


Is there a way to do this? Thanks!

989
Answer

This one maybe?

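# Collect the unique values of b for each level of a
ag <- aggregate(b~a, df, unique)
# Keep only the levels of a that have exactly one unique b
ag[lengths(ag$b)==1,]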

#  a b
#2 B 4
#3 C 5
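
The aggregate approach above returns new row names (2 and 3) and relies on b coming back as a list. If you would rather filter the original data.frame and keep its row names, here is a possible base R sketch. It is only one way to do it, not necessarily the best one, and the names n_distinct_b, kept and result are made up here for illustration.

```r
# Example data from the question
a <- c(rep('A', 3), rep('B', 3), rep('C', 3))
b <- c(1, 1, 2, 4, 4, 4, 5, 5, 5)
df <- data.frame(a, b)

# For every row, count the distinct values of b within that row's level of a
# (n_distinct_b is an illustrative name, not part of the original answer)
n_distinct_b <- ave(df$b, df$a, FUN = function(x) length(unique(x)))

# Keep only the levels of a with a single distinct b, then drop repeated rows
kept <- df[n_distinct_b == 1, ]
result <- kept[!duplicated(kept), ]
result
#   a b
# 4 B 4
# 7 C 5
```

Since ave returns one count per original row, the filter applies directly to df, so any other columns in the real dataset stay attached to the rows that are kept.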