Ulises - 7 months ago 37

R Question

I have a dataset with several variables, and I want to keep only the rows that are consistent across two columns. Here is an example of what I want to do:

```
a <- c(rep('A',3), rep('B', 3), rep('C',3))
b <- c(1,1,2,4,4,4,5,5,5)
df <- data.frame(a,b)
#   a b
# 1 A 1
# 2 A 1
# 3 A 2
# 4 B 4
# 5 B 4
# 6 B 4
# 7 C 5
# 8 C 5
# 9 C 5
```

I know that if I use the duplicated function I can get:

```
df[!duplicated(df),]
#   a b
# 1 A 1
# 3 A 2
# 4 B 4
# 7 C 5
```

But since the level 'A' in column `a` has more than one value in column `b`, I want to drop it and get:

```
#   a b
# 4 B 4
# 7 C 5
```

Is there a way to do this? Thanks!

Answer

This one maybe?

```
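# Collect the distinct b values for each level of a
ag <- aggregate(b~a, df, unique)
# Keep the levels of a that map to exactly one b value
ag[lengths(ag$b)==1,]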
# a b
#2 B 4
#3 C 5
```
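
For comparison, here is a minimal base R sketch of another way to get the same rows. It is not part of the answer above. It counts the distinct `b` values per level of `a` with `ave()` and filters the original rows, so the original row numbers (4 and 7) are kept. The names `n_distinct_b`, `consistent` and `result` are made up for this illustration.

```r
# Rebuild the example data from the question
a <- c(rep('A', 3), rep('B', 3), rep('C', 3))
b <- c(1, 1, 2, 4, 4, 4, 5, 5, 5)
df <- data.frame(a, b)

# For every row, the number of distinct b values within its level of a
# (n_distinct_b is a hypothetical helper name)
n_distinct_b <- ave(df$b, df$a, FUN = function(x) length(unique(x)))

# Keep rows whose level of a has a single b value, then drop repeated rows
consistent <- df[n_distinct_b == 1, ]
result <- consistent[!duplicated(consistent), ]
result
#   a b
# 4 B 4
# 7 C 5
```

Because this filters `df` directly, any other columns in the real dataset would come along with the kept rows. In that case `duplicated()` should probably be restricted to the two columns of interest, e.g. `duplicated(consistent[c("a", "b")])`.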

Source (Stackoverflow)