user181187 - 1 month ago
R Question

data.frame - remove rows with same observations in two out of three columns

I have a data frame and I would like to remove the rows that have the same value in two of its three columns (i.e. the columns Date and Height below).

Here is an example of my data frame:

# Dates are quoted so that each one is read as a single field
df <- read.table(text = "Code Date Height
2001 1974 1974
2001 '01 Apr 1975' 120.209
2001 '06 Jan 1976' 158.699
3002 1973 1973
3002 '18 Jan 1974' 159.753
3002 '13 Dec 1974' 132.125
4003 1973 1973
4003 '18 Jan 1974' 57.211
4003 '19 Dec 1974' 65.279", header = TRUE)


I think I can write a function for removing the rows with the same observations in the columns Date and Height, in order to obtain the following output:

Code Date Height
2001 01 Apr 1975 120.209
2001 06 Jan 1976 158.699
3002 18 Jan 1974 159.753
3002 13 Dec 1974 132.125
4003 18 Jan 1974 57.211
4003 19 Dec 1974 65.279


I successfully tried this code, which creates a subset of the rows that I want to remove:


subset <- df[ which(df$Date == df$Height), ]


How can I remove the rows contained in subset from df?

Any suggestion will be highly appreciated.

Answer

We can also use filter from dplyr to keep only the rows where Date and Height differ:

 library(dplyr)
 # df is the data frame from the question
 df %>%
     filter(Date != Height)
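
The same result can probably be reached in base R by negating the condition instead of building the subset first. This is only a minimal sketch: the names same and df_clean are made up for illustration, and the dates are quoted on the assumption that each date should be read as one field.

```r
# Rebuild the example data; quoting the dates is an assumption made here
# so that read.table sees exactly three fields per row.
df <- read.table(text = "Code Date Height
2001 1974 1974
2001 '01 Apr 1975' 120.209
2001 '06 Jan 1976' 158.699
3002 1973 1973
3002 '18 Jan 1974' 159.753
3002 '13 Dec 1974' 132.125
4003 1973 1973
4003 '18 Jan 1974' 57.211
4003 '19 Dec 1974' 65.279", header = TRUE, stringsAsFactors = FALSE)

# Flag the rows where Date and Height hold the same value.
# Both columns are converted to character so the comparison is explicit.
same <- as.character(df$Date) == as.character(df$Height)

# Keep every row that is not flagged (hypothetical name df_clean).
df_clean <- df[!same, ]
print(df_clean)
```

Logical negation is used here rather than df[-which(same), ] because negative indexing with an empty which() result drops every row when nothing matches.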