Varun Varun - 2 months ago 7
R Question

Remove rows of a sample from the original data frame

The data set I am using has missing values, so I used the Amelia package to impute them. The resulting data set has the following form:

Bi.Rads Age Shape Margin Density Severity
5.000000 70.00000 3.4685058 5.00000000 3.000000 1
5.000000 70.00000 4.0000000 3.00000000 3.000000 1
5.000000 70.00000 4.0000000 4.00000000 3.000000 1
5.000000 70.00000 4.0000000 5.00000000 3.000000 1
5.000000 70.00000 4.2881664 4.00000000 3.689292 1
5.000000 70.27765 4.0000000 4.00000000 3.000000 1


The values with decimal places are the imputed ones. Treating this data set as a data frame df, I randomly sample 100 rows from df without replacement:

df1<-df[sample(nrow(df),100),]


Now I want to remove the rows of df1 from df. I have tried every suggestion from similar posts, such as using %in% and the dplyr package, but none of them returns the expected 861 rows. Packages such as sqldf and compare have not worked either. I tried to comment on the other posts but couldn't because I don't have enough reputation. Could you please help me out?

Answer

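Try this. Store the sampled row indices first, then use them both to take the sample and to drop it:

indices <- sample(nrow(df), 100)
df1 <- df[indices, ]   # the 100 sampled rows
df <- df[-indices, ]   # the rows that were not sampled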
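
For anyone who wants to check the row counts, here is a minimal sketch of the index-based split on a made-up data frame. The 961-row toy data, the seed, and the names idx, rest and rest_by_name are assumptions for illustration only, not the asker's real data. The rows are duplicated on purpose: value-based matching (%in% on the values, or an anti-join) drops every row that looks like a sampled one, which is a likely reason those approaches returned fewer than 861 rows.

```r
set.seed(42)  # assumption: fixed seed so the sample is reproducible

# Hypothetical stand-in for the imputed data: 961 rows with many duplicates
df <- data.frame(
  Bi.Rads  = rep(5, 961),
  Age      = rep(c(70, 65, 58), length.out = 961),
  Severity = rep(c(1, 0), length.out = 961)
)

idx  <- sample(nrow(df), 100)  # 100 distinct row positions
df1  <- df[idx, ]              # the sample
rest <- df[-idx, ]             # everything that was not sampled

nrow(df1)   # 100
nrow(rest)  # 861

# If only df1 was kept: subsetting preserves the original row names,
# so the sample can still be removed by row name instead of by value
rest_by_name <- df[!(rownames(df) %in% rownames(df1)), ]
nrow(rest_by_name)  # 861
```

The row-name variant assumes the row names of df are unique and have not been reset since df1 was taken.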