Nate Reed - 1 year ago
R Question

How to select rows in a data frame that are not in a list of indexes?

I want to select the rows in a data frame whose indexes are not in a given list of row indexes, e.g.:

split = 0.70
train_subset <- df[sample(nrow(df), size = split * nrow(df)), ]
test_subset = ?

How can I create test_subset from df and train_subset?

Answer Source
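split <- 0.70
# Draw the training row indexes once and keep them
train_rows   <- sample(nrow(df), size = split * nrow(df))

# Positive indexes keep the sampled rows
train_subset <- df[train_rows, ]

# Negative indexes drop the sampled rows, leaving the rest
test_subset  <- df[-train_rows, ]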

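Store the sampled row indexes in a vector instead of using them inline. Index with that vector to get the training set, and with its negation to get the testing set: a negative index drops the listed rows and keeps everything else.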
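One caveat with negative indexing: if the index vector happens to be empty, `df[-integer(0), ]` returns zero rows rather than all of them. The following is a minimal sketch of a variant that avoids this by computing the test rows with `setdiff()`. The example data frame and the names `n_train` and `test_rows` are made up for illustration and are not part of the original question.

```r
# Hypothetical example data; `df` stands in for the asker's data frame.
set.seed(42)  # only so the sketch is reproducible
df <- data.frame(id = 1:10, x = rnorm(10))

split <- 0.70
n_train <- floor(split * nrow(df))  # make the sample size an explicit integer
train_rows <- sample(nrow(df), size = n_train)

# Row numbers that were not sampled. Unlike df[-train_rows, ], this
# still selects every row when train_rows is empty.
test_rows <- setdiff(seq_len(nrow(df)), train_rows)

train_subset <- df[train_rows, ]
test_subset  <- df[test_rows, ]
```

For a non-empty sample this should give the same test set as `df[-train_rows, ]`, so the choice mostly matters when the split fraction or the data size can make the sample empty.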