user2380782 - 1 month ago
R Question

subsetting a data.frame using a for loop

I have a data.frame that I want to process in chunks of 10 rows: subset it, apply a function to the subset, save the resulting object, and remove the previous object. Here is what I have so far:

L3 <- LETTERS[1:20]
df <- data.frame(1:391, "col", sample(L3, 391, replace = TRUE))
names(df) <- c("a", "b", "c")

b <- seq(from=1, to=391, by=10)
nsamp <- 0
for(i in seq_along(b)){
  a <- i+1
  nsamp <- nsamp+1
  df_10 <- df[b[nsamp]:b[a], ]
  res <- lapply(seq_along(df_10$b), function(x){...})
  saveRDS(res, file="res.rds")
  rm(res)
}


My problem is that the for loop crashes when reaching the last element of my sequence b.

Answer

When partitioning data, split is your friend. It will create a list with each data subset as an item which is then easy to iterate over.

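# subtract 1 so that rows 1-10 share a group (1:nrow(df) %/% 10 would put only 9 rows in the first one)
dfs = split(df, (seq_len(nrow(df)) - 1) %/% 10)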
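
To see what the grouping vector does, here is a minimal sketch on toy data shaped like the question's (391 rows; the seed and the `grp` name are my own assumptions). It uses a zero-based row index, `(seq_len(nrow(df)) - 1) %/% 10`, so every chunk except the last holds exactly ten rows:

```r
# Toy data shaped like the question's example (seed is an arbitrary assumption).
set.seed(1)
df <- data.frame(a = 1:391,
                 b = "col",
                 c = sample(LETTERS[1:20], 391, replace = TRUE))

# Zero-based row index, integer-divided by 10:
# rows 1-10 -> group 0, rows 11-20 -> group 1, and so on.
grp <- (seq_len(nrow(df)) - 1) %/% 10

# split() returns a named list with one data.frame per group.
dfs <- split(df, grp)

length(dfs)        # number of chunks
sapply(dfs, nrow)  # rows per chunk; the last chunk holds the remainder
```

With 391 rows this gives 39 chunks of ten rows plus a final chunk containing the single leftover row, so no special handling of the last element is needed.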

Then your for loop can be simplified to something like this (untested... I'm not exactly sure what you're doing because example data seems to switch from df to sc2_10 and I only hope your column named b is different from your vector named b):

for(i in seq_along(dfs)){
  res <- lapply(seq_along(dfs[[i]]$b), function(x){...})
  saveRDS(res, file = sprintf("res_%s.rds", i))
  rm(res)
}

I also modified your save file name so that you aren't overwriting the same file every time.
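
For completeness, here is a self-contained sketch of the whole workflow that should run as-is. The function `process_row`, the temporary output directory, and the zero-padded file names are hypothetical stand-ins of mine, since the real function body is elided in the question:

```r
# Hypothetical per-row function; it stands in for the elided function body.
process_row <- function(chunk, j) {
  paste(chunk$a[j], chunk$c[j])
}

# Toy data shaped like the question's example (seed is an arbitrary assumption).
set.seed(1)
df <- data.frame(a = 1:391,
                 b = "col",
                 c = sample(LETTERS[1:20], 391, replace = TRUE))

# Chunks of ten rows; the last chunk holds whatever is left over.
dfs <- split(df, (seq_len(nrow(df)) - 1) %/% 10)

out_dir <- tempdir()               # assumption: write results to a temp directory
files <- character(length(dfs))    # remember where each result was saved

for (i in seq_along(dfs)) {
  chunk <- dfs[[i]]
  # Apply the function to every row of the current chunk.
  res <- lapply(seq_len(nrow(chunk)), function(j) process_row(chunk, j))
  # One file per chunk, so earlier results are not overwritten.
  files[i] <- file.path(out_dir, sprintf("res_%02d.rds", i))
  saveRDS(res, file = files[i])
}
```

Because `res` is reassigned on every iteration, the explicit `rm(res)` is optional here. A saved chunk can be read back with `readRDS(files[1])`.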
