Oli Paul - 2 months ago
R Question

R lapply using stringi and rbind

I'd like to extract specific strings from a column of a data frame and count how often each one occurs.

After trying a few approaches I've come up with the code below, but the results are slightly off.

Example:

Data frame (df):

data
abc hello
hello
aaa
zxy
xyz


List:

list
abc
bcd
efg
aaa


My code:

lapply(list$list, function(x){
  t <- data.frame(words = stri_extract(df$data, coll = x))
  t <- setDT(t)[, .(Count = .N), by = words]
  t <- t[complete.cases(t$words)]
  result <- rbind(result, t)
  write.csv(result, "new.csv", row.names = F)
})


In this example I would expect a CSV file with the following results:

words Count
abc 1
aaa 1


However with my code I got:

words Count
aaa 1


I know stri_extract should identify abc within abc hello, so perhaps the error happens when I use rbind?

Answer

You need to move the write.csv call out of the loop; otherwise each iteration overwrites the previously saved file, and you only get the file written by the final iteration. That also means you have to rbind the results outside lapply, because an assignment to result inside the function only creates a local copy and does not modify the result variable outside it.
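The scoping point can be checked in isolation. This is a minimal sketch in base R only; the toy data frames with a single column `i` are made up for illustration and are not part of the question.

```r
result <- data.frame()

invisible(lapply(1:3, function(i) {
  # This assignment creates a new local `result`; the outer one is untouched
  result <- rbind(result, data.frame(i = i))
}))

rows_after_loop <- nrow(result)  # the outer data frame is still empty

# Returning each piece and binding once, outside lapply, works instead
result <- do.call(rbind, lapply(1:3, function(i) data.frame(i = i)))
rows_after_bind <- nrow(result)
```

Here `rows_after_loop` stays at 0 while `rows_after_bind` is 3, which is why the accumulation has to happen on the value returned by lapply.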

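library(stringi)
library(data.table)

result <- do.call(rbind, lapply(list$list, function(x) {
  # Extract the first match of x from each row (NA where there is no match)
  counts <- data.frame(words = stri_extract(df$data, coll = x))
  # Count how many rows produced each extracted value
  counts <- setDT(counts)[, .(Count = .N), by = words]
  # Drop the NA group (rows without a match) and return the rest
  counts[complete.cases(counts$words)]
}))

# Write the combined result once, after all patterns have been processed
write.csv(result, "new.csv", row.names = FALSE)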
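For anyone who wants to try this on the example data from the question, here is a self-contained sketch. It swaps do.call(rbind, ...) for data.table's rbindlist, which is a variation and not what the answer above uses. The names `patterns` and `count_matches` are hypothetical stand-ins (the question keeps the patterns in list$list), and the result is printed instead of written to a file.

```r
library(stringi)
library(data.table)

# Example data from the question
df <- data.frame(data = c("abc hello", "hello", "aaa", "zxy", "xyz"),
                 stringsAsFactors = FALSE)
patterns <- c("abc", "bcd", "efg", "aaa")  # hypothetical name for list$list

# Count the rows of df$data in which pattern x is found
count_matches <- function(x) {
  found <- data.table(words = stri_extract(df$data, coll = x))
  # Keep only rows with a match, then count per extracted word
  found[!is.na(words), .(Count = .N), by = words]
}

# Bind the per-pattern tables once, outside the loop
result <- rbindlist(lapply(patterns, count_matches))
print(result)
```

On this data the result should contain one row each for abc and aaa, both with a Count of 1, matching what the question expected. Patterns with no match (bcd, efg) contribute empty tables, which rbindlist skips.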