Osprey Eagle - 1 year ago
R Question

How to update (assign new values to) R data frames stored in a list

# sample data
options(stringsAsFactors = FALSE)

v1 = stringi::stri_rand_strings(4,3)
v2 = rep("",4)
df1 = data.frame(v1, v2)

v1 = stringi::stri_rand_strings(4,3)
v2 = rep("",4)
df2 = data.frame(v1, v2)

df.list = list(df1,df2)

v1 v2
2 uCt
3 wed
4 3CA

v1 v2
1 BhZ
2 Aww
3 8pT

I want to assign a substring of v1 to v2 for every row of every data frame in a vectorised manner, e.g., v2 = the third character of v1, to get this:

> df.list
v1 v2
2 uCt t
3 wed d
4 3CA A

v1 v2
1 BhZ Z
2 Aww w
3 8pT T

I know this for-loop works

for (df in 1:2){
  df.list[[df]]$v2 = substr(df.list[[df]]$v1, 3, 3)
}

I know I could use
and then set
$v2 = substr($v1, 3, 3)

I know I could substring before storing the data frame in the list, but I'd rather substring all at once.

I'd like to keep the data in a list b/c the list is indexed by a string that will be used in other code. The rbind.fill does not keep the index / rowname.

I know this does NOT work

sapply(df.list, "[[", "v2") <- sapply(df.list, function(x) substr(x$v1, 3,3))

The right side does identify the correct substrings. I realize the sapply on the left side only returns values and does not point to the target, but this conveys the idea of what I'm trying to do.

This also generates the substring
sapply(df.list, function(x) {x$v2 <- substr(x$v1,3,3)})
but the assignment does not get made.
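
The likely reason is R's copy semantics: inside the anonymous function, x is a local copy of each data frame, so the assignment changes only that copy, and the function returns the value of the assignment (the substrings) rather than a data frame. A minimal sketch of this behaviour, assuming a one-element list built from the rows visible in the printout above:

```r
# Sample data: a single data frame using rows visible in the question's printout.
df.list <- list(
  data.frame(v1 = c("uCt", "wed", "3CA"), v2 = "", stringsAsFactors = FALSE)
)

# x is a local copy; the assignment modifies the copy only, and the function
# returns the assigned value (the substrings), not the data frame.
out <- sapply(df.list, function(x) { x$v2 <- substr(x$v1, 3, 3) })

out              # the substrings, simplified by sapply into a matrix
df.list[[1]]$v2  # unchanged: still empty strings
```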

So how do I point to the same column of every structurally equivalent data frame stored in a list to make the assignment in a vectorized manner?

Answer Source

Using lapply lets you apply a function to each element of a list. Here's a solution using lapply and dplyr's mutate function. Note that lapply returns a new list, so assign the result back to df.list to keep the change.

lapply(df.list, function(df) dplyr::mutate(df, v2=substr(v1,3,3)))
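
If dplyr is not available, base R's transform has a similar call shape to mutate. The following is a sketch only, assuming a named list (the names "first" and "second" are hypothetical) filled with the rows visible in the question's printout. It also shows the result being assigned back and the list names surviving lapply:

```r
# Sample data: rows visible in the question's printout; list names are hypothetical.
df.list <- list(
  first  = data.frame(v1 = c("uCt", "wed", "3CA"), v2 = "", stringsAsFactors = FALSE),
  second = data.frame(v1 = c("BhZ", "Aww", "8pT"), v2 = "", stringsAsFactors = FALSE)
)

# transform() evaluates v1 inside each data frame and replaces the existing v2.
# lapply() returns a new list, so the result is assigned back to df.list.
df.list <- lapply(df.list, function(df) transform(df, v2 = substr(v1, 3, 3)))

names(df.list)     # the string index of the list is preserved
df.list$first$v2   # third character of each v1 value
```

Because the list is never flattened into a single data frame, the string index is still available to the other code that depends on it.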

Alternate solutions using base R.

lapply(df.list, function(df) data.frame(v1=df$v1, v2=substr(df$v1,3,3)))

lapply(df.list, function(df) {
  df$v2 <- substr(df$v1,3,3)
  df  # return the modified copy of the data frame
})