M.H. M.H. - 9 months ago 46
R Question

Create new rows in data frame based on multiple values of column

I have adjusted my question to be a bit more specific

I have searched for a specific answer to my question but without success.

First of all, I have a data frame consisting of 48 variables, which looks something like this:

> df

  Text                                                Screen_Name  ...
1 a text where @Sam and @Su and @Jim are addressed    Peter
2 a text where @Eric is addressed                     Margret
3 a text where @Sarah and @Adam are addressed         John

Now I extract all strings that match the pattern "@\\S+" and store them in a new column:

library(stringr)
df$Addressees <- str_extract_all(df$Text, "@\\S+")

This gets me:

  ...  Screen_Name  Addressees                 ...
1      Peter        c("@Sam", "@Su", "@Jim")
2      Margret      @Eric
3      John         c("@Sarah", "@Adam")

Now I want to create a new data frame from these two columns, with one row per addressee, repeating the corresponding "Screen_Name" value for each:

> df

  Screen_Name  Addressees
1 Peter        Sam
2 Peter        Su
3 Peter        Jim
4 Margret      Eric
5 John         Sarah
6 John         Adam

I have tried solutions to similar questions, but none of them seem to work.

Thank you very much for your help!

Answer

OK, with a reproducible example:

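# create df with a list column; I() keeps the list as a single column
ego <- c("peter", "margaret", "john")
friends <- list(c("sam", "su", "jim"), c("eric"), c("sarah", "adam"))
df <- data.frame(ego, friends = I(friends), stringsAsFactors = FALSE)

# repeat each row once per element of its list entry
times <- lengths(df$friends)
df <- df[rep(seq_len(nrow(df)), times), ]
# assign back the unlisted friends; this uses the original list,
# because df$friends now holds the repeated (longer) list
df$friends <- unlist(friends)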
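
The same idea could be applied directly to the columns from the question. What follows is only a minimal sketch: it assumes that just the Text and Screen_Name columns matter, it swaps in base R's regmatches() and gregexpr() for stringr's str_extract_all() so that no package is needed, and the object names addressees and result are made up for illustration.

```r
# Example data mirroring the question (only the two relevant columns).
df <- data.frame(
  Screen_Name = c("Peter", "Margret", "John"),
  Text = c("a text where @Sam and @Su and @Jim are addressed",
           "a text where @Eric is addressed",
           "a text where @Sarah and @Adam are addressed"),
  stringsAsFactors = FALSE
)

# Base R stand-in for str_extract_all(): a list with one character
# vector of matches per row of df.
addressees <- regmatches(df$Text, gregexpr("@\\S+", df$Text, perl = TRUE))

# Repeat each Screen_Name once per match, flatten the matches,
# and strip the leading "@" to get the desired output.
result <- data.frame(
  Screen_Name = rep(df$Screen_Name, times = lengths(addressees)),
  Addressees = sub("^@", "", unlist(addressees)),
  stringsAsFactors = FALSE
)
print(result)
```

Building a fresh data frame instead of subsetting the old one keeps the row names as plain 1 to 6, and rows without any "@" match simply drop out because they are repeated zero times.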