notuo notuo - 1 month ago 5
R Question

Need to add a column to a data frame, based on a vector that holds only a subset of one of the original DF columns

I am a complete newbie with R and would appreciate your help.

I have a data frame DF like this:

user age email address ...
user1 20 u1@domain address1 ...
user2 19 u2@domain address2 ...
user3 30 u3@domain address3 ...
...
userm 32 um@domain addressm ...
...
usern xx un@domain address4 ...


I have a vector as follows:

user1
user3
...
userm


I need to have the following:

user age email address newcol ...
user1 20 u1@domain address1 yes ...
user2 19 u2@domain address2 no ...
user3 30 u3@domain address3 yes ...
...
userm 32 um@domain addressm yes ...
...
usern xx un@domain address4 no ...


In short, I want to add a new column to DF that contains no by default and yes if the corresponding user is in the vector.

Any advice is appreciated,
Thanks for your time.

Answer

To expand on Joran's answer:

Assuming your data.frame is named df, make a new column in it called newcol:

df$newcol <- 'no'

Then change the values in newcol to 'yes' wherever the user is %in% your vector (I'm assuming it's named vec).

df$newcol[df$user %in% vec] <- 'yes'

You could also do this in one step using ifelse:

df$newcol <- ifelse(df$user %in% vec, 'yes', 'no')
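
To see that the two approaches agree, here is a small sketch on made-up data. The toy df, its values, and the extra column newcol2 are assumptions for illustration only; they are not from the question.

```r
# Toy data standing in for the asker's DF (values are made up)
df <- data.frame(
  user = c("user1", "user2", "user3", "user4"),
  age  = c(20, 19, 30, 32),
  stringsAsFactors = FALSE
)
vec <- c("user1", "user3")

# Two-step version: default everything to 'no', then overwrite the matches
df$newcol <- 'no'
df$newcol[df$user %in% vec] <- 'yes'

# One-step version with ifelse, stored in a separate (hypothetical) column
# so the two results can be compared
df$newcol2 <- ifelse(df$user %in% vec, 'yes', 'no')

# Check that both columns hold the same values
print(identical(df$newcol, df$newcol2))
print(df)
```

Both versions rely on %in% returning one TRUE/FALSE per row of df, so the order of vec does not matter, and entries of vec that are missing from df are simply ignored.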

Or if you wanted to get tricky you could use merge(..., all=TRUE)...
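
The merge route is not spelled out above, so the following is only one possible sketch of it. The toy data and the helper data frame lookup are assumptions. It also uses all.x=TRUE rather than all=TRUE, so that a vec entry missing from df cannot add an extra row.

```r
# Toy data standing in for the asker's DF (values are made up)
df <- data.frame(
  user = c("user1", "user2", "user3", "user4"),
  age  = c(20, 19, 30, 32),
  stringsAsFactors = FALSE
)
vec <- c("user1", "user3")

# Hypothetical lookup table: one row per user in vec, all flagged 'yes'
lookup <- data.frame(user = vec, newcol = "yes", stringsAsFactors = FALSE)

# all.x = TRUE keeps every row of df; users not in lookup get NA in newcol
merged <- merge(df, lookup, by = "user", all.x = TRUE)

# Replace the NAs with the default 'no'
merged$newcol[is.na(merged$newcol)] <- "no"

print(merged)
```

Note that merge sorts the result by the key column by default, so the row order may differ from the original df. The %in% versions leave the order untouched, which is one reason they are the simpler choice here.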
