user2386222 - 3 months ago
R Question

Create a new variable from existing variable in dataframe using condition

I have a big data frame like the example below.

df <- data.frame(IND = 1:20,
                 S = LETTERS[1:20],
                 FA = c(0,0,133,0,2,2,2,0,0,4,4,4,6,6,0,0,0,4,2,8),
                 MO = c(77,0,77,1,3,1,1,1,0,3,1,5,5,3,0,0,100,3,5,5))


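I need to create two new variables (SFA and SMO). Where FA or MO equals a value in IND, the new variable should take the S value from that row; otherwise it should keep the original value. I need the output below: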

out <- data.frame(IND = 1:20,
                  S = LETTERS[1:20],
                  FA = c(0,0,133,0,2,2,2,0,0,4,4,4,6,6,0,0,0,4,2,8),
                  MO = c(77,0,77,1,3,1,1,1,0,3,1,5,5,3,0,0,100,3,5,5),
                  SFA = c(0,0,133,0,"B","B","B",0,0,"D","D","D","F","F",0,0,0,"D","B","H"),
                  SMO = c(77,0,77,"A","C","A","A","A",0,"C","A","E","E","C",0,0,100,"C","E","E"))


I tried matching the variables and then merging, but it did not work very well.

Thanks

Answer

To pick up the value of S from the row where IND equals FA (or MO), use match() to find the row index and subset S with it: S[match(FA, IND)] and S[match(MO, IND)]. match() returns NA wherever FA or MO has no matching IND, so coalesce() then fills those NAs with the original values, converted to character so that both arguments have the same type:

library(dplyr)
df %>% mutate(SFA = coalesce(as.character(S[match(FA, IND)]), as.character(FA)), 
              SMO = coalesce(as.character(S[match(MO, IND)]), as.character(MO)))

#   IND S  FA  MO SFA SMO
#1    1 A   0  77   0  77
#2    2 B   0   0   0   0
#3    3 C 133  77 133  77
#4    4 D   0   1   0   A
#5    5 E   2   3   B   C
#6    6 F   2   1   B   A
#7    7 G   2   1   B   A
#8    8 H   0   1   0   A
# ...
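
The same idea can also be sketched in base R, without dplyr. This is only an illustration of the match-then-fill logic described above, not part of the original answer; the helper names idx_fa and idx_mo are made up for the example, and stringsAsFactors = FALSE is set on the assumption that S should be a plain character vector on any R version.

```r
# Rebuild the example data from the question.
df <- data.frame(
  IND = 1:20,
  S   = LETTERS[1:20],
  FA  = c(0,0,133,0,2,2,2,0,0,4,4,4,6,6,0,0,0,4,2,8),
  MO  = c(77,0,77,1,3,1,1,1,0,3,1,5,5,3,0,0,100,3,5,5),
  stringsAsFactors = FALSE  # keep S as character rather than factor
)

# match() gives the position of each FA (MO) value in IND,
# or NA when the value does not occur in IND (e.g. 0, 77, 133).
idx_fa <- match(df$FA, df$IND)  # hypothetical helper name
idx_mo <- match(df$MO, df$IND)  # hypothetical helper name

# Where there is no match, keep the original value as text;
# otherwise take the label from S at the matched position.
df$SFA <- ifelse(is.na(idx_fa), as.character(df$FA), df$S[idx_fa])
df$SMO <- ifelse(is.na(idx_mo), as.character(df$MO), df$S[idx_mo])

head(df, 8)
```

Here ifelse() with is.na() plays the role that coalesce() plays in the dplyr version. Either way the new columns are character, because they mix letters with numbers written as text.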