user1783739 - 3 months ago
R Question

ifelse not looping over rows as expected

I have data that looks like this:

df <- read.table(tc <- textConnection("
var1 var2 var3 var4
1 1 7 NA
4 4 NA 6
2 NA 3 NA
4 4 4 4
1 3 1 1"), header = TRUE); close(tc)


I'm trying to create a new column that is 1 if any of var1 to var4 in that row equals 1, and 0 otherwise.

My non-working code looks like this:

df$var5 = ifelse("1" %in% df$var1,1,
ifelse("1" %in% df$var2,1,
ifelse("1" %in% df$var3,1,
ifelse("1" %in% df$var4,1,0))))


giving me a table:

var1 var2 var3 var4 var5
1 1 7 NA 1
4 4 NA 6 1
2 NA 3 NA 1
4 4 4 4 1
1 3 1 1 1


The table I actually want should look like

var1 var2 var3 var4 var5
1 1 7 NA 1
4 4 NA 6 0
2 NA 3 NA 0
4 4 4 4 0
1 3 1 1 1


I've looked at the posts:

ifelse not working as expected in R

and

Loop over rows of dataframe applying function with if-statement

but I couldn't get any answer to my problem.

Answer

One correct way is to put the column on the left-hand side of %in%:

with(df, ifelse(var1 %in% 1,1,
            ifelse(var2 %in% 1,1,
                  ifelse(var3 %in% 1,1,
                       ifelse(var4 %in% 1,1,0)))))
#[1] 1 0 0 0 1

The reason is that 1 %in% df$var1 returns only a single logical value: TRUE if 1 occurs anywhere in the column.

1 %in% df$var1
#[1] TRUE

Likewise, every column contains a 1, so each of the nested tests is a single TRUE. The outer ifelse therefore returns a single 1, which is recycled down the whole of df$var5 on assignment.

In contrast, the reversed form

df$var1 %in% 1
#[1]  TRUE FALSE FALSE FALSE  TRUE

returns a logical vector with the same length as the original column. In essence, the result of %in% always has the length of the object on its left-hand side, with one TRUE/FALSE per element.
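To make the length difference concrete, here is a small sketch that rebuilds the question's data and compares the two forms. The names scalar_test and rowwise_test are introduced only for illustration, and read.table(text = ...) is used in place of the textConnection from the question.

```r
# Rebuild the example data from the question
df <- read.table(text = "
var1 var2 var3 var4
1 1 7 NA
4 4 NA 6
2 NA 3 NA
4 4 4 4
1 3 1 1", header = TRUE)

# Asks once: does 1 occur anywhere in var1?
scalar_test <- 1 %in% df$var1

# Asks per element: is this value of var1 equal to 1?
rowwise_test <- df$var1 %in% 1

length(scalar_test)   # one value for the whole column
length(rowwise_test)  # one value per row, i.e. nrow(df)

# ifelse() returns a result shaped like its test, so only the
# row-wise test produces one answer per row
scalar_result  <- ifelse(scalar_test, 1, 0)
rowwise_result <- ifelse(rowwise_test, 1, 0)
```

Assigning scalar_result to a column recycles its single value into every row, which is why the original attempt filled var5 with 1.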


ifelse is not actually required here. A better option is to call rowSums on the logical matrix (df == 1), which counts the 1s in each row (na.rm = TRUE skips the NA comparisons), then check whether that count is nonzero and convert the result to binary with as.integer.

as.integer(rowSums(df == 1, na.rm =TRUE)!=0)
#[1] 1 0 0 0 1
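The same rowSums idea can be unpacked into its intermediate steps, which may make it easier to see what each piece does. This is only a sketch: the helper names cols, is_one and n_ones are illustrative, and restricting to cols is an assumption made so that a previously added var5 column would not be counted.

```r
# Rebuild the example data from the question
df <- read.table(text = "
var1 var2 var3 var4
1 1 7 NA
4 4 NA 6
2 NA 3 NA
4 4 4 4
1 3 1 1", header = TRUE)

# Only compare the original columns (assumption: var5 may already exist)
cols <- c("var1", "var2", "var3", "var4")

# Logical matrix: TRUE where a cell equals 1, NA where the cell is NA
is_one <- df[cols] == 1

# Number of 1s per row; na.rm = TRUE ignores the NA comparisons
n_ones <- rowSums(is_one, na.rm = TRUE)

# 1 if the row contains at least one 1, otherwise 0
df$var5 <- as.integer(n_ones > 0)
df
```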

Another option is Reduce with |, after replacing the NA values with 0 so that none of the comparisons returns NA:

as.integer(Reduce(`|`, lapply(replace(df, is.na(df), 0), `==`, 1)))
#[1] 1 0 0 0 1
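As a sketch of why the NA replacement matters in the Reduce version: without it, a row with no 1 but with an NA comes out as NA, because FALSE | NA is NA. A possible variant, not part of the original answer, uses %in% as the per-column test instead of ==, since %in% never returns NA. The names raw and via_in are illustrative.

```r
# Rebuild the example data from the question
df <- read.table(text = "
var1 var2 var3 var4
1 1 7 NA
4 4 NA 6
2 NA 3 NA
4 4 4 4
1 3 1 1", header = TRUE)

# Without replacing NA, rows 2 and 3 (no 1, but at least one NA) become NA
raw <- Reduce(`|`, lapply(df, `==`, 1))
raw

# %in% returns FALSE rather than NA for missing values,
# so no replace() step is needed
via_in <- as.integer(Reduce(`|`, lapply(df, `%in%`, 1)))
via_in
```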