RTrain3k - 3 months ago
R Question

Create variable using multiple conditions

I would like to create the variable NewVar in the data frame A and set it equal to 1 if the two conditions below are met.


  1. Var0 equals the number in the column name Var(i), e.g., if Var0=4, the column is Var4.

  2. The variable Var(i) is not equal to 0: !Var(i)==0

Below is a schematic of what I would like to achieve:

A <- read.table(text=" Var0 Var1 Var2 Var3 Var4 NewVar
4 0 0 0 1 1
4 0 0 0 0 0
2 0 1 0 0 1
2 0 0 0 0 0
1 1 0 0 0 1
1 0 0 0 0 0
3 0 0 1 0 1
3 0 0 0 0 0", header=T)


I've been trying to use something like:

A$NewVar <- for (var in names(A[ ,2:5])) {
ifelse(A$Var0==grep("var", colnames(A)) & A$var==1, 1, 0)
}


The idea was to access the column index, but it does not work.

In Excel I would use a match statement to return the column index of the 1 in variables Var1-4, and an if statement to test if the column index equals the value in Var0. If it does, NewVar=1, else 0.


Hopefully this makes what I am trying to do clearer. I am trying to migrate from Excel to R!

Answer

Here are two approaches that assume:

  1. Column names are as you've stated (Var1, Var2, etc).
  2. You can just use the value in the relevant cell (0 or 1).
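
Both assumptions can be checked up front. The snippet below is only a sketch on the example data from the question; the names expected_cols, all_present and only_01 are made up for illustration and are not part of the original code.

```r
# Example data from the question (NewVar left out, since it is the target)
A <- read.table(text = "Var0 Var1 Var2 Var3 Var4
4 0 0 0 1
4 0 0 0 0
2 0 1 0 0
2 0 0 0 0
1 1 0 0 0
1 0 0 0 0
3 0 0 1 0
3 0 0 0 0", header = TRUE)

# Column name each row is expected to point at, e.g. Var0 = 4 -> "Var4"
expected_cols <- paste0("Var", A$Var0)

# Assumption 1: every one of those names is an actual column of A
all_present <- all(expected_cols %in% names(A))

# Assumption 2: the referenced columns hold only 0 or 1
only_01 <- all(unlist(A[unique(expected_cols)]) %in% c(0, 1))

all_present
only_01
```

If either check comes back FALSE, the lookups below would return NA or values other than 0/1, so it is worth fixing the data first.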

The first option uses a for loop; the second uses apply():

A <- read.table(text="  Var0    Var1    Var2    Var3    Var4    NewVar  
          4 0   0   0   1   1   
          4 0   0   0   0   0   
          2 0   1   0   0   1   
          2 0   0   0   0   0   
          1 1   0   0   0   1   
          1 0   0   0   0   0   
          3 0   0   1   0   1   
          3 0   0   0   0   0", header=T)

# Using a for loop...
# Name of the column each row points at, e.g. Var0 = 4 -> "Var4"
col_to_match <- paste0("Var", A$Var0)
for (i in seq_along(col_to_match)) {
   A[i, "NewVar2"] <- A[i, col_to_match[i]]
}

# Using apply()
A$NewVar3 <- apply(A, 1, function(i) {
  col_to_match <- paste0("Var", i["Var0"])
  i[col_to_match]
})

A
#>   Var0 Var1 Var2 Var3 Var4 NewVar NewVar2 NewVar3
#> 1    4    0    0    0    1      1       1       1
#> 2    4    0    0    0    0      0       0       0
#> 3    2    0    1    0    0      1       1       1
#> 4    2    0    0    0    0      0       0       0
#> 5    1    1    0    0    0      1       1       1
#> 6    1    0    0    0    0      0       0       0
#> 7    3    0    0    1    0      1       1       1
#> 8    3    0    0    0    0      0       0       0

Just change "NewVar2" or "NewVar3" to "NewVar" (I only added the numbers to tell the two results apart).

If you really need to check whether the value is not 0, add that test (value != 0) to the relevant lines and wrap it in as.numeric() to convert the logical result to 0/1. E.g., in the for loop section above:

A[i, "NewVar2"] <- as.numeric(A[i, col_to_match[i]] != 0)

or in the apply() section:

as.numeric(i[col_to_match] != 0)
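
As a possible third option, separate from the two approaches above, base R matrix indexing can pick one cell per row without a loop. This is a sketch under the same assumptions about column names; row_idx, col_idx and picked are illustrative names only. It is close in spirit to the Excel match idea from the question, since match() returns the position of the column that Var0 points at.

```r
# Example data from the question (NewVar left out, since it is the target)
A <- read.table(text = "Var0 Var1 Var2 Var3 Var4
4 0 0 0 1
4 0 0 0 0
2 0 1 0 0
2 0 0 0 0
1 1 0 0 0
1 0 0 0 0
3 0 0 1 0
3 0 0 0 0", header = TRUE)

# Position of the column each row points at, e.g. Var0 = 4 -> position of "Var4"
col_idx <- match(paste0("Var", A$Var0), names(A))
row_idx <- seq_len(nrow(A))

# A two-column (row, column) index matrix extracts exactly one cell per row
picked <- as.matrix(A)[cbind(row_idx, col_idx)]

# Apply the "not equal to 0" condition and convert logical to 0/1
A$NewVar <- as.numeric(picked != 0)
A
```

If a row's Var0 points at a column that does not exist, match() returns NA and NewVar is NA for that row, which makes such rows easy to spot.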