Vinterwoo - 9 months ago
R Question

A more efficient way to calculate a boolean column

I have a data frame that includes factors with comma-separated values. My apologies for not supplying a reproducible example, but my data ends up looking like this:

Col_1   Col_2   Col_3
1       0       0
0       0       1
2       0       0
1       2,2     2
3       0       1,2


Because of these comma-separated values, I haven't been able to write the kind of speedy bracket-indexing approach that R is awesome at. So I have had to write a for loop that goes through my data frame row by row and sets a new column to 0 where the entry is zero and to 1 where it is anything else.

for (i in seq_len(nrow(DF))) {
  # Compare the i-th entry only, and assign with <- (== only tests equality)
  if (DF$Col_2[i] == 0) {
    DF$NewCol[i] <- 0
  } else {
    DF$NewCol[i] <- 1
  }
}


The above works, but takes way too long. Is there a way to speed this up using a different approach in R?

Answer

Try the vectorized ifelse() instead of a loop:

DF <- read.table(text="Col_1   Col_2   Col_3
1        0        0
0        0        1
2        0        0
1        2,2      2
3        0        1,2", header=TRUE, stringsAsFactors=FALSE)

DF$NewCol <- ifelse(DF$Col_2 == 0, 0, 1)
> DF
  Col_1 Col_2 Col_3 NewCol
1     1     0     0      0
2     0     0     1      0
3     2     0     0      0
4     1   2,2     2      1
5     3     0   1,2      0
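
For what it's worth, the comparison is already vectorized on its own, so ifelse() is not strictly needed. Below is a minimal sketch of that idea on the same sample data, assuming a zero entry is always stored as exactly "0" (a value such as "0.0" or "00" would count as non-zero here). The cols vector and the _flag column names are made up for illustration and are not part of the original question. The as.character() call is there so the same line should also work when the columns are factors, as described in the question.

```r
# Same sample data as in the answer above. Col_2 and Col_3 are read as
# character because some of their entries contain commas.
DF <- read.table(text = "Col_1 Col_2 Col_3
1 0 0
0 0 1
2 0 0
1 2,2 2
3 0 1,2", header = TRUE, stringsAsFactors = FALSE)

# The comparison returns a logical vector for the whole column at once;
# as.integer() turns TRUE/FALSE into 1/0.
DF$NewCol <- as.integer(DF$Col_2 != "0")

# Hypothetical extension: build a 0/1 flag for every original column.
# as.character() makes the comparison work for integer, character,
# and factor columns alike.
cols <- c("Col_1", "Col_2", "Col_3")
flags <- as.data.frame(
  lapply(DF[cols], function(x) as.integer(as.character(x) != "0"))
)
names(flags) <- paste0(cols, "_flag")
```

If the flags are wanted in the original data frame, cbind(DF, flags) attaches them as extra columns.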