BigFinger - 5 months ago 13

R Question

I have a sparse data table that looks like this:

`data = data.table(`

var1 = c("a","",""),

var2 = c("","","c"),

var3 = c("a","b",""),

var4 = c("","b","")

)

var1 var2 var3 var4

1: a a

2: b b

3: c

I would like to add a column that contains a string of zeros and ones indicating which variables are present in any row, like this:

`var1 var2 var3 var4 concat`

1: a a 1|0|1|0

2: b b 0|0|1|1

3: c 0|1|0|0

I can get to this with the following command:

`data[, concat := paste(`

as.integer(var1 != ""),

as.integer(var2 != ""),

as.integer(var3 != ""),

as.integer(var4 != ""),

sep = "|")]

However, if I have hundreds of variables, I would rather use some sort of computation to get to the desired expression. Perhaps something based on

`paste0("var",1:4)`

Answer

Same basic approach as above:

```
data[ , concat := apply(.SD, 1, function(x) paste(+(x == ""), collapse = "|"))][]
# var1 var2 var3 var4 concat
# 1: a a 0|1|0|1
# 2: b b 1|1|0|0
# 3: c 1|0|1|1
```