user235852 - 1 year ago 47

R Question

I have a data.table like this

```
dput(DT)
structure(list(ref = c(3L, 3L, 3L, 3L), nb = 12:15, i1 = c(3.1e-05,
0.044495, 0.82244, 0.322291), i2 = c(0.000183, 0.155732, 0.873416,
0.648545), i3 = c(0.000824, 0.533939, 0.838542, 0.990648), i4 = c(0.044495,
0.82244, 0.322291, 0.393595)), .Names = c("ref", "nb", "i1",
"i2", "i3", "i4"), row.names = c(NA, -4L), class = c("data.table",
"data.frame"), .internal.selfref = <pointer: 0x0000000000320788>)

DT
#    ref nb       i1       i2       i3       i4
# 1:   3 12 0.000031 0.000183 0.000824 0.044495
# 2:   3 13 0.044495 0.155732 0.533939 0.822440
# 3:   3 14 0.822440 0.873416 0.838542 0.322291
# 4:   3 15 0.322291 0.648545 0.990648 0.393595
```

Now I want to calculate row sums, but only over the columns whose names start with an "i" ("i1", "i2", etc.).

I have used `grep`:

```
listCol <- colnames(DT)[grep("i", colnames(DT))]
listCol
# [1] "i1" "i2" "i3" "i4"
```

Then I have tried to loop over columns:

```
DT$sum <- rep.int(0, nrow(DT))
for (i in listCol) {
  DT$sum = DT$sum + DT[ , get(i)]
}
```

...which gives the desired output:

```
DT
#    ref nb       i1       i2       i3       i4      sum
# 1:   3 12 0.000031 0.000183 0.000824 0.044495 0.045533
# 2:   3 13 0.044495 0.155732 0.533939 0.822440 1.556606
# 3:   3 14 0.822440 0.873416 0.838542 0.322291 2.856689
# 4:   3 15 0.322291 0.648545 0.990648 0.393595 2.355079
```

How can I improve my code? Many thanks!

This sub-question partially includes the answer to the previous one:

How can I avoid this kind of strange notation?

```
myrowMeans = function (x){
  rowMeans(x, na.rm = TRUE)
}
DT[ , var := myrowMeans(.SD-myrowMeans(.SD)^2), .SDcols = grep("i", colnames(DT))]
```

Answer Source

You may also try `Reduce`:

```
DT[, Sum := Reduce(`+`, .SD), .SDcols=listCol][]
# ref nb i1 i2 i3 i4 Sum
#1: 3 12 0.000031 0.000183 0.000824 0.044495 0.045533
#2: 3 13 0.044495 0.155732 0.533939 0.822440 1.556606
#3: 3 14 0.822440 0.873416 0.838542 0.322291 2.856689
#4: 3 15 0.322291 0.648545 0.990648 0.393595 2.355079
```
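To see what `Reduce` is doing there: inside `DT[...]`, `.SD` is a list of the columns named in `.SDcols`, and ``Reduce(`+`, .SD)`` adds them together elementwise, one column at a time. A minimal base R sketch of the same idea on a plain list (the names and values in `cols` are made up for illustration, they are not from the question's data):

```r
# A plain list standing in for .SD: each element plays the role of one column
cols <- list(i1 = c(1, 2), i2 = c(10, 20), i3 = c(100, 200))

# Reduce folds `+` over the list: (i1 + i2) + i3, elementwise
res <- Reduce(`+`, cols)
res
# [1] 111 222
```

Each position in the result is the sum across the "columns" for that row, which is why this works as a row sum.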

**NOTE:** If there are `NA` values, they should be replaced with 0 before calling `Reduce`, because `NA + x` is `NA` and a single missing value would make the whole row sum `NA`.

i.e.

```
DT[, Sum := Reduce(`+`, lapply(.SD, function(x) replace(x,
which(is.na(x)), 0))), .SDcols=listCol][]
```