ErrantBard - 5 months ago 29

R Question

I have som data in a data.table that I want to to sum. Trivial example supplied below. I start with (variable) n-columns (char) and some group variables.

I can convert them (nsk-vars in my example) to numeric/integer beforehand, if that's the way to go. I would like to be able to use a string of columnames to create a new column which is the sum of these(nsk1-nsk3) and handle NA:s. How do I do this in a resonable way?

`library(data.table)`

x <- data.table(a = c("GrpA", "GrpB"),

nsk1 = c("1","3"),

nsk2 = c(NA,"1"),

nsk3 = c("3", "4"))

ClNamesStr <- colnames(x)

ClNamesStr <- ClNamesStr[grepl("^nsk", ClNamesStr)]

nskStrSum <- paste(ClNamesStr, collapse = "),as.numeric(")

nskStrSum <- paste("as.numeric(", nskStrSum, "), na.rm = TRUE")

This is the part that I would like to get working.

`x[, nsk0 := sum(eval(nskStrSum)), by = a]`

The desired output would be this:

`a nsk1 nsk2 nsk3 nsk0`

GrpA 1 NA 3 4

GrpB 3 1 4 7

Answer

Here is one way to go.

```
x[, nsk0 := rowSums(sapply(.SD, function(i) as.numeric(i)), na.rm=TRUE),
.SDcols=grep("^nsk", names(x)), by = a]
```

which returns

```
x
a nsk1 nsk2 nsk3 othr nsk0
1: GrpA 1 NA 3 a 4
2: GrpA 3 1 4 b 8
```

`sapply`

runs through the selected variables, converts each to a numeric and returns a matrix. This matrix is passed to `rowSums`

which sums up the values in each row and ignores the `NA`

s. `.SD`

is a shorthand to select of the data.table, (except variables contained in "by"). However, the variables are selected among them using `.SDcols`

along with `grep`

.

Some data.tablers might not be happy with this solution as it converts some of the data into a matrix. However, this conversion may be necessary in your case.

I added an additional "nuisance" variable to show that it will work.

**data**

```
x <- data.table(a = c("GrpA", "GrpA"),
nsk1 = c("1","3"),
nsk2 = c(NA,"1"),
nsk3 = c("3", "4"),
othr = letters[1:2])
```