Abhijeet Arora Abhijeet Arora - 4 months ago 10
R Question

How to make specific columns as.numeric in R's data.table and keep the rest as it is?

Suppose i have a following db:

db<-data.frame(para=c(round(rnorm(20,10,10),0)),sal1=c(rnorm(20,100,7)),sal2=c(rnorm(20,100,7)),sal3=c(rnorm(10,100,7)),sal4=rep(c("a","b"),5))

para sal1 sal2 sal3 sal4
1 -3 89.72090 105.79164 101.09462 a
2 3 102.64036 104.07501 96.41335 b
3 11 104.65196 90.49886 101.81897 a
4 27 99.61455 102.23207 108.41161 b
5 24 101.18734 98.16081 103.04760 a


and i want only sal1,sal2,sal3 as numeric and rest as is.
It should be generalised as i have 118 columns that i want as numeric and want to keep the rest as is.

I tried:

check<-names(db)
db<-db[as.numeric(get(check[which(check=="sal1"):(which(check=="sal1")+2)]))]


But i think this is just a shot in the dark.

Answer

We can use grep to select the columns that start with 'sal', use that index to subset the 'db', loop through the columns, and convert to numeric assign the output to the 'db[nm1]`

 nm1 <- grep("^sal\\d+", names(db))
 db[nm1] <- lapply(db[nm1], as.numeric)

If we need it in data.table, convert the 'data.frame' to 'data.table' (setDT(db)), specify the columns in .SDcols, loop through the Subset of Data.table (.SD), convert to numeric and assign (:=) it back to same column names.

library(data.table)
setDT(db)[, (nm1) := lapply(.SD, as.numeric), .SDcols = nm1]