andrasz andrasz - 16 days ago 5
R Question

replace column entries by their column names

I want to replace a value (i.e. "x") in a data.table OR data.frame by its column name.

Example:

# sample data
Substance <- LETTERS[1:10]
Jan10 <- c("x","x",NA,NA,NA,"x","x","x","x",NA)
Apr10 <- c("x",NA,"x",NA,"x","x","x","x",NA,"x")
Jul10 <- c(NA,NA,NA,"x","x","x","x",NA,"x","x")
Oct10 <- c("x","x","x","x",NA,"x",NA,"x",NA,"x")

dt <- as.data.table(cbind(Substance, Jan10, Apr10, Jul10, Oct10))

# manually changing one column
dt[Jan10=="x", Jan10 := "Jan10"]


Aim: All the columns should look like the changed Jan10 column. How can I replace the "x" values in each column by the corresponding column name? Thanks for any advice.

Answer

We loop over the columns, use set to replace the NA values with column names

for(j in 2:ncol(dt)){
    set(dt, i=which(dt[[j]]=="x"), j=j, value = names(dt)[j])
 }

dt
#    Substance Jan10 Apr10 Jul10 Oct10
# 1:         A Jan10 Apr10    NA Oct10
# 2:         B Jan10    NA    NA Oct10
# 3:         C    NA Apr10    NA Oct10
# 4:         D    NA    NA Jul10 Oct10
# 5:         E    NA Apr10 Jul10    NA
# 6:         F Jan10 Apr10 Jul10 Oct10
# 7:         G Jan10 Apr10 Jul10    NA
# 8:         H Jan10 Apr10    NA Oct10
# 9:         I Jan10    NA Jul10    NA
#10:         J    NA Apr10 Jul10 Oct10