Canovice Canovice - 4 months ago 9
R Question

Learning data.table - how to update values by row number and column name

My last few posts have been written poorly, so I will attempt to do a better and cleaner job this time.

I am learning how to work with the data tables object, and one task I am struggling with is updating values in the data table by both row number and column name at the same time. With data.frames this is much easier and I just do the following:

my_df = as.data.frame(matrix(ncol = 10, nrow = (100)))
names(my_df) = c("P1", "P2", "P3", "P4", "P5", "Q1", "Q2", "Q3", "Q4", "Q5")
head(my_df)

P1 P2 P3 P4 P5 Q1 Q2 Q3 Q4 Q5
1 NA NA NA NA NA NA NA NA NA NA
2 NA NA NA NA NA NA NA NA NA NA
3 NA NA NA NA NA NA NA NA NA NA
4 NA NA NA NA NA NA NA NA NA NA
5 NA NA NA NA NA NA NA NA NA NA
6 NA NA NA NA NA NA NA NA NA NA

replacement = c(1, 2, 3, 4, 5)
my_df[2, names(my_df)[1:5]] = replacement
head(my_df)

P1 P2 P3 P4 P5 Q1 Q2 Q3 Q4 Q5
1 NA NA NA NA NA NA NA NA NA NA
2 1 2 3 4 5 NA NA NA NA NA
3 NA NA NA NA NA NA NA NA NA NA
4 NA NA NA NA NA NA NA NA NA NA
5 NA NA NA NA NA NA NA NA NA NA
6 NA NA NA NA NA NA NA NA NA NA


so, fairly easy with a dataframe. however, I am struggling with this same exact task with a datatable. using the same structure for an example data table as i did with the dataframe above, i've tried the following:

my_dt = data.table(matrix(ncol = 10, nrow = (100)))
names(my_dt) = c("P1", "P2", "P3", "P4", "P5", "Q1", "Q2", "Q3", "Q4", "Q5")
head(my_dt)

P1 P2 P3 P4 P5 Q1 Q2 Q3 Q4 Q5
1: NA NA NA NA NA NA NA NA NA NA
2: NA NA NA NA NA NA NA NA NA NA
3: NA NA NA NA NA NA NA NA NA NA
4: NA NA NA NA NA NA NA NA NA NA
5: NA NA NA NA NA NA NA NA NA NA
6: NA NA NA NA NA NA NA NA NA NA

replacement = c(1, 2, 3, 4, 5)
# my_dt[i == 2, names(my_dt)[1:5]] = replacement
# my_dt[i == 2, names(my_dt)[1:5] := replacement]
# my_dt[2, names(my_dt)[1:5]] = replacement
# my_dt[2, names(my_dt)[1:5] := replacement]


however none of the four commented lines did the correct substitution. appreciate any help!

Thanks,
Canovice

Answer

Or you can do this:

x <- names(my_dt)[1:5]

my_dt[, (x) := lapply(.SD, as.numeric), .SDcols = x]

my_dt[2,  (x):= as.list(replacement)]

First we convert the target columns in my_dt to numeric. .SDcols represents the subset of columns in .SD that we are interested in. .SD holds all the columns in the data.table (except the ones used in by).

Once we convert the target columns to numeric, we update the values by reference.

Note: It is not necessary to define x beforehand, everything can be done on the fly. However, if you define x, you need to wrap it in () to make sure data.table doesn't look for the column x