BWRT BWRT - 3 months ago 8
R Question

R Iterating a model across data table columns

I'm using a model to look at GHG emissions from crop areas. To get a handle on the standard deviation of the data, I'm trying to run a Monte Carlo-style analysis on it through multiple iterations.

# model parameters
a <- 0.1474 # Alpha
b <- 0.0005232 # Beta
g <- -0.00001518 # Gamma
d <- 0.000003662 # Delta
rain <- crm$rain # rainfall value for that location from the col 'rain'


The data is in a data.table as below, but the N columns run from N1 to N100:

rn rain Wheat N1 N2 N3 N4 N5 N6
# 1: 10007 1049.61 0.1718 0.6363109 0.939479 0.9242736 0.9018818 0.6556216 0.1150655
# 2: 10018 1114.31 0.1629 0.6363109 0.939479 0.9242736 0.9018818 0.6556216 0.1150655
# 3: 10023 1361.61 0.1082 0.6363109 0.939479 0.9242736 0.9018818 0.6556216 0.1150655
# 4: 10024 1407.20 0.0494 0.6363109 0.939479 0.9242736 0.9018818 0.6556216 0.1150655
# 5: 10025 1499.56 0.0200 0.6363109 0.939479 0.9242736 0.9018818 0.6556216 0.1150655
# 6: 10026 1654.13 0.0040 0.6363109 0.939479 0.9242736 0.9018818 0.6556216 0.1150655


So my question is: how do I apply the model below to each N column and add the results to the end of the data table? The model works with a fixed value for N, but I'm struggling with how to get the value from each column into the model.

logN2O <- function (x) {a+(b*rain)+(g*N)+(d*rain*N)}


Many thanks in advance.

Edit

To clarify, I want to run the model with the value for N1 first and make a new column with that result at the end. Then do the same for the N2 value, and so on to the end of the N columns.

Answer

I think this should work:

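n <- 1:6                             # indices of the N columns; use 1:100 for the full table
in_cols  <- paste0("N", n)           # input columns: N1, N2, ...
out_cols <- paste0("N", n, "_res")   # result columns: N1_res, N2_res, ...
# x is one N column at a time; rain refers to the rain column of dt
dt[, (out_cols) := lapply(.SD, function(x) a + (b * rain) + (g * x) + (d * rain * x)), .SDcols = in_cols]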

Basically, you specify the values of n to loop over (here 1:6, i.e. N1 to N6; use 1:100 for the full table), and each result is stored in a new column at the end of the table with "_res" appended to the name, e.g. "N1_res".
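If you would rather not hard-code the range of n, the N columns can probably be picked up by name instead. The following is only a sketch under a few assumptions: the input columns are all named N followed by digits, `log_n2o` and `toy` are hypothetical names introduced here for illustration, and the coefficients are copied from the question.

```r
library(data.table)

# Model coefficients, copied from the question
a <- 0.1474
b <- 0.0005232
g <- -0.00001518
d <- 0.000003662

# Hypothetical helper: the model with rain and N as explicit arguments
log_n2o <- function(rain, N) a + (b * rain) + (g * N) + (d * rain * N)

# Toy table with two N columns (values from the first two rows of the question's data)
toy <- data.table(rn   = c(10007L, 10018L),
                  rain = c(1049.61, 1114.31),
                  N1   = c(0.6363109, 0.6363109),
                  N2   = c(0.939479, 0.939479))

# Select every column named N<number>, however many there are
in_cols  <- grep("^N[0-9]+$", names(toy), value = TRUE)
out_cols <- paste0(in_cols, "_res")

# One result column per N column, appended to the end of the table
toy[, (out_cols) := lapply(.SD, function(x) log_n2o(rain, x)), .SDcols = in_cols]
```

Passing rain and N as arguments keeps the model from depending on a global `rain` vector, which matters if that vector ever gets out of step with the table.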

Data:

dt <- structure(list(rn = c(10007L, 10018L, 10023L, 10024L, 10025L, 
10026L), rain = c(1049.61, 1114.31, 1361.61, 1407.2, 1499.56, 
1654.13), Wheat = c(0.1718, 0.1629, 0.1082, 0.0494, 0.02, 0.004
), N1 = c(0.6363109, 0.6363109, 0.6363109, 0.6363109, 0.6363109, 
0.6363109), N2 = c(0.939479, 0.939479, 0.939479, 0.939479, 0.939479, 
0.939479), N3 = c(0.9242736, 0.9242736, 0.9242736, 0.9242736, 
0.9242736, 0.9242736), N4 = c(0.9018818, 0.9018818, 0.9018818, 
0.9018818, 0.9018818, 0.9018818), N5 = c(0.6556216, 0.6556216, 
0.6556216, 0.6556216, 0.6556216, 0.6556216), N6 = c(0.1150655, 
0.1150655, 0.1150655, 0.1150655, 0.1150655, 0.1150655)), .Names = c("rn", 
"rain", "Wheat", "N1", "N2", "N3", "N4", "N5", "N6"), class = c("data.table", 
"data.frame"), row.names = c(NA, -6L))
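Since the stated goal is to gauge the standard deviation across iterations, one possible next step is to summarise the "_res" columns row by row. This is a sketch only: `res` is a hypothetical table with made-up values standing in for the model output, and `res_mean` and `res_sd` are names chosen here, not anything from the question.

```r
library(data.table)

# Hypothetical results table; the numbers are made up for illustration
res <- data.table(rn     = c(1L, 2L),
                  N1_res = c(0.5, 1.0),
                  N2_res = c(0.7, 1.0),
                  N3_res = c(0.9, 1.0))

# All result columns, identified by their "_res" suffix
res_cols <- grep("_res$", names(res), value = TRUE)

# Row-wise mean and standard deviation across the iterations
res[, res_mean := rowMeans(.SD), .SDcols = res_cols]
res[, res_sd := apply(.SD, 1, sd), .SDcols = res_cols]
```

`res_cols` is computed before the summary columns are added, so the mean does not feed into the standard deviation.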