Chris Chris - 3 months ago 18
R Question

R: How to calculate and sort two variables with a for loop

I want to perform an IDW cross-validation and find out which "power"-value gives the smallest RMSE. In order to do this, I want to store the "power" and "RMSE"-values in a list and sort them by the smallest RMSE, for example

I'd like something like this:

RMSE Power
[1,] 1.230 2.5
[2,] 1.464 1.5
[3,] 1.698 2.0
[4,] 1.932 3.0


What I have so far is this:

require(sp)
require(gstat)
data("meuse")

#### create grid:
pixels <- 500 #define resolution
#define extent
raster.grd <- expand.grid(x=seq(floor(min(x=meuse$x)),
ceiling(max(x=meuse$x)),
length.out=pixels),
y=seq(floor(min(y=meuse$y)),
ceiling(max(y=meuse$y)),
length.out=pixels))
# convert the dataframe to a spatial points and then to a spatial pixels
grd.pts <- SpatialPixels(SpatialPoints((raster.grd)))
grd <- as(grd.pts, "SpatialGrid")
gridded(grd) = TRUE

#### perform IDW and loop through different power-values
power = seq(from = 1.5, to = 3, by = 0.5)
results=list()
results.cv=list()
for(i in power) {
results[[paste0(i,"P")]] <- gstat::idw(meuse$zinc ~ 1, meuse, grd, idp = i)
results.cv[[paste0(i,"P")]] <- krige.cv(zinc ~ 1, meuse, nfold = nrow(meuse),set = list(idp = i))
}


Now my attempt to calculate and store the RMSE with a for-loop:

results_rmse <- list()
pwr <- names(results.cv)

for(i in results.cv){ #for each Element (1.5P, 2P, etc) in results.cv
for(j in 1:length(pwr)){ #for each Power
results_rmse <- sqrt(mean(i$residual^2))
print(pwr[j])
}
print(paste("RMSE",results_rmse))
}


But with this loop, it prints each RMSE individually. So I changed the code like this

results_rmse[[i]] <- sqrt(mean(i$residual^2))


But then I get an error

Error in results_rmse[[i]] <- sqrt(mean(i$residual^2)) : invalid subscript type 'S4'


I tried several versions of the for-loop, but I couldn't even figure out how to store the values in a list, not to mention to sort them by the smallest RMSE.

nya nya
Answer

There is an extra loop for j in the RMSE calculation that is not needed as far as I understand the problem. Also, I rearranged the loop in such a way that it cycles through a sequence of elements rather than calling them by their names.

# Data, because your script doesn't run for me. The rest is identical from your code
for(i in power) {
    results.cv[[paste0(i,"P")]]$residual <- rnorm(50)
}

# Fixed loop
for(i in 1:length(results.cv)){ 
    results_rmse[[i]] <- sqrt(mean(results.cv[[i]]$residual^2))
}
names(results_rmse) <- names(results.cv)

Alternatively, the for loop can be avoided with the apply function. The result is a named list corresponding to the input names, so the last line can be omitted to achieve the same results_rmse.

results_rmse <- lapply(results.cv, function(x) sqrt(mean(x$residual^2)))

To print the data as you showed in your question:

cbind(RMSE=unlist(results_rmse), Power=power)