Chris - 5 months ago 29

R Question

I want to perform an IDW cross-validation and find out which "power"-value gives the smallest RMSE. In order to do this, I want to store the "power" and "RMSE"-values in a list and sort them by the smallest RMSE, for example

I'd like something like this:

`RMSE Power`

[1,] 1.230 2.5

[2,] 1.464 1.5

[3,] 1.698 2.0

[4,] 1.932 3.0

What I have so far is this:

`require(sp)`

require(gstat)

data("meuse")

#### create grid:

pixels <- 500 #define resolution

#define extent

raster.grd <- expand.grid(x=seq(floor(min(x=meuse$x)),

ceiling(max(x=meuse$x)),

length.out=pixels),

y=seq(floor(min(y=meuse$y)),

ceiling(max(y=meuse$y)),

length.out=pixels))

# convert the dataframe to a spatial points and then to a spatial pixels

grd.pts <- SpatialPixels(SpatialPoints((raster.grd)))

grd <- as(grd.pts, "SpatialGrid")

gridded(grd) = TRUE

#### perform IDW and loop through different power-values

power = seq(from = 1.5, to = 3, by = 0.5)

results=list()

results.cv=list()

for(i in power) {

results[[paste0(i,"P")]] <- gstat::idw(meuse$zinc ~ 1, meuse, grd, idp = i)

results.cv[[paste0(i,"P")]] <- krige.cv(zinc ~ 1, meuse, nfold = nrow(meuse),set = list(idp = i))

}

Now my attempt to calculate and store the RMSE with a for-loop:

`results_rmse <- list()`

pwr <- names(results.cv)

for(i in results.cv){ #for each Element (1.5P, 2P, etc) in results.cv

for(j in 1:length(pwr)){ #for each Power

results_rmse <- sqrt(mean(i$residual^2))

print(pwr[j])

}

print(paste("RMSE",results_rmse))

}

But with this loop, it prints each RMSE individually. So I changed the code like this

`results_rmse[[i]] <- sqrt(mean(i$residual^2))`

But then I get an error

`Error in results_rmse[[i]] <- sqrt(mean(i$residual^2)) : invalid subscript type 'S4'`

I tried several versions of the for-loop, but I couldn't even figure out how to store the values in a list, not to mention to sort them by the smallest RMSE.

Answer

There is an extra loop for `j`

in the RMSE calculation that is not needed as far as I understand the problem. Also, I rearranged the loop in such a way that it cycles through a sequence of elements rather than calling them by their names.

```
# Data, because your script doesn't run for me. The rest is identical from your code
for(i in power) {
results.cv[[paste0(i,"P")]]$residual <- rnorm(50)
}
# Fixed loop
for(i in 1:length(results.cv)){
results_rmse[[i]] <- sqrt(mean(results.cv[[i]]$residual^2))
}
names(results_rmse) <- names(results.cv)
```

Alternatively, the `for`

loop can be avoided with the `apply`

function. The result is a named list corresponding to the input names, so the last line can be omitted to achieve the same `results_rmse`

.

```
results_rmse <- lapply(results.cv, function(x) sqrt(mean(x$residual^2)))
```

To print the data as you showed in your question:

```
cbind(RMSE=unlist(results_rmse), Power=power)
```