user6475 - 9 months ago 91

R Question

I am trying to subset my dataset using a nested loop. Unfortunately, it does not work properly: I get several warnings, and the loop does not do what I want.

Here is a short code example. The data shown is only an example; the actual dataset is much bigger, so any solution that involves manually picking values is not feasible.

```
# Generate example data
unique_test <- list()
unique_test[[1]] <- c(178.5, 179.5, 180.5, 181.5)
unique_test[[2]] <- c(269.5, 270.5, 271.5)

tmp_dataframe1 <- data.frame(myID = c(268, 305, 268, 305, 268, 305, 306),
                             myvalue = c(1.150343, 2.830392, 1.150343, 2.830392, 1.150343, 2.830392, 1.150343),
                             myInter = c(178.5, 178.5, 179.5, 179.5, 180.5, 180.5, 181.5))

tmp_dataframe2 <- data.frame(myID = c(144, 188, 196, 300, 301, 302, 303, 97),
                             myvalue = c(1.293493, 3.286649, 1.408049, 0.469219, 11.143147, 0.687355, 0.508603, 0.654335),
                             myInter = c(269.5, 269.5, 269.5, 270.5, 270.5, 271.5, 185.5, 186.5))

mydata <- list()
mydata[[1]] <- tmp_dataframe1
mydata[[2]] <- tmp_dataframe2

########################
# Generate nested loop
mysubset <- list() # Define list
for(i in 1:length(unique_test)){
  # Prepare list of lists
  mysubset[[i]] <- NaN
  for(j in 1:length(unique_test[[i]])){
    # Select myvalues whose myInter data equals the one found in unique_test and assign them to a new subset
    mysubset[[i]][j] <- mydata[[i]][which(mydata[[i]]$myInter == unique_test[[i]][j]),][["myvalue"]]
  }
}
# There are warnings and the nested loop is not really doing what it is supposed to do!
```

R gives the following warnings:

```
Warning messages:
1: In mysubset[[i]][j] <- mydata[[i]][which(mydata[[i]]$myInter ==  :
  number of items to replace is not a multiple of replacement length
2: In mysubset[[i]][j] <- mydata[[i]][which(mydata[[i]]$myInter ==  :
  number of items to replace is not a multiple of replacement length
3: In mysubset[[i]][j] <- mydata[[i]][which(mydata[[i]]$myInter ==  :
  number of items to replace is not a multiple of replacement length
4: In mysubset[[i]][j] <- mydata[[i]][which(mydata[[i]]$myInter ==  :
  number of items to replace is not a multiple of replacement length
5: In mysubset[[i]][j] <- mydata[[i]][which(mydata[[i]]$myInter ==  :
  number of items to replace is not a multiple of replacement length
```

If I restrict myself to just the first element in my dataset, the "normal" (i.e. NOT nested) loop works out:

```
# If I don't use a nested loop (by just using the first element in both "mydata" and "unique_test"), things seem to work out
# But obviously, this is not really what I want to achieve (I can't just manually select every element in mydata and unique_test)
mysubset <- list()
for(i in 1:length(unique_test[[1]])){
  # Select myvalues whose myInter data equals the one found in unique_test and assign them to a new subset
  mysubset[[i]] <- mydata[[1]][which(mydata[[1]]$myInter == unique_test[[1]][i]),][["myvalue"]]
}
```

Could it be that I first have to initialize my list with the appropriate dimensions? But how would I do that if the dimensions are NOT the same for all the elements in my dataset (which is why I have to use the length() function in the first place)?

As you can see, mydata[[1]] does not have the same dimensions as mydata[[2]].

Therefore the solutions presented in the following links do not apply to this dataset:

Error in R :Number of items to replace is not a multiple of replacement length

Error in `*tmp*`[[k]] : subscript out of bounds in R

I'm pretty sure it's something obvious I'm missing, but I just cannot find it. Any help is much appreciated!

If there are better ways of achieving the same without a loop (I'm sure there are, e.g. apply() or something along the lines of subset()), I would appreciate such comments as well. Unfortunately I'm not familiar enough with the alternatives to be able to implement them quickly.

Answer Source

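Simply wrap your assignment in `list()`. Because of the nested `for` loops, `mysubset[[i]][j]` is a single slot, and you are trying to assign a numeric vector (which may hold several values) to it rather than a single value. Wrapped in `list()`, the whole vector is stored as one list element.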

```
mysubset[[i]][j] <- list(mydata[[i]][which(mydata[[i]]$myInter == unique_test[[i]][j]),][["myvalue"]])
```
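To see why the original line warns and why `list()` helps, the behavior can be reproduced on a toy vector. This is only a minimal sketch: `slot_atomic` and `slot_list` are made-up names that do not appear in the question, and the two values are borrowed from the example data.

```r
# Toy illustration; slot_atomic and slot_list are hypothetical names.
slot_atomic <- NaN   # a length-1 numeric vector, like mysubset[[i]] in the question

# Two values do not fit into one position of an atomic vector: R keeps the
# first value and warns "number of items to replace is not a multiple of
# replacement length". The warning is suppressed here only to keep output clean.
suppressWarnings(slot_atomic[1] <- c(1.150343, 2.830392))
length(slot_atomic)

# Wrapped in list(), the whole vector becomes a single list element, and the
# target is coerced from a numeric vector to a list.
slot_list <- NaN
slot_list[1] <- list(c(1.150343, 2.830392))
str(slot_list)
```

In the first case the second value is silently lost apart from the warning, which is why the original loop returned incomplete subsets.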

The assignment can also be shortened, since neither `which()` nor the outer square brackets are needed:

```
mysubset[[i]][j] <- list(mydata[[i]][mydata[[i]]$myInter == unique_test[[i]][j], c("myvalue")])
```
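On the question of initializing the list when the dimensions differ: one possible variant is to preallocate each level with `vector("list", n)` and assign with `[[j]]` instead of `[j]`. The sketch below is not the code from the answer above; it assumes the example data from the question (rebuilt here so the block runs on its own), and `keep` is a helper name introduced only for illustration.

```r
# Example data copied from the question.
unique_test <- list(c(178.5, 179.5, 180.5, 181.5),
                    c(269.5, 270.5, 271.5))
mydata <- list(
  data.frame(myID    = c(268, 305, 268, 305, 268, 305, 306),
             myvalue = c(1.150343, 2.830392, 1.150343, 2.830392, 1.150343, 2.830392, 1.150343),
             myInter = c(178.5, 178.5, 179.5, 179.5, 180.5, 180.5, 181.5)),
  data.frame(myID    = c(144, 188, 196, 300, 301, 302, 303, 97),
             myvalue = c(1.293493, 3.286649, 1.408049, 0.469219, 11.143147, 0.687355, 0.508603, 0.654335),
             myInter = c(269.5, 269.5, 269.5, 270.5, 270.5, 271.5, 185.5, 186.5))
)

# Preallocate the outer list; every inner list gets its own length,
# so the elements do not need to share dimensions.
mysubset <- vector("list", length(unique_test))
for (i in seq_along(unique_test)) {
  mysubset[[i]] <- vector("list", length(unique_test[[i]]))
  for (j in seq_along(unique_test[[i]])) {
    # keep (hypothetical helper): rows whose myInter matches the current value
    keep <- mydata[[i]]$myInter == unique_test[[i]][j]
    # [[j]] stores the whole vector as one list element
    mysubset[[i]][[j]] <- mydata[[i]]$myvalue[keep]
  }
}
str(mysubset)
```

`seq_along()` is used instead of `1:length()` so that an empty element produces zero iterations rather than the sequence `1:0`.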

Alternatively, consider an apply solution, since you then do not need to create an empty list first and grow it iteratively to bind values to it. Nested `lapply`, `sapply`, `mapply`, or even `rapply` can create the needed lists and dimensions in one call. The `mapply` version assumes *unique_test* and *mydata* are always equal-length objects.

```
# `mysubset` is the result of the corrected nested loop (with list()).

# NESTED LAPPLY
mysubset2 <- lapply(seq_along(unique_test), function(i) {
  lapply(seq_along(unique_test[[i]]), function(j) {
    mydata[[i]][mydata[[i]]$myInter == unique_test[[i]][j], c("myvalue")]
  })
})

# NESTED SAPPLY
# The inner sapply returns a list here because the subsets differ in length.
mysubset3 <- sapply(seq_along(unique_test), function(i) {
  sapply(seq_along(unique_test[[i]]), function(j) {
    mydata[[i]][mydata[[i]]$myInter == unique_test[[i]][j], c("myvalue")]
  })
}, simplify = FALSE)

# NESTED M/LAPPLY
mysubset4 <- mapply(function(u, m) {
  lapply(u, function(i) m[m$myInter == i, c("myvalue")])
}, unique_test, mydata, SIMPLIFY = FALSE)

# NESTED R/LAPPLY
# Searches all data frames at once, so it matches the other results only when
# myInter values do not repeat across data frames (true for this example).
mysubset5 <- rapply(unique_test, function(i) {
  df <- do.call(rbind, mydata)
  lapply(i, function(u) df[df$myInter == u, c("myvalue")])
}, how = "list")

# ALL SUBSETS EQUAL EXACTLY
all.equal(mysubset, mysubset2)
# [1] TRUE
all.equal(mysubset, mysubset3)
# [1] TRUE
all.equal(mysubset, mysubset4)
# [1] TRUE
all.equal(mysubset, mysubset5)
# [1] TRUE
```
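As a small variation on the `mapply` approach, `Map()` (which calls `mapply` with `SIMPLIFY = FALSE`) can be combined with `setNames()` so that each subset is labeled with the `myInter` value it came from. This is a hedged sketch rather than part of the original answer: `mysubset6` is a made-up name, the data is copied from the question so the block runs on its own, and like the `mapply` version it assumes *unique_test* and *mydata* have equal length.

```r
# Example data copied from the question.
unique_test <- list(c(178.5, 179.5, 180.5, 181.5),
                    c(269.5, 270.5, 271.5))
mydata <- list(
  data.frame(myID    = c(268, 305, 268, 305, 268, 305, 306),
             myvalue = c(1.150343, 2.830392, 1.150343, 2.830392, 1.150343, 2.830392, 1.150343),
             myInter = c(178.5, 178.5, 179.5, 179.5, 180.5, 180.5, 181.5)),
  data.frame(myID    = c(144, 188, 196, 300, 301, 302, 303, 97),
             myvalue = c(1.293493, 3.286649, 1.408049, 0.469219, 11.143147, 0.687355, 0.508603, 0.654335),
             myInter = c(269.5, 269.5, 269.5, 270.5, 270.5, 271.5, 185.5, 186.5))
)

# mysubset6 is a hypothetical name. Map() pairs the i-th vector of
# unique_test (u) with the i-th data frame of mydata (m).
mysubset6 <- Map(function(u, m) {
  # One subset of myvalue per requested myInter value, named by that value
  setNames(lapply(u, function(v) m$myvalue[m$myInter == v]), u)
}, unique_test, mydata)

# Subsets can then be looked up by label instead of by position
mysubset6[[2]][["270.5"]]
```

The values are the same as in the unnamed versions; only the names are added, so a comparison with `all.equal()` against the unnamed lists would report the differing names.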