Kirk Fogg - 8 months ago 24

R Question

I am trying to partition the observations in a data frame into 36 groups, based on two variables. More specifically, I want to cut each of the two variables into six intervals, and then assign each observation to one of the 36 possible combinations.

My attempt is below, and it works. But is there a faster way to do this that avoids the nested for loops?

Also, this isn't necessary, but how could I visualize the total number of observations in each group in a 6 by 6 grid? I know table() would produce a list of the 36 possible groups and their totals, but not in grid format.

```
set.seed(123)
x1 <- rnorm(1000)
x2 <- rnorm(1000)
data <- data.frame(x1,x2)

labs1 <- levels(cut(x1, 6))
ints1 <- cbind(lower = as.numeric(sub("\\((.+),.*", "\\1", labs1)),
               upper = as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", labs1)))

labs2 <- levels(cut(x2, 6))
ints2 <- cbind(lower = as.numeric(sub("\\((.+),.*", "\\1", labs2)),
               upper = as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", labs2)))

tmp <- expand.grid(labs1, labs2)
groups <- cbind(lower1 = as.numeric(sub("\\((.+),.*", "\\1", tmp[,1])),
                upper1 = as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", tmp[,1])),
                lower2 = as.numeric(sub("\\((.+),.*", "\\1", tmp[,2])),
                upper2 = as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", tmp[,2])))

for (i in 1:1000){
  for (j in 1:36){
    if (x1[i] >= groups[j,1] & x1[i] <= groups[j,2] &
        x2[i] >= groups[j,3] & x2[i] <= groups[j,4]){
      data$group[i] <- j
    }
  }
}
```

Answer Source

You can combine `apply()`, which iterates over the rows of your `data.frame`, with `which()`, which checks each row against every row of your `groups` matrix:

```
data$group <- apply(data, 1, FUN=function(dataRow)
  which(
    dataRow[1] >= groups[,1] &
    dataRow[1] <= groups[,2] &
    dataRow[2] >= groups[,3] &
    dataRow[2] <= groups[,4]))
```
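If you want to avoid looping altogether, one possible alternative (a sketch, not what the question's code does) is to let `cut()` return the bin number directly with `labels = FALSE` and build the group index arithmetically. The names `bin1`, `bin2` and `grid` are made up for this example. The numbering is meant to follow the row order of `expand.grid(labs1, labs2)`, where the first variable varies fastest. One assumption to be aware of: this uses the exact breakpoints from `cut()`, whereas the loop compares against the rounded values parsed from the labels, so observations very close to a boundary may land in a different group. Passing both cut factors to `table()` also gives the counts as a 6 by 6 grid.

```r
set.seed(123)
x1 <- rnorm(1000)
x2 <- rnorm(1000)
data <- data.frame(x1, x2)

# labels = FALSE makes cut() return the integer bin code (1..6) per value
bin1 <- cut(data$x1, 6, labels = FALSE)
bin2 <- cut(data$x2, 6, labels = FALSE)

# Same ordering as expand.grid(labs1, labs2): the x1 bin varies fastest
data$group <- (bin2 - 1L) * 6L + bin1

# Two-way table of counts: rows are x1 intervals, columns are x2 intervals
grid <- table(x1 = cut(data$x1, 6), x2 = cut(data$x2, 6))
grid
```

Because `table()` stores its counts column by column, `as.vector(grid)` lines up with group numbers 1 to 36, which makes it easy to check the two results against each other.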