Kirk Fogg - 1 year ago 74

R Question

I am trying to partition the observations in a data frame into 36 groups, based on two continuous variables. More specifically, I want to cut each of the two variables into six intervals and then assign each observation to one of the 36 possible combinations.

My attempt is below, which works. But is there a faster way to do this that avoids the double for loops?

Also, this isn't necessary, but how could I visualize the total number of observations in each group in a 6 by 6 grid? I know table() would produce a list of the 36 possible groups and their totals, but not in grid format.

```
set.seed(123)
x1 <- rnorm(1000)
x2 <- rnorm(1000)
data <- data.frame(x1,x2)

labs1 <- levels(cut(x1, 6))
ints1 <- cbind(lower = as.numeric(sub("\\((.+),.*", "\\1", labs1)),
               upper = as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", labs1)))

labs2 <- levels(cut(x2, 6))
ints2 <- cbind(lower = as.numeric(sub("\\((.+),.*", "\\1", labs2)),
               upper = as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", labs2)))

tmp <- expand.grid(labs1, labs2)
groups <- cbind(lower1 = as.numeric(sub("\\((.+),.*", "\\1", tmp[,1])),
                upper1 = as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", tmp[,1])),
                lower2 = as.numeric(sub("\\((.+),.*", "\\1", tmp[,2])),
                upper2 = as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", tmp[,2])))

for (i in 1:1000){
  for (j in 1:36){
    if (x1[i] >= groups[j,1] & x1[i] <= groups[j,2] &
        x2[i] >= groups[j,3] & x2[i] <= groups[j,4]){
      data$group[i] <- j
    }
  }
}
```


Answer Source

You can combine `apply()`, which iterates over the rows of your `data.frame`, with `which()`, which checks each row against every row of your `groups` matrix at once:

```
# For each row of data, find the row of groups whose bounds contain (x1, x2)
data$group <- apply(data, 1, FUN = function(dataRow)
  which(
    dataRow[1] >= groups[, 1] &
    dataRow[1] <= groups[, 2] &
    dataRow[2] >= groups[, 3] &
    dataRow[2] <= groups[, 4]))
```
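As a possible alternative that avoids both the loops and the label parsing, here is a minimal sketch that works directly with the factors returned by `cut()`. The names `bin1`, `bin2`, and `counts` are introduced here for illustration and are not part of the original code. The group id is built so that the `x1` bin varies fastest, which is the row order `expand.grid(labs1, labs2)` produces. Observations very close to a bin edge may be numbered differently than in the loop version, because the loop compares against bounds parsed from rounded labels. The two-way `table()` call also gives the 6 by 6 grid of counts asked about in the question.

```r
set.seed(123)
x1 <- rnorm(1000)
x2 <- rnorm(1000)
data <- data.frame(x1, x2)

# Bin each variable into 6 equal-width intervals; cut() returns a factor
bin1 <- cut(data$x1, 6)
bin2 <- cut(data$x2, 6)

# Combine the two bin indices into one group id in 1..36.
# The x1 bin varies fastest, as in expand.grid(labs1, labs2).
data$group <- as.integer(bin1) + 6L * (as.integer(bin2) - 1L)

# 6 x 6 grid of counts: rows are x1 bins, columns are x2 bins
counts <- table(x1 = bin1, x2 = bin2)
print(counts)
```

Because `table()` is given two factors, it returns a two-dimensional table rather than a flat list of 36 totals, and bins with no observations still appear with a count of zero.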
