Amstell - 4 months ago 12

R Question

I'm trying to find the sum of each bin given a random vector, but the code is only returning the first element of the vector as 100. How would I cycle through each of the elements in the vector `x` for each bin `j`? I realize there are functions to do this in `R`.

```
# Sample data
set.seed(1234)
x <- rnorm(100)
S <- range(x)
a <- range(x)[1]
b <- range(x)[2]
J <- 5 # bins
h <- (b - a)/J # interval
for (j in 1:J){
  for (n in 1:length(x)){
    ifelse(x[n] > a + (j-1)*h & (x[n] <= a + j*h), n[j] <- n[j] + 1, n[j] <- n[j] + 0)
  }
}
```

Output:

```
> n
[1] 100 NA NA NA NA
```

Desired Output:

```
> n
[1] 7 43 29 13 8
```

Answer

Why not use `cut` and `table`?

```
set.seed(1234)
x <- rnorm(100)
bin <- cut(x, breaks = 5) ## evenly cut `range(x)` into 5 bins
levels(bin)
# [1] "(-2.35,-1.37]" "(-1.37,-0.388]" "(-0.388,0.591]" "(0.591,1.57]"
# [5] "(1.57,2.55]"
table(bin)
#  (-2.35,-1.37] (-1.37,-0.388] (-0.388,0.591]   (0.591,1.57]    (1.57,2.55]
#              7             43             29             13              8
```
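If you want the bins to start exactly at `min(x)` and end exactly at `max(x)`, as in the question's `a` and `b`, one option is to pass explicit break points to `cut` and set `include.lowest = TRUE` so the minimum is not dropped from the first bin. The following is a minimal sketch under that assumption; the name `edges` is introduced here only for illustration.

```r
set.seed(1234)
x <- rnorm(100)
J <- 5                                   # number of bins

# J + 1 equally spaced edges from min(x) to max(x)
edges <- seq(min(x), max(x), length.out = J + 1)

# Bins are (lower, upper], except the first, which include.lowest = TRUE
# turns into [lower, upper] so that min(x) is counted
bin <- cut(x, breaks = edges, include.lowest = TRUE)
counts <- as.vector(table(bin))
counts
```

Every element of `x` lands in exactly one bin, so `sum(counts)` equals `length(x)`.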

Still, it is worth showing why your loop fails. Note that you don't need `ifelse`; an ordinary `if (...) ...` is sufficient. The error is that you used `n` as the loop index but also used it to record the counts! The following corrects this by using a new vector, `counts`, that is distinct from `n`:

```
counts <- integer(J) ## initialization
for (j in 1:J){
  for (n in 1:length(x)) {
    if (x[n] > a + (j-1)*h && x[n] <= a + j*h) counts[j] <- counts[j] + 1L
  }
}
counts
# [1] 6 43 29 13 7
```
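To see where the `100 NA NA NA NA` in the question comes from, it may help to replay only the last pass of the original loop. A `for` loop reassigns its index variable at the start of every iteration, so whatever was stored in `n` before is overwritten by a single number. The sketch below mimics that final iteration by hand; it is an illustration, not the original code.

```r
# What the last iteration of the question's loop effectively does
n <- 100L          # the for loop sets n to the current index, discarding earlier contents
j <- 5             # last bin
n[j] <- n[j] + 0   # n[5] does not exist yet, so it reads as NA, and NA + 0 is NA
n
# [1] 100  NA  NA  NA  NA
```

Assigning to position 5 of a length-1 vector pads positions 2 to 4 with `NA`, which is why only the first element survives. Adding `1` instead of `0` gives the same result, since `NA + 1` is also `NA`.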

Perhaps you have noticed that the first value is `6`, not `7`. This is because your loop condition `x[n] > a + (j-1)*h && x[n] <= a + j*h` does not include the lowest value in the first bin: the minimum of `x` equals `a`, and the test `x[n] > a` is strict. Since this is always the case, you need to manually add `1` to `counts[1]`.
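As an alternative to adjusting `counts[1]` afterwards, the loop itself can treat the first bin as closed on the left. The sketch below is one way to do that, assuming you still want an explicit double loop; `edges`, `i`, and `lower_ok` are names introduced here for illustration. It also takes the bin edges from `seq`, so the last edge is exactly `max(x)`.

```r
set.seed(1234)
x <- rnorm(100)
J <- 5                                            # number of bins
edges <- seq(min(x), max(x), length.out = J + 1)  # bin edges, last one is max(x)

counts <- integer(J)
for (j in seq_len(J)) {
  for (i in seq_along(x)) {
    # First bin is [lower, upper]; all other bins are (lower, upper]
    lower_ok <- if (j == 1) x[i] >= edges[j] else x[i] > edges[j]
    if (lower_ok && x[i] <= edges[j + 1]) counts[j] <- counts[j] + 1L
  }
}
counts
```

Using `i` rather than `n` as the index keeps the loop variable clearly separate from the vector that stores the results.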