Alex Reynolds, 4 years ago
R Question

Counting values within levels in R

I have a set of levels in R that I generate with cut, e.g. fractional values between 0 and 1, broken down into bins of width 0.1:

> frac <- cut(c(0, 1), breaks=10)
> levels(frac)
[1] "(-0.001,0.1]" "(0.1,0.2]" "(0.2,0.3]" "(0.3,0.4]" "(0.4,0.5]"
[6] "(0.5,0.6]" "(0.6,0.7]" "(0.7,0.8]" "(0.8,0.9]" "(0.9,1]"


Given a vector v containing continuous values in the interval [0.0, 1.0], how do I count the frequency of elements in v that fall within each level in levels(frac)?

I may change the number of breaks and/or the interval from which I make the levels, so I'm looking for a way to do this with standard R commands. The goal is a two-column data frame: one column for the levels as factors, and a second column for the fraction or percentage of the elements of v that fall within each level.

Note: The following does not work:

> table(frac)
frac
(-0.001,0.1]    (0.1,0.2]    (0.2,0.3]    (0.3,0.4]    (0.4,0.5]    (0.5,0.6]
           1            0            0            0            0            0
   (0.6,0.7]    (0.7,0.8]    (0.8,0.9]      (0.9,1]
           0            0            0            1


If I use cut on v directly, then I do not get the same levels when I run cut on different vectors. The range of values (their minimum and maximum) differs between arbitrary vectors, so even with the same number of breaks, the level intervals will not be the same.

My goal is to take different vectors and bin them to the same set of levels. Hopefully this helps clarify my question. Thanks for any assistance.

Answer
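# Explicit break points, so every vector is binned into the same intervals
frac = seq(0, 1, by=0.1)

# Human-readable labels such as "0 - 0.1", "0.1 - 0.2", ...
ranges = paste(head(frac,-1), frac[-1], sep=" - ")
# Count the elements of v in each bin without drawing the histogram
freq   = hist(v, breaks=frac, include.lowest=TRUE, plot=FALSE)

data.frame(range = ranges, frequency = freq$counts)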
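
The question also asks for fractions rather than raw counts, with the levels kept as a factor. A minimal sketch of one way to get that in base R is to pass the same explicit breaks to cut and tabulate the result. The helper name bin_fractions and the sample vectors v1 and v2 are made up for illustration, and the breaks are assumed to run from 0 to 1 in steps of 0.1:

```r
# Fixed breaks shared by every vector, so the levels never depend on the data
breaks <- seq(0, 1, by = 0.1)

# Hypothetical helper: bin a vector into the shared levels and return a
# two-column data frame with the level and the fraction of elements in it
bin_fractions <- function(v, breaks) {
  # include.lowest = TRUE puts values equal to the first break into the first bin
  bins <- cut(v, breaks = breaks, include.lowest = TRUE)
  # table() on a factor reports every level, including empty ones
  counts <- table(bins)
  data.frame(
    level    = factor(names(counts), levels = levels(bins)),
    fraction = as.vector(counts) / length(v)
  )
}

# Made-up sample data (illustrative only)
v1 <- c(0.00, 0.05, 0.33, 0.71, 1.00)
v2 <- c(0.42, 0.48, 0.95)

res1 <- bin_fractions(v1, breaks)
res2 <- bin_fractions(v2, breaks)
print(res1)
print(res2)
```

Because the breaks are passed explicitly, both results share the same ten levels regardless of each vector's minimum and maximum. Multiplying the fraction column by 100 gives percentages. Values outside the breaks become NA in cut and are not counted in any level, although this sketch still includes them in the denominator.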