user451151 - 4 days ago
R Question

Apply-family on two lists (to avoid nested for-loops)

Let's say I have the following:

myseq <- seq(0, 1, by = 0.1)
scores <- sample(seq(0, 1, by = 0.01), 10)
var1 <- sample(c(0,1), 10, replace = TRUE)
var2 <- sample(c(0,1), 10, replace = TRUE)
mydf <- data.frame(scores = scores, var1 = var1, var2 = var2)

myseq
[1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

mydf
   scores var1 var2
1    0.10    1    0
2    0.06    1    0
3    0.74    0    0
4    0.15    1    0
5    0.40    1    1
6    0.96    0    0
7    0.04    1    0
8    0.71    0    1
9    0.94    1    1
10   0.38    0    0


For each value in myseq, I want to sum var1 and var2 for the subset of records where scores is greater than that value.

I want to do this only using the apply-family functions (apply, lapply, tapply, sapply, mapply, etc.). In other words, no nested for-loops.

So, for example:

The first value in myseq is 0.0. All scores are greater than 0.0, so I want to return var1 = 6 and var2 = 3.

The second value in myseq is 0.1. Only 7 of the 10 scores are greater than 0.1, so I want to return var1 = 3 and var2 = 3.
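
The two cases above can be checked by hand with a minimal sketch. It assumes the data frame is rebuilt from the printed values (the sample() calls were not seeded, so they cannot be regenerated), and the helper name sums_above is hypothetical, used only for illustration.

```r
# Rebuild mydf from the values printed above; the original sample() calls were unseeded.
mydf <- data.frame(
  scores = c(0.10, 0.06, 0.74, 0.15, 0.40, 0.96, 0.04, 0.71, 0.94, 0.38),
  var1   = c(1, 1, 0, 1, 1, 0, 1, 0, 1, 0),
  var2   = c(0, 0, 0, 0, 1, 0, 0, 1, 1, 0)
)

# Hypothetical helper: sums of var1 and var2 over rows whose score exceeds `cutoff`.
sums_above <- function(cutoff) {
  keep <- mydf$scores > cutoff          # strict comparison, so a score equal to the cutoff is dropped
  c(var1 = sum(mydf$var1[keep]), var2 = sum(mydf$var2[keep]))
}

sums_above(0.0)  # var1 = 6, var2 = 3 (all 10 rows kept)
sums_above(0.1)  # var1 = 3, var2 = 3 (7 rows kept)
```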

...so on and so forth...

In the end, I'd like the final output to be an 11(r) x 2(c) matrix (or data frame or list) containing the sums for each var.

var1 var2
6 3
3 3
...
...


Note: 11(r) is because the length of myseq is 11; 2(c) is because there are two vars, var1 and var2.

Answer

Something like this?

# Index with mydf$scores so the subset does not depend on the global `scores` vector.
res <- t(sapply(myseq, function(x) apply(mydf[mydf$scores > x, 2:3], 2, sum)))
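
For reference, here is a reproducible sketch of the same idea, assuming mydf is rebuilt from the values printed in the question. It swaps sapply for vapply (so the type and length of each result are checked) and replaces apply(..., 2, sum) with an inner vapply over the two columns. Both swaps are substitutions for illustration, not part of the one-liner above.

```r
myseq <- seq(0, 1, by = 0.1)

# Data copied from the question's printed output (the original was randomly sampled).
mydf <- data.frame(
  scores = c(0.10, 0.06, 0.74, 0.15, 0.40, 0.96, 0.04, 0.71, 0.94, 0.38),
  var1   = c(1, 1, 0, 1, 1, 0, 1, 0, 1, 0),
  var2   = c(0, 0, 0, 0, 1, 0, 0, 1, 1, 0)
)

# Outer vapply: one pass per threshold in myseq.
# Inner vapply: sum each of var1 and var2 over the rows that pass the threshold.
res <- t(vapply(
  myseq,
  function(x) {
    keep <- mydf$scores > x
    vapply(mydf[keep, c("var1", "var2")], sum, numeric(1))
  },
  c(var1 = 0, var2 = 0)   # each call must return two numeric values
))

dim(res)   # 11 rows (one per threshold), 2 columns
res[1:2, ] # first two rows match the question: (6, 3) and (3, 3)
```

The last threshold, 1.0, keeps no rows; sum() over an empty column returns 0, so that row comes out as zeros rather than raising an error.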