user3375672 - 3 months ago
R Question

R: find data frame index of multiple conditions

Given two data frames s and q with five observations each:

set.seed(8)
s <- data.frame(id=sample(c('Z','X'), 5, T),
t0=sample(1:10, 5, T),
t1 = sample(11:30, 5, T))

q <- data.frame(id=sample(c('Z','X'), 5, T),
t0=sample(1:10, 5, T),
t1 = sample(11:30, 5, T))


> s
id t0 t1
1 Z 8 20
2 Z 3 12
3 X 10 19
4 X 8 21
5 Z 7 13

> q
id t0 t1
1 X 3 30
2 Z 5 12
3 Z 7 23
4 Z 3 21
5 X 7 27


The midpoint of each observation's interval between t0 and t1 is (e.g. for the s data):

s$t0+(s$t1-s$t0)/2


To find the index of the (first) observation in s whose midpoint is closest to the midpoint of, say, the first observation in q, I can do:

i <- which.min(abs((s$t0+(s$t1-s$t0)/2 - (q$t0[1]+(q$t1[1]-q$t0[1])/2)))
s[i,]


gives:

id t0 t1
3 X 10 19


But I cannot figure out how to find the same index in the original data s if I also want to condition on the id variable (e.g. pseudocode like which.min(....) & s$id == q$id[1]; in this case the closest midpoint is sought only among rows with id 'X'). This SO question is close but not spot on.
Again: I need an index that can be used on the original 5-row data set.

Answer

Set the values passed to which.min to infinity wherever your condition does not hold, so those rows can never be picked as the minimum:

val <- abs((s$t0+(s$t1-s$t0)/2 - (q$t0[1]+(q$t1[1]-q$t0[1])/2))
val[s$id != q$id[1]] <- Inf
i <- which.min(val)
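
As a minimal sketch of the masking step above, the values printed in the question can be typed in by hand (an assumption made here so the result does not depend on how sample() behaves in a particular R version). The helper name mid is hypothetical and only used for illustration.

```r
# Data copied from the printed output in the question
s <- data.frame(id = c("Z", "Z", "X", "X", "Z"),
                t0 = c(8, 3, 10, 8, 7),
                t1 = c(20, 12, 19, 21, 13),
                stringsAsFactors = FALSE)
q <- data.frame(id = c("X", "Z", "Z", "Z", "X"),
                t0 = c(3, 5, 7, 3, 7),
                t1 = c(30, 12, 23, 21, 27),
                stringsAsFactors = FALSE)

# Hypothetical helper: midpoint of each [t0, t1] interval
mid <- function(d) d$t0 + (d$t1 - d$t0) / 2

# Distance from every midpoint in s to the midpoint of the first row of q
val <- abs(mid(s) - mid(q)[1])

# Rows whose id differs from q$id[1] can never be the minimum
val[s$id != q$id[1]] <- Inf

i <- which.min(val)  # index into the original, unfiltered s
s[i, ]
```

One caveat with this approach: if no row of s shares the id, every entry of val is Inf and which.min still returns 1, so a check such as is.finite(val[i]) may be worth adding.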

By the way, you can simplify the expression in the first line to:

val <- abs((s$t0+s$t1)/2-(q$t0[1]+q$t1[1])/2)

or even

val <- abs(s$t0+s$t1-q$t0[1]-q$t1[1])/2
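
The three forms of the distance can be checked against each other, and the same idea extends to every row of q. The sketch below again assumes the question's printed values entered by hand; closest_in_s is a hypothetical helper name, not part of the original answer, and returning NA when no id matches is one possible design choice.

```r
# Data copied from the printed output in the question
s <- data.frame(id = c("Z", "Z", "X", "X", "Z"),
                t0 = c(8, 3, 10, 8, 7),
                t1 = c(20, 12, 19, 21, 13),
                stringsAsFactors = FALSE)
q <- data.frame(id = c("X", "Z", "Z", "Z", "X"),
                t0 = c(3, 5, 7, 3, 7),
                t1 = c(30, 12, 23, 21, 27),
                stringsAsFactors = FALSE)

# The original expression and its two simplifications give the same distances
v1 <- abs((s$t0 + (s$t1 - s$t0) / 2) - (q$t0[1] + (q$t1[1] - q$t0[1]) / 2))
v2 <- abs((s$t0 + s$t1) / 2 - (q$t0[1] + q$t1[1]) / 2)
v3 <- abs(s$t0 + s$t1 - q$t0[1] - q$t1[1]) / 2
stopifnot(isTRUE(all.equal(v1, v2)), isTRUE(all.equal(v2, v3)))

# Hypothetical helper: for row j of q, the index in s of the closest
# midpoint among rows with the same id (NA if no row shares the id)
closest_in_s <- function(j) {
  val <- abs(s$t0 + s$t1 - q$t0[j] - q$t1[j]) / 2
  val[s$id != q$id[j]] <- Inf
  if (all(is.infinite(val))) return(NA_integer_)
  which.min(val)
}

idx <- sapply(seq_len(nrow(q)), closest_in_s)
idx
```

Each element of idx can be used directly as a row number in the original s, for example s[idx[2], ] for the second row of q.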