Lennie - 3 months ago
R Question

data.table roll="nearest" overrides search value in returned row

I'm using the data.table package for R. So far I have really enjoyed the package, and it saves me a lot of computing time.

However, I have run into a problem with the binary search function J() and roll = "nearest".

Let's say I have this example data.table:

Key  Value1  Value2
 20       4       5
 12       2       1
 55      10       7


And now I perform this action on the data.table:

data.table.subset[J(15), roll = "nearest"]


and it returns this:

Key  Value1  Value2
 15       2       1


So it returns the correct row, but the key column now shows the search value (15) instead of the row's original key value (12).

My question: is this normal behaviour, and can this automatic override be changed?

EDIT:

Reproducible example (note that I use version 1.9.7):

library("data.table")
dt <- data.table(c(20,12,55), c(4,2,10), c(5,1,7))
dt
#   V1 V2 V3
#1: 20  4  5
#2: 12  2  1
#3: 55 10  7
setkey(dt, V1)
dt[J(15), roll = "nearest"]
#   V1 V2 V3
#1: 15  2  1

Answer

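You probably need data.table 1.9.7 for the x.V1 prefix to work. With it, you can refer explicitly to a column of the x dataset (here dt). This is required because, in a join x[i], the join columns in the result take their values from the second dataset i, as it is in base R. That is why V1 shows the search value 15 rather than the stored key 12.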

library("data.table")
dt <- data.table(c(20,12,55), c(4,2,10), c(5,1,7))
setkey(dt, V1)
dt[J(15), .(V1=x.V1, V2, V3), roll = "nearest"]
#   V1 V2 V3
#1: 12  2  1

As you mention, you already have 1.9.7. Others who do not have it yet can see the Installation wiki.
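For readers who cannot upgrade, one possible workaround is to ask the join for the matching row number instead of the joined row, and then subset dt by that number. This is only a sketch of an alternative and not part of the approach above. It relies on the which = TRUE argument of [.data.table, and the names idx and nearest_row are made up for illustration.

```r
library("data.table")

dt <- data.table(c(20, 12, 55), c(4, 2, 10), c(5, 1, 7))
setkey(dt, V1)  # sorts dt by V1, so the rows are now ordered 12, 20, 55

# which = TRUE returns the row number(s) of dt that the search value
# rolls to, instead of returning the joined row itself
idx <- dt[J(15), roll = "nearest", which = TRUE]

# Plain row subsetting, so V1 holds the value stored in dt (no join
# column is substituted from i)
nearest_row <- dt[idx]
nearest_row
```

With the example data this should select the row whose key is 12, the same row the x.V1 version returns, while leaving all column names untouched.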