DanJ - 1 year ago 159
R Question

# How to combine and summarize R data.table row values from tables of different sizes?

I have a table of (x,y) points and would like to create a second table that summarizes those points.

I would like each row in the summary table to show the sum of all the y's where x is greater than that row's threshold, for a sequence of thresholds. But I'm having trouble figuring out how to get the threshold value of the row into the inner sum.

I've gotten this far:

``````
library(data.table)

samples <- data.table(x=seq(1,100,1), y=seq(1,100,1))
thresholds = seq(10,100,10)
thresholdedSums <- data.table(xThreshold=thresholds, ySumWhereXGreaterThanThreshold=sum(samples[x > xThreshold, y]))

# Error in eval(expr, envir, enclos) : object 'xThreshold' not found
``````

How would I accomplish this, or is there a different way to do this sort of thing?

To clarify desired output:

``````
thresholdedSums =
[
(row 1) threshold = 10, ySumWhereXGreaterThanThreshold = sum of all y values in samples[] where x > 10,
(row 2) threshold = 20, ySumWhereXGreaterThanThreshold = sum of all y values in samples[] where x > 20,
... etc ...
]
``````

Answer Source

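The original attempt fails because `xThreshold` is only the name of a column being created in the `data.table()` call; it is not an object visible inside `samples[...]`. The following code produces the desired result. It is not a pure data.table solution, because it uses `sapply` to iterate over the thresholds, but it works reliably.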

``````
thresholdedSums <- data.table(
  thres = thresholds,
  Sum = sapply(thresholds, function(thres) samples[x > thres, sum(y)])
)

#    thres  Sum
# 1:    10 4995
# 2:    20 4840
# 3:    30 4585
# 4:    40 4230
# 5:    50 3775
# 6:    60 3220
# 7:    70 2565
# 8:    80 1810
# 9:    90  955
# 10:   100   0
``````

Additional explanation: `sapply(thresholds, function(thres) samples[x > thres, sum(y)])` returns a vector of the same length as `thresholds`. You can read it as: for every element of `thresholds`, call `function(thres) samples[x > thres, sum(y)]` and collect the results in a vector. Compared with an explicit `for` loop, this is usually more concise and easier to read; any speed difference is typically small.
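To make the comparison with a loop concrete, here is a minimal sketch that computes the same sums with an explicit `for` loop and with `vapply`, a stricter relative of `sapply`. The names `loopSums` and `vapplySums` are only illustrative and do not appear in the question.

```r
library(data.table)

samples <- data.table(x = seq(1, 100, 1), y = seq(1, 100, 1))
thresholds <- seq(10, 100, 10)

# Explicit loop: preallocate the result, then fill one element per threshold.
loopSums <- numeric(length(thresholds))
for (k in seq_along(thresholds)) {
  loopSums[k] <- samples[x > thresholds[k], sum(y)]
}

# vapply() iterates like sapply(), but also checks that every call
# returns exactly one numeric value.
vapplySums <- vapply(
  thresholds,
  function(thres) samples[x > thres, sum(y)],
  numeric(1)
)

# Both approaches should agree element by element.
identical(loopSums, vapplySums)
```

The loop needs the preallocation and index bookkeeping that `sapply` or `vapply` hide, which is the main readability difference.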

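If a solution that stays inside data.table is preferred, a non-equi join is one possible alternative. This is a sketch rather than part of the original answer: it assumes data.table 1.9.8 or later (where non-equi joins are available), and the names `thresholdTable` and `joinSums` are hypothetical.

```r
library(data.table)

samples <- data.table(x = seq(1, 100, 1), y = seq(1, 100, 1))
thresholdTable <- data.table(thres = seq(10, 100, 10))

# For each row of thresholdTable, match all rows of samples with x > thres
# and sum their y values. by = .EACHI evaluates j once per threshold row.
# na.rm = TRUE turns the unmatched threshold (100) into a sum of 0
# instead of NA.
joinSums <- samples[
  thresholdTable,
  .(Sum = sum(y, na.rm = TRUE)),
  on = .(x > thres),
  by = .EACHI
]

# The join column is returned under the name "x" but holds the threshold
# values, so rename it for clarity.
setnames(joinSums, "x", "thres")

joinSums
```

This avoids calling `samples[...]` once per threshold, which may matter when the number of thresholds is large.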