HelloMundo HelloMundo -4 years ago 106
SQL Question

ActiveRecord Query that sums values in multiple associated tables

I'm using RoR 4.2.4 trying to create an index page where every row shows some buyer info and three values summed from associated tables. I feel like this kind of query must happen all the time, and I'm just missing something really simple.

class Buyer < ActiveRecord::Base
has_many :wins
has_many :pledges
has_many :payments


And wins, pledges and payments belong_to buyer.

This works:

@buyers = Buyer.joins("LEFT OUTER JOIN pledges on buyers.id = pledges.buyer_id")
.where(event_id: @event.id)
.select("buyers.*, sum(pledges.amount) as pledges_total")
.group('buyers.id')
.order('buyers.last_name')


I get a list of Buyers and buyer.pledges_total gives me the sum of their pledges.

But in that same query, I also want to get sum(wins.amount) and sum(payments.amount).

This DOESN'T work:

@buyers = Buyer.joins("LEFT OUTER JOIN wins on buyers.id = wins.buyer_id")
.joins("LEFT OUTER JOIN pledges on buyers.id = pledges.buyer_id")
.joins("LEFT OUTER JOIN payments on buyers.id = payments.buyer_id")
.where(event_id: @event.id)
.select("buyers.*, sum(wins.price) as wins_total, sum(pledges.amount) as pledges_total, sum(payments.amount) as payments_total")
.group('buyers.id')
.order('buyers.last_name')


I get weird incorrect values that make me think I'm summing after the join. But I'm really not sure, and I don't know how else to do it. I'm sure I'll be amazed at how simple this really is.

Thanks for the help.

Answer Source

The SUMs will not work correctly with the multiple JOINs you have. Rather than summing over the entire set of results try moving that operation into sub-queries.

@buyers = Buyer
  .where(event_id: @event.id)
  .select(<<-SELECT)
    buyers.*, 
    (SELECT SUM(wins.price) FROM wins WHERE wins.buyer_id = buyers.id) as wins_total, 
    (SELECT SUM(pledges.amount) FROM pledges WHERE pledges.buyer_id = buyers.id) as pledges_total, 
    (SELECT SUM(payments.amount) FROM payments WHERE payments.buyer_id = buyers.id) as payments_total
  SELECT
  .group('buyers.id')
  .order('buyers.last_name')

Note that the JOINs from your original query are no longer necessary.

Update

Here is an explanation of why your SUMs were incorrect when you had more than one JOIN in your query.

Suppose you have the following data:

# SELECT * FROM buyers;
 id  │ event_id
─────┼───────────
   1 │        1

# SELECT * FROM wins;
 id  │ buyer_id │ price
─────┼──────────┼───────
   2 │        1 │    10
   3 │        1 │    20

# SELECT * FROM pledges;
 id  │ buyer_id │ amount
─────┼──────────┼────────
   4 │        1 │     30
   5 │        1 │     40

SQL join returns the Cartesian product of the given record sets. That means that the results of a join may contain repeated values from individual tuples (rows). In the example below we can see that each tuple from wins and pledges is repeated twice. SQLFiddle

# SELECT buyers.id, wins.id AS wins_id, wins.price AS wins_price, pledges.id AS pledges_id, pledges.amount AS pledges_amount FROM buyers 
# INNER JOIN wins ON wins.buyer_id = buyers.id
# INNER JOIN pledges ON pledges.buyer_id = buyers.id;
 id  │ wins_id │ wins_price │ pledges_id │ pledges_amount
─────┼─────────┼────────────┼────────────┼────────────────
   1 │       2 │         10 │          4 │             30
   1 │       2 │         10 │          5 │             40
   1 │       3 │         20 │          4 │             30
   1 │       3 │         20 │          5 │             40

We can easily look back at the wins and pledges tables and see that the sum of win prices equals 30 and the sum of pledges equals 70. However, if we group by id (buyers.id) and performs the sums then we end up with incorrect values which are double the amount that they should be! SQLFiddle

# SELECT buyers.id, sum(wins.price) AS wins_total, sum(pledges.amount) AS pledges_total FROM buyers 
# INNER JOIN wins ON wins.buyer_id = buyers.id
# INNER JOIN pledges ON pledges.buyer_id = buyers.id
# GROUP BY buyers.id;
 id  │ wins_total │ pledges_total
─────┼────────────┼───────────────
   1 │         60 │           140

You can see that using sub-selects returns the correct results in this SQLFiddle.

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download