Brian Greenwood Brian Greenwood - 5 days ago 6
MySQL Question

Simplify sum of sum queries in mysql

I have multiple tables that follow a reimbursement request from its costing phase to its invoicing phase, recording line items and their values at each stage of the request.

To get the totals for each expense category at each phase, I summed them using a select - what I have not been able to figure out is an efficient way to sum their grand totals for the entire phase without having to doing another sum request. Since this queries hundreds of thousands of rows - the response time is suffering. The current method takes just under 4.59 seconds - while when I remove the grand total sums it takes 4.02 seconds - any advice on how to alias the original sums to get a grand total would be most welcome:

SELECT p.program_name, p.id as program_id, p.entity_id, a.id as account_id,


(SELECT COALESCE(SUM(amount),0) FROM cost_items c WHERE expense_category_id = 9 and c.account_id = a.id) as NIPLoanCost,
(SELECT COALESCE(SUM(amount),0) FROM cost_items c WHERE expense_category_id = 2 and c.account_id = a.id) as AcquisitionCost,
(SELECT COALESCE(SUM(amount),0) FROM cost_items c WHERE expense_category_id = 3 and c.account_id = a.id) as PreDemoCost,
(SELECT COALESCE(SUM(amount),0) FROM cost_items c WHERE expense_category_id = 4 and c.account_id = a.id) as DemolitionCost,
(SELECT COALESCE(SUM(amount),0) FROM cost_items c WHERE expense_category_id = 5 and c.account_id = a.id) as GreeningCost,
(SELECT COALESCE(SUM(amount),0) FROM cost_items c WHERE expense_category_id = 6 and c.account_id = a.id) as MaintenanceCost,
(SELECT COALESCE(SUM(amount),0) FROM cost_items c WHERE expense_category_id = 7 and c.account_id = a.id) as AdministrationCost,
(SELECT COALESCE(SUM(amount),0) FROM cost_items c WHERE expense_category_id = 8 and c.account_id = a.id) as OtherCost,

(SELECT COALESCE(SUM(amount),0) FROM cost_items c WHERE c.account_id = a.id)
as TotalCost,

/* I have removed the other 4 table's requests that follow this same methodology */

FROM programs p, accounts a WHERE p.entity_id = a.entity_id;

Answer

You can use conditional aggregation with a join to the cost_items table to replace all the subqueries. Something like this should work:

SELECT p.program_name,
       p.id as program_id,
       p.entity_id,
       t.*
FROM programs p
INNER JOIN accounts a
    ON p.entity_id = a.entity_id
INNER JOIN
(
    SELECT a.id AS account_id,
           SUM(CASE WHEN c.expense_category_id = 9 THEN c.amount ELSE 0 END) AS NIPLoanCost,
           SUM(CASE WHEN c.expense_category_id = 2 THEN c.amount ELSE 0 END) AS AcquisitionCost,
           SUM(CASE WHEN c.expense_category_id = 3 THEN c.amount ELSE 0 END) AS PreDemoCost,
           SUM(CASE WHEN c.expense_category_id = 4 THEN c.amount ELSE 0 END) AS DemolitionCost,
           SUM(CASE WHEN c.expense_category_id = 5 THEN c.amount ELSE 0 END) AS GreeningCost,
           SUM(CASE WHEN c.expense_category_id = 6 THEN c.amount ELSE 0 END) AS MaintenanceCost,
           SUM(CASE WHEN c.expense_category_id = 7 THEN c.amount ELSE 0 END) AS AdministrationCost,
           SUM(CASE WHEN c.expense_category_id = 8 THEN c.amount ELSE 0 END) AS OtherCost,
           COALESECE(SUM(c.amount), 0) AS TotalCost
    FROM accounts a
    LEFT JOIN cost_items c
        ON a.id = c.account_id
    GROUP BY a.id
) t
    ON a.id = t.account_id

Note that the flaw with your original approach was correlated subqueries like this one:

(SELECT COALESCE(SUM(amount),0) FROM cost_items c WHERE expense_category_id = 9 and c.account_id = a.id)

For every record in the joined product of the programs and accounts table you were issuing a separate query (many of them actually) which was scanning the entire cost_items table to determine the sum. In my approach, we make a single pass through the join of accounts and cost_items, and this is where the time is saved.

Comments