danspants danspants - 7 months ago 21
SQL Question

Mysql cumulative total only working sometimes

I'm using the following SQL to calculate the running total by date of a data processing function:

set @running_total := 0;
select date(event_date), (@running_total := @running_total + count(distinct de.iddocument)) AS cumulative_sum , count(distinct de.iddocument)
from document_event as de
left join document as d on d.iddocument = de.iddocument
where d.iddatastream = 142
and de.event_type = 'RESEARCHED'
and de.update_by_id is not null
group by date(event_date);


This SQL has been working perfectly for a year, however on the latest batch of data it no longer calculates the running total, it only displays each days total.

e.g. this is what it does on my older data:

+------------------+----------------+-------+
| date(event_date) | cumulative_sum | count |
+------------------+----------------+-------+
| 2015-11-09 | 167 | 167 |
| 2015-11-10 | 329 | 162 |
| 2015-11-11 | 775 | 446 |
| 2015-11-12 | 1151 | 376 |
| 2015-11-13 | 1680 | 529 |
| 2015-11-16 | 2266 | 586 |
| 2015-11-17 | 2837 | 571 |
| 2015-11-18 | 3590 | 753 |
| 2015-11-19 | 4162 | 572 |
+------------------+----------------+-------+


and this is what it does on my newest data:

+------------------+----------------+-------+
| date(event_date) | cumulative_sum | count |
+------------------+----------------+-------+
| 2016-04-20 | 6 | 6 |
| 2016-04-21 | 91 | 91 |
| 2016-04-22 | 151 | 151 |
| 2016-04-26 | 239 | 239 |
| 2016-04-27 | 203 | 203 |
| 2016-04-28 | 312 | 312 |
| 2016-04-29 | 374 | 374 |
| 2016-05-02 | 368 | 368 |
| 2016-05-03 | 226 | 226 |
+------------------+----------------+-------+


How is it possible that the running total is no longer being calculated?

Any ideas appreciated!

Answer

Cumulative sums and aggregations sometimes don't mix. Try this:

select dte,
       (@running_total := @running_total + cnt) AS cumulative_sum, 
       cnt
from (select date(event_date) as dte, count(distinct de.iddocument) as cnt
      from document_event de left join
           document d
           on d.iddocument = de.iddocument
      where d.iddatastream = 142 and
            de.event_type = 'RESEARCHED' and
            de.update_by_id is not null
      group by date(event_date)
      order by date(event_date)
     ) cross join
     (select @running_total := 0) params;