superherogeek superherogeek - 7 months ago 51
SQL Question

How to get average number of days between multiple dates in the same column in MYSQL

I would like to find out the average number of days between orders grouping by account_id in the database.

Let's say I have the following table named 'orders' with this data.

id account_id account_name order_date
1 555 Acme Fireworks 2015-06-15
2 342 Kent Brewery 2015-09-12
3 555 Acme Fireworks 2015-09-15
4 342 Kent Brewery 2015-10-12
5 342 Kent Brewery 2015-11-12
6 342 Kent Brewery 2015-12-12
7 555 Acme Fireworks 2015-12-15
8 900 Plastic Inc. 2015-12-20


I would like a query to produce the following results

account_id account_name average_days_between_orders
342 Kent Brewery 30.333
555 Acme Fireworks 91.5
900 Plastic Inc. (unsure of what value would go here since there's 1 order only)


I checked the following questions to get an idea, but still couldn't figure out the problem:



Thanks!

Answer

You need a query that produces the difference between the previous purchase for a given (null if there is no previous purchase) and take the average of these values.

I would self-join the above table to get for each order the maximum order date of any previous order in a subquery. In the avg() function calculate the difference between the calculated date and the current order date:

SELECT o3.account_id, o3.account_name, avg(diff) as average_days_between_orders
FROM
    (select o1.id,
            o1.account_id,
            o1.account_name,
            datediff(o1.order_date, max(o2.order_date)) as diff
     from orders o1
     left join orders o2 on o1.account_id=o2.account_id and o1.id>o2.id
     group by o1.id, o1.account_id, o1.account_name, o1.order_date) o3
GROUP BY o3.account_id, o3.account_name

As an alternative to joins, you can use a user defined variable in the subquery or a correlated subquery in the select list to calculate the differences. You can check mysql running total solutions to get a hang of this solution, such as this SO topic. Specifically, check out the solution provided by Andomar.

If your orders table is huge, then the alternative aprroaches described in that topic may be better from a performance point of view.