randombits randombits - 3 months ago 15
MySQL Question

Properly using the HAVING clause in query

I have a query that looks like the following,

SELECT a.day, a.user_id, DATEDIFF(CURDATE(), MAX(b.days_since_login)) as days_since_login
FROM stats a, users b
WHERE a.user_id = b.user_id
AND a.active = 1
GROUP BY
a.user_id
HAVING
a.day = days_since_login


The problem I have is that if the
stats
table has more than one record for the user, zero rows return. If there is exactly one row, it'll return it.

I think my use of
HAVING
is incorrect here. The query planner seems to only filter based on the result set from the WHERE clause. Ideally, I'd have this illegal query:

SELECT a.day, a.user_id, DATEDIFF(CURDATE(), MAX(b.days_since_login)) as days_since_login
FROM stats a, users b
WHERE a.user_id = b.user_id
AND a.active = 1
AND a.day = days_since_login
GROUP BY
a.user_id


But that's not possible.

Answer

The problem is that when you use GROUP BY, any of the columns outside the grouped column will be fetched from an unpredictable row in the group. It won't search the group for rows that match the HAVING clause, because that filtering is done after grouping.

You need to join with a subquery that performs the grouping.

SELECT a.day, a.user_id, b.days_since_login
FROM stats AS a
JOIN (SELECT user_id, DATEDIFF(CURDATE(), MAX(b.days_since_login)) as days_since_login 
      FROM users 
      GROUP BY user_id) AS b
ON a.user_id = b.user_id AND a.day = b.days_since_login
WHERE a.active = 1