kinadian kinadian - 5 months ago 18
MySQL Question

Correlated Subquery with Join

I have 3 tables: alert, user_to_alert, and user_device. The device table holds a list of tokens and and includes a user_id field.

The user_to_alert table holds records of when a user reads an alert. If a record doesn't exist for the appropriate alert_id/user_id combo, it means the user hasn't read the alert. There is also an alert_read column that can be zero (if the user marks the alert as unread after reading it).

I have a query that loads a list of device tokens and I need to expand this to include the number of unread alerts. This is the query I've come up with

SELECT d.*, (
SELECT COUNT(*) FROM `alert` AS a
LEFT JOIN `user_to_alert` AS uta ON
( uta.`alert_id` = a.`alert_id` AND uta.`user_id` = d.`user_id` )
WHERE uta.`alert_read` = 0 OR uta.`alert_read` IS NULL) AS user_badge_number
FROM `user_device` AS d
WHERE d.`device_type` = 'iOS' AND d.`device_active` = 1
GROUP BY d.`device_token`


This query gives me error
#1054 - Unknown column 'd.user_id' in 'on clause'


I've setup an SQLFiddle with the schema and query: http://sqlfiddle.com/#!9/ee653/2

If I run my sub-query separately and substitute d.user_id for a valid user_id, I get the expected results (i.e. 3 for user_id 10). I assume this means the problem is where I'm referencing the column d.user_id. I guess MySQL is trying to perform the subquery first so d.user_id isn't a valid column yet? That's just a guess though.

I've tried moving the
uta.user_id = d.user_id
part of my ON clause to the WHERE clause but that just gives all rows a 0 for the user_badge_number column.

I'm having trouble finding anything via Google that relates to this issue. Everything I find about correlated subqueries seems to be using them in ON or WHERE clauses instead of using them to retrieve extra data.

This is being run via a PHP script, so I know that I could run the sub query by itself within the result loop of the PHP script. Obviously, I'd rather figure out how to do it with one query as that would be more efficient.

Answer

As you noted, its not hard to count the number of alerts that were read. You can subtract that from total alerts to get the number unread:

SELECT d.*, 
    ( SELECT COUNT(*) FROM `alert` ) - 
    ( SELECT COUNT(*) FROM `user_to_alert` 
        WHERE `user_id` = d.`user_id` AND `alert_read` = 1 )
FROM `user_device` AS d 
WHERE d.`device_active` AND d.`device_type` = 'iOS';

But note there is a problem with the design though, which is that if a device is added 6 months from now, all the alerts from the past will show up as unread on that new device. There are similar problems if a device gets reassigned to a new user, or an existing user switches devices (depending on whether you are counting unread alerts per user or per device).

To get around this you could insert rows into user_to_alert to mark all the old alerts read, every time a device is added. But I think the better answer is to insert rows into user_to_alert as each alert is created. This would have several advantages:

  • easy to count the number of unread rows for a user or device
  • new devices/users don't inherit old alerts
  • lets you add a new feature: alerts that are directed to only some users/devices
Comments