gerky gerky - 2 months ago 8
SQL Question

How to exclude rows when using a LEFT JOIN (MySQL)

I have users with many posts. I want to build an SQL query that would do the following in 1 query (no subquery), and hopefully no unions if possible. I know I can do this with union but I want to learn if this can be done using only joins.

I want to get a list of distinct active users who:


  1. have no posts

  2. have no active posts



Here's what I have so far:

SELECT DISTINCT u.*
FROM users u
LEFT JOIN posts p
ON p.user_id = u.id
LEFT JOIN posts p2
ON p2.user_id = u.id
WHERE u.status = 'active'
AND (p.status IS NULL
OR p2.status != 'approved');


The problem is when a user has multiple posts and one is active. This will still return the user which I do not want. If a user has an active post, he should be removed from the result set. Any ideas?

Here's what the data looks like:

mysql> select * from users;
+----+---------+
| id | status |
+----+---------+
| 1 | active |
| 2 | pending |
| 3 | pending |
| 4 | active |
| 5 | active |
+----+---------+
5 rows in set (0.00 sec)

mysql> select * from posts;
+----+---------+----------+
| id | user_id | status |
+----+---------+----------+
| 1 | 1 | approved |
| 2 | 1 | pending |
| 3 | 4 | pending |
+----+---------+----------+
3 rows in set (0.00 sec)


The answer here should be only users 4 and 5. 4 doesn't have an approved post and 5 doesn't have a post. It should not include 1, which has an approved post.

Answer

Taking your requirements and translating them literally to SQL, I get this:

SELECT users.id,
       COUNT(posts.id) as posts_count,
       COUNT(approved_posts.id) as approved_posts_count
FROM users
LEFT JOIN posts ON posts.user_id = users.id
LEFT JOIN posts approved_posts
  ON approved_posts.status = 'approved'
  AND approved_posts.user_id = users.id
WHERE users.status = "active"
GROUP BY users.id
HAVING (posts_count = 0 OR approved_posts_count = 0);

For your test data above, this returns:

4|1|0
5|0|0

i.e. users with ids 4 and 5, the first of which has 1 post but no approved posts and the second of which has no posts.

However, it seems to me that this can be simplified since any user that has no approved posts will also have no posts, so the union of conditions is unnecessary.

In that case, the SQL is simply:

SELECT users.id,
       COUNT(approved_posts.id) as approved_posts_count
FROM users
LEFT JOIN posts approved_posts
  ON approved_posts.status = 'approved'
  AND approved_posts.user_id = users.id
WHERE users.status = "active"
GROUP BY users.id
HAVING approved_posts_count = 0;

This also returns the same two users. Am I missing something?