ChadH ChadH - 23 days ago 6
SQL Question

How to select rows with x but not y or z using HAVING in mysql

I am trying to create a query and not sure if I'm using HAVING correctly.
Basically, I have users with certain products. I want to find users that have 'X' product and not ('Y' and/or 'Z'). It's ok for them to have 'X' and/or 'A' and/or 'B', etc. just not 'X' and ('Y' and/or 'Z').

What I have do far but is not removing the users with my criteria.



SELECT *
FROM users_product_version upv
WHERE user_id IN (SELECT upv.user_id
FROM users_product_version upv
JOIN product_version pv ON upv.product_id = pv.id
JOIN product p ON pv.priority_product_id = p.id
WHERE p.id = '89'
GROUP BY upv.user_id)
GROUP BY upv.product_id
HAVING upv.product_id NOT IN ('9562','9563','9564');




Select these users:

1: X, A, B, F

2: X

3: X, G


Don't select these users:

1: X, Y, A

2: Y

3: BLANK

4: X, Y, Z


My
users_product_version
table holds the
product_id
's.
I've also tried
GROUP BY user_id, product_id
. I'm almost certain I'm making this too complicated than it is. Does anyone have any tips? Thank you.

Answer

I would write the query this way:

SELECT u1.*
FROM users_product_version AS u1
LEFT OUTER JOIN users_product_version AS u2
  ON u1.user_id = u2.user_id AND u2.product_id IN ('Y', 'Z')
WHERE u1.product_id = 'X'
  AND u2.user_id IS NULL;

By the WHERE clause, the user must have 'X'.

In the LEFT OUTER JOIN, we try to find another row where that user has either 'Y' or 'Z'. We want there to be no match for such a row, so we have a condition that some column from that table (user_id) must be NULL. This will happen by virtue of the outer join when there is no match.

I think you misunderstand what HAVING is for. HAVING is for applying a condition to a group after you have done GROUP BY.

Writing a condition in the HAVING clause that compares a column to a single scalar value makes no sense. There might be multiple rows in the group, with different values for product_id.

You could use an aggregate function like GROUP_CONCAT() to the group of rows, and then compare that in the HAVING clause:

SELECT *
FROM users_product_version
GROUP BY user_id
HAVING FIND_IN_SET('X', GROUP_CONCAT(product_id))
  AND NOT FIND_IN_SET('Y', GROUP_CONCAT(product_id)) 
  AND NOT FIND_IN_SET('Z', GROUP_CONCAT(product_id));

But this is probably going to be inefficient.


Re your comments, to look up names of the product_id attributes.

You can use parentheses with joins, just like you would with arithmetic, to control the order of evaluation.

Joining:

A LEFT OUTER JOIN (B INNER JOIN C)

Joins B to C to produce a set of rows. Then that becomes the operand of the outer join.

It's important to use parentheses in this example, because

A LEFT OUTER JOIN B INNER JOIN C

would produce a much different result. Supposing the join to C references columns from B, it would require that the columns from B are non-NULL. This would implicitly turn the OUTER JOIN B into an INNER JOIN B, because it wouldn't allow cases where there was no B matching A.

Here's how to apply it to your query:

SELECT u1.*
FROM users_product_version AS u1
LEFT OUTER JOIN (
  users_product_version AS u2 INNER JOIN products AS p 
  ON u2.product_id = p.product_id AND u2.product_id IN ('book', 'food')
) ON u1.user_id = u2.user_id 
WHERE u1.product_id = 'X'
  AND u2.user_id IS NULL;