Kyle Morgan Kyle Morgan - 4 months ago 6
SQL Question

Complex Grouping in SQL Query

I have a fairly complicated SQL query that I need to build. Forgive me, I'm not a magical wizard with SQL.

Here's my two tables (significantly simplified):

tableA

id request_id page_views step
-----------------------------------
1 1 0 0
2 1 0 1
3 1 0 2
4 1 0 3
5 2 0 0
6 2 0 1
7 2 1 2
8 3 0 0
9 3 0 1
10 4 0 0
11 4 0 1
12 4 0 2


tableB

id name phone
------------------------------
1 John Deere 111-222-3333
2 Sally Sue 333-222-1111
3 Jacob Clark 434-343-4343
4 Alex Smith 222-112-2112


First, there needs to be a join on the tables where
tableA.request_id = tableB.id
to result in:

id request_id page_views step name phone
----------------------------------------------------------------
1 1 0 0 John Deere 111-222-3333
2 1 0 1 John Deere 111-222-3333
3 1 0 2 John Deere 111-222-3333
4 1 0 3 John Deere 111-222-3333
5 2 0 0 Sally Sue 333-222-1111
6 2 0 1 Sally Sue 333-222-1111
7 2 1 2 Sally Sue 333-222-1111
8 3 0 0 Jacob Clark 434-343-4343
9 3 0 1 Jacob Clark 434-343-4343
10 4 0 0 Alex Smith 222-112-2112
11 4 0 1 Alex Smith 222-112-2112
12 4 0 2 Alex Smith 222-112-2112


From that table, I want groups to be returned if they match the conditions below. By group, I mean groups of rows that have the same
request_id
. Here's the conditions:


  1. Of the rows in a group, none have
    page_views
    greater than 0.

  2. Of the rows in a group, none have
    step
    greater than 2.



If either of the two above conditions fail, the entire group won't be returned. So here's what should be returned:

id request_id page_views step name phone
----------------------------------------------------------------
8 3 0 0 Jacob Clark 434-343-434
9 3 0 1 Jacob Clark 434-343-434
10 4 0 0 Alex Smith 222-112-2112
11 4 0 1 Alex Smith 222-112-2112
12 4 0 2 Alex Smith 222-112-2112


The "3" group (or Jacob Clark) didn't have any rows where
page_views
is greater than 0, and none of the rows have
step
greater than 2. Same with the "4" group (or Alex Smith).

So that's the problem. I need a single SQL query that will handle all of that. The first join can be a subquery, no problem:

SELECT sub.*
FROM (
SELECT tableA.*, tableB.name, tableB.phone
FROM `tableA`, `tableB`
WHERE tableA.`request_id` = tableB.id
) sub


After that, I'm not so sure.

Any help would be appreciated.

Answer

You need to calculate the max views and steps (by request_id) separately in a subquery, and then use the request_id's with small enough max values:

SELECT a.*, b.name, b.phone
FROM (
   SELECT request_id
   FROM tableA
   GROUP BY request_id
   HAVING MAX(page_views) <= 0 AND MAX(step) <= 2
) AS sumQ
INNER JOIN tableA AS a ON sumQ.request_id = a.request_id
INNER JOIN tableB AS b ON a.request_id = b.id
;

or, alternatively:

SELECT a.*, b.name, b.phone
FROM tableA AS a
INNER JOIN tableB AS b ON a.request_id = b.id
WHERE a.request_id IN (
     SELECT request_id
     FROM tableA
     GROUP BY request_id
     HAVING MAX(page_views) <= 0 AND MAX(step) <= 2
  )
;

In my experience, the first version is usually faster.

Comments