chipoglesby chipoglesby - 2 months ago 6
SQL Question

How can I make this sub query more efficient?

I have been learning more about making my sql queries more efficient, but I don't know if I'm doing this query properly.

I'm trying to find a set in one group, but not the other group. This is the original query which works:

SELECT
cNumber
FROM
[dataset.table]
WHERE
DATE(date) >= '2016-01-01'
AND cNumber NOT IN (
SELECT
cNumber
FROM
[dataset.table]
WHERE
date BETWEEN DATEDIFF(CURRENT_DATE(),-5, "QUARTER")
AND DATEDIFF(CURRENT_DATE(), -1, "QUARTER"))
GROUP BY
1


When I change my query to the one below, I get a larger number of results:

SELECT
cNumber
FROM
dataset.table
WHERE
DATE(date) >= '2016-01-01'
AND date NOT BETWEEN DATEDIFF(CURRENT_DATE(), -5, "QUARTER")
AND DATEDIFF(CURRENT_DATE(), -1, "QUARTER")
GROUP BY
1


For this type of query, do I need to include a
NOT IN
subquery or should I could I just use the second query?

I think I am confused in the second query by trying to include the
AND orderCreationDate NOT BETWEEN
. I thought that since I'm doing a subquery on the same table, that I would be able to wrap it up into one query.

Answer

for BigQuery (Legacy SQL) try below

SELECT cNumber
FROM [dataset.table]
GROUP BY cNumber
HAVING MAX(DATE(date)) >= '2016-01-01' 
AND SUM(CASE 
    WHEN DATE(date) BETWEEN DATE(DATE_ADD(CURRENT_DATE(), -5, "QUARTER")) 
    AND  DATE(DATE_ADD(CURRENT_DATE(), -1, "QUARTER"))
    THEN 1 ELSE 0
  END) = 0
Comments