Dick McManus Dick McManus - 1 month ago 12
SQL Question

counting by cases over grouped time

I'm trying to count distinct ID's that appear in my data each week by versiontype and I'm not sure how to properly build the query.

I'm hoping to produce a table along the lines of:

1.1 1.2 1.3 1.4
wk1 1 5 4 8
wk2 4 3 9 8
wk3 1 8 0 6


I tried crafting the query below, but it won't run as it requires the Case statement in the group by, which then won't accept a count().

I'm guessing I am going about this wrong, any suggestions would be appreciated. Thank you.

SELECT
Case when version like "1.1%" then Count(distinct ID)
when version like "1.2%" then Count(distinct ID)
when version like "1.3%" then Count(distinct ID)
when version like "1.4%" then Count(distinct ID) end,
CAST(((datediff(timestamp_pst,'2016-01-03') / 7)+1) as INT) as week_of_the_year
FROM db.table
where timestamp_pst >= "2016-01-28"
group by CAST(((datediff(timestamp_pst,'2016-01-03') / 7)+1) as INT)
order by week_of_the_year

Answer
  SELECT
    COUNT(DISTINCT (CASE WHEN version like '1.1%' THEN ID END)) as '1.1'
    ,COUNT(DISTINCT (CASE WHEN version like '1.2%' THEN ID END)) as '1.2'
    ,COUNT(DISTINCT (CASE WHEN version like '1.3%' THEN ID END)) as '1.3'
    ,COUNT(DISTINCT (CASE WHEN version like '1.4%' THEN ID END)) as '1.4'
  CAST(((datediff(timestamp_pst,'2016-01-03') / 7)+1) as INT) as week_of_the_year
  FROM aws_d3.iaanalytics_detail
  where timestamp_pst >=  "2016-01-28"
  group by CAST(((datediff(timestamp_pst,'2016-01-03') / 7)+1) as INT)  
        order by week_of_the_year

You are wanting to use "Conditional Aggregation". Do do so the case statement actually goes inside the aggregate function. Because you are wanting to COUNT(DISTINCT) you will actually need to do that either by utilizing the DISTINCT key word in the aggregation or by making a derived table so only distinct values are present as another answer suggests but as the only word it will save you from repeating is DISTINCT I don't see the need for complicating the matter by using a derived table.

Note that SUM(CASE WHEN blah THEN 1 ELSE 0 END) will NOT work for you as that would sum all occurrences and not count distinct values. Also Aggregate functions ignore null values and when you don't include an ELSE statement the value of a case expression will be NULL if not matched.

Comments