Tigrann Tigrann - 7 months ago 8
SQL Question

How to display rows happening every week of a year?

I'm trying to parse a logging table in PostgreSQL 9.5. Let's imagine I'm logging SMS sent from all the phones belonging to my company. For each record I have a timestamp and the phone ID.

I want to display how many SMS are sent by week but only for the phones that send SMS each week of the year.

My table is as following:

╔════════════╦══════════╗
║ event_date ║ phone_id ║
╠════════════╬══════════╣
║ 2016-01-05 ║ 1 ║
║ 2016-01-06 ║ 2 ║
║ 2016-01-13 ║ 1 ║
║ 2016-01-14 ║ 1 ║
║ 2016-01-14 ║ 3 ║
║ 2016-01-20 ║ 1 ║
║ 2016-01-21 ║ 1 ║
║ 2016-01-22 ║ 2 ║
╚════════════╩══════════╝


And I would like the following display

╔══════════════╦══════════╦══════════════╗
║ week_of_year ║ phone_id ║ count_events ║
╠══════════════╬══════════╬══════════════╣
║ 2016-01-04 ║ 1 ║ 1 ║
║ 2016-01-11 ║ 1 ║ 2 ║
║ 2016-01-18 ║ 1 ║ 2 ║
╚══════════════╩══════════╩══════════════╝


Only phone_id 1 is displayed because this is the only ID with events in each week of the year.

Right now, I can query to group by week_of_year and phone_IDs. I have the following result:

╔══════════════╦══════════╦══════════════╗
║ week_of_year ║ phone_id ║ count_events ║
╠══════════════╬══════════╬══════════════╣
║ 2016-01-04 ║ 1 ║ 1 ║
║ 2016-01-04 ║ 2 ║ 1 ║
║ 2016-01-11 ║ 1 ║ 2 ║
║ 2016-01-11 ║ 3 ║ 1 ║
║ 2016-01-18 ║ 1 ║ 2 ║
║ 2016-01-18 ║ 2 ║ 1 ║
╚══════════════╩══════════╩══════════════╝


How can I filter in order to only keep phone_ids occurring for each week of year? I tried various subqueries but I must acknowledge I'm stuck. :-)

About the definition of
week_of_year
: as I want to consolidate data per week, I'm using in my select:
date_trunc('week', event_date)::date as interval
. And then I group by
interval
to have the number of SMS per
phone_id
per week.

About the date range, I just want this starting in 2016, I'm using a where condition in my query to ignore everything before:
WHERE event_date > '2016-01-01'


I saw the request to create a SQL Fiddle but I have issues to do so, will do it if I'm not lucky enough to have a good hint to solve this.

Created a quick SQL Fiddle, hope it would useful.

Answer

Your concept of year seems very fuzzy. Let me instead assume that you mean for a period of time over the range of your data.

with w as (
      select date_trunc('week', event_date) as wk, phone_id, count(*) as cnt
      from messages
      group by 1, 2
     ),
     ww as (
      select w.*,
             min(wk) over () as min_wk,
             max(wk) over () as max_wk,
             count(*) over (partition by phone_id) as numweeks
      from w
     )
select ww.wk, ww.phone_id, ww.cnt
from ww
where (max_wk - min_wk) / 7 = cnt - 1;

The first CTE just aggregates the data by week and phone id. The second CTE calculates the first and last week in the data (these could be replaced with constants), as well as the number of weeks for a given phone.

Finally, the where clause makes sure that the number of weeks spans the period of time.