I have a table with records for user activity that cover a span indicated by a start and end time. What I'd like to get is a count of users active in the system per unit of time over the course of the previous day.
The maximum session length is an hour and they don't cross hour boundaries.
I've looked at many questions that are very close to this, but none I've found deals with this specific situation.
Here's a stripped down version of the query:
with minutes AS (
-- ignore this...it generates a day's worth of timestamps for each minute
-- it's hairy but is what I'm stuck with on redshift
select (dateadd(minute, -row_number() over (order by true), sysdate::date)) as minute
from seed_table limit 1440
sessions as (
select sid, ts_start, ts_end
from user_sessions s
where ts_end >= sysdate::date-'1 day'::interval
and ts_start < sysdate::date
select m.minute, count(distinct(s.sid))
from minutes m
left join sessions s on s.ts_end >= m.minute and s.ts_start < m.minute+'1 min'::interval
group by 1
-> XN Nested Loop Left Join DS_BCAST_INNER (cost=6913826151.95..4727012848741.55 rows=410434560 width=166)
Join Filter: (("inner".ts_start < ("outer"."minute" + '00:01:00'::interval)) AND ("inner".ts_end >= "outer"."minute"))
You can do this much faster by counting the number of starts and stops in each minute, and then taking the cumulative sum. The result is something like this:
select minute, sum(cnt) over (order by minute) from ((select date_trunc('minute', ts_start) as minute, count(*) as cnt from sessions group by 1 ) union all (select date_trunc('minute', ts_end), - count(*) from sessions group by 1 ) ) s group by minute order by minute;