MaxPY MaxPY - 3 months ago 8
SQL Question

Efficiently find "isolated" rows in sql

Say we have a table with following schema:

| ID | DATE | VALUE |
| ------------- |:-------------:| -----:|
| 1 | '2016-08-01' | 1600 |
| 9 | '2016-03-03' | 12 |
| 1 | '2016-08-21' | 1 |
| 4 | '2016-09-01' | 1 |
| .... .... .... |


How to efficiently find all rows for which table doesn't contain records with same ID in N days after row's DATE?

Simply put, the goal is to find "last action" for each user (there were no actions from the user for at least N days after)

Answer

You could use the lead window function. This will perform better than using a correlated subquery:

select *
from   ( select id, date, value,
                lead(date) over (partition by id order by date) next_date
         from   mytable ) as detail
where  date < date_sub(next_date, 30) or  next_date is null

This assumes your date field is a timestamp. If it is a string, then use datediff.

Note that the next_date is null part ensures that you also get the most recent user record in the result set, as obviously it has no date following that is too soon after.