Colin O'Brien Colin O'Brien - 5 months ago 6
SQL Question

T-SQL: How to select rows based on the max date?

I have a table which has two fields which I want to make a decision on:

- DATE
- DATE_LOADED


I want to know if there is a DATE (which may have many associated lines), say '
2016-06-15
', which has two unique DATE_LOADED entries,
'2016-06-16'
and
'2016-06-17'
, then take the MAX DATE_LOADED lines only.

In real terms, two reports have been reporting overlapping data for the same day. The later report is a correction. All columns may have updated but DATE.

ASK:

For each DATE (not just each individual line), check if there is more than one DATE_LOADED, if there is then take all lines associated with that DATE and where DATE_LOADED = MAX DATE_LOADED for that particular date.

Answer

This can be done using window functions

select *
from (
   select t.*, 
          row_number() over (partition by date order by date_loaded desc) as rn
   from the_table t
) x
where rn = 1;

If there are multiple rows with the same max date, the above would only return one of them. If you want all, you can use something like this:

select *
from (
   select t.*, 
          max(date_loaded) over (partition by date) as max_date_loaded
   from the_table t
) x
where date_loaded = max_date_loaded;

If you want to filter out DATEs with just a single DATE_LOADED, you can enhance the query to:

select *
from (
   select t.*, 
          max(date_loaded) over (partition by date) as max_date_loaded,
          count(*) over (partition by date) as cnt
   from the_table t
) x
where date_loaded = max_date_loaded
  and cnt > 1;
Comments