Dinesh B Dinesh B - 4 months ago 9
SQL Question

Need to get the minimum start date and maximum end date, when there is no break in months

i have 8 rows as shown below,

Column1 Start_date end_date Row_number
1 2014-02-01 2014-02-28 1
1 2014-03-01 2014-03-31 2
1 2014-04-01 2014-04-30 3
1 2014-05-01 2014-05-31 4
1 2014-07-01 2014-07-31 5
1 2015-02-01 2015-02-28 6
1 2015-03-01 2015-03-31 7


I need result like below,

Column1 Start_date end_date
1 2014-02-01 2014-05-31
1 2014-07-01 2014-07-31
1 2015-02-01 2015-03-31


so when the end_date of first row is one day less than the start_date in next row, I need to group all the continuous rows like that and get the result as I shown. I need to do this only via SQL. please let me know, if anyone have any idea to solve this.

In the input record, you can see, first 4 rows are continuous, and 5th row is not continuous and 6th and 7th row is a continuous one.

Thanks in advance.

Answer

The trick here is that you need to first filter out only entries that are the ends of an interval, and then merge them together, rather than trying to keep a running count in one go.

So I don't know what flavour of SQL you're running, and I have no idea what you're trying to signify with Column1, but this should do the trick (written in SQL server flavour, but the only functions you need to adjust are the dateadd and the isnull). The fiddle is here

SELECT DISTINCT 
CASE WHEN Q1.IsStart = 1 
     THEN Q1.start_date
     ELSE LAG(start_date) OVER(ORDER BY Q1.Row_number) END AS start_date,
CASE WHEN Q1.IsEnding = 1 
     THEN Q1.end_date
     ELSE LEAD(end_date) OVER(ORDER BY Q1.Row_number) END AS end_date
FROM
 (SELECT 
  start_date, 
  end_date, 
  Row_number,
  CASE WHEN DATEADD(day,1,end_date) = 
            ISNULL(LEAD(start_date) OVER(ORDER BY Row_number),
                   end_date)
       THEN 0 
       ELSE 1 END AS IsEnding,
  CASE WHEN DATEADD(day,-1,start_date) = 
            ISNULL(LAG(end_date) OVER(ORDER BY Row_number),
                   start_date)
       THEN 0
       ELSE 1 END AS IsStart
  FROM table1) Q1
WHERE Q1.IsEnding = 1 OR Q1.IsStart = 1

For ANSI SQL/For those of you without LAG or LEAD:

SELECT 
  StartDates.start_date, 
  MIN(EndDates.end_date)
FROM
  (SELECT 
    MainEntry.start_date,
    MainEntry.row_number
   FROM
    mytable MainEntry 
    LEFT OUTER JOIN mytable PrevEntry ON PrevEntry.row_number - 1 = MainEntry.row_number
   WHERE
    PrevEntry.end_date IS NULL OR
    EXTRACT(day FROM (MainEntry.start_date - PrevEntry.end_date)) > 1) StartDates
  INNER JOIN
  (SELECT 
    MainEntry.end_date,
    MainEntry.row_number
   FROM
    mytable MainEntry 
    LEFT OUTER JOIN mytable NextEntry ON NextEntry.row_number + 1 = MainEntry.row_number
   WHERE
    NextEntry.start_date IS NULL OR
    EXTRACT(day FROM (NextEntry.start_date - MainEntry.end_date)) > 1) EndDates
  ON StartDates.row_number <= EndDates.row_number
GROUP BY 
  StartDates.start_date

Note that the GROUP BY could contain StartDates.row_number if that takes advantage of an index. Also note that this ANSI solution initially missed the edge cases of rows without any pairs (had INNER JOINs inside the subqueries).

Comments