Juan Carlos Oropeza Juan Carlos Oropeza - 4 years ago 68
SQL Question

How I calculate gaps between rows

Now I include some parallelism to my app

ProcessXX
, I'm not sure the data can be process in the right order. So Im working in a query to return the lower and upperbound to pass to
ProcessZZ
.

My table avl_pool has
avl_id
and
has_link
and some other fields and a steady flow of data, when new data arrive they start with
has_link=null
, when
ProcessX
finish with the rows
has_link
have the link value
xxxx
is some number.

Now on the next step I have to process only those rows with links, but I cant skip rows, because order is very important.

In this case I need
ProcessZZ(23561211, 23561219)


rn | avl_id | has_link
1 | 23561211 | xxxx -- start
2 | 23561212 | xxxx
3 | 23561213 | xxxx
4 | 23561214 | xxxx
5 | 23561215 | xxxx
6 | 23561216 | xxxx
7 | 23561217 | xxxx
8 | 23561218 | xxxx
9 | 23561219 | xxxx -- end
10 | 23561220 | null
11 | 23561221 | xxxx
12 | 23561222 | xxxx
13 | 23561223 | xxxx


Currently I have:

-- starting avl_id need to be send to ProcessZZ
SELECT MIN(avl_id) as min_avl_id
FROM avl_db.avl_pool
WHERE NOT has_link IS NULL

-- first avl_id still on hands of ProcessXX ( but can be null )
SELECT MIN(avl_id) as max_avl_id -- here need add a LAG
FROM avl_db.avl_pool
WHERE has_link IS NULL
AND avl_id > (SELECT MIN(avl_id)
FROM avl_db.avl_pool
WHERE NOT has_link IS NULL)

-- In case everyone has_link already the upper limit is the last one on the table.
SELECT MAX(avl_id) as max_avl_id
FROM avl_db.avl_pool


I can put everthing in muliple CTE and return both result, but I think this can be handle like some island, but not sure how.

So the query should looks like

SELECT min_avl_id, min_avl_id
FROM cte


min_avl_id | min_avl_id
23561211 | 23561219

Answer Source

If I understand correctly, you want to assign a sequential number to each block. This number is demarcated by the NULL values in has_link.

If this is the problem, then a cumulative sum solves the problem:

select p.*,
       sum(case when has_link is null then 1 else 0 end) over (order by rn) as grp
from avl_db.avl_pool p;

This actually includes the NULL values in the output. The simplest method is probably then a subquery:

select p.*
from (select p.*,
             sum(case when has_link is null then 1 else 0 end) over (order by rn) as grp
      from avl_db.avl_pool p
     ) p
where has_link is not null;
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download