Cloud Cloud - 5 months ago 7
SQL Question

Aggregating continuous rows within a SQL table

I am trying to run an aggregate function on the following SQL table to sum up all the "LengthOfRecord" grouped by "Long+Lat" and only rows that are contiguous (i.e. "RowNumber" that is in running sequence).

+-----------+-----------+---------------+----------------+
| RowNumber | Vessel ID | Long+Lat | LengthOfRecord |
+-----------+-----------+---------------+----------------+
| 102313179 | Vessel 01 | 123.751 1.196 | 181 |
| 102313180 | Vessel 01 | 123.751 1.196 | 179 |
| 102313181 | Vessel 01 | 123.751 1.196 | 361 |
| 102313182 | Vessel 01 | 123.751 1.196 | 359 |
| 102313183 | Vessel 01 | 123.751 1.196 | 180 |
| 102313184 | Vessel 01 | 123.751 1.196 | 181 |
| 102313185 | Vessel 01 | 123.751 1.196 | 179 |
| 102313186 | Vessel 01 | 123.751 1.196 | 180 |
| 102313187 | Vessel 01 | 123.751 1.196 | 360 |
| 102313188 | Vessel 01 | 123.751 1.196 | 360 |
| 102313189 | Vessel 01 | 123.751 1.196 | 180 |
| 102313191 | Vessel 01 | 123.751 1.196 | 181 |
| 102313298 | Vessel 01 | 123.750 1.197 | 180 |
| 102313375 | Vessel 01 | 123.742 1.196 | 179 |
| 102313376 | Vessel 01 | 123.742 1.196 | 359 |
| 102313377 | Vessel 01 | 123.742 1.196 | 180 |
| 102313379 | Vessel 01 | 123.742 1.196 | 181 |
| 102313380 | Vessel 01 | 123.742 1.196 | 178 |
+-----------+-----------+---------------+----------------+


The following is the result that I am trying to achieve through SQL statements. Is there anyway that I can do this through an SQL query?

+-----------+---------------+----------------+
| Vessel ID | Long+Lat | LengthOfRecord |
+-----------+---------------+----------------+
| Vessel 01 | 123.751 1.196 | 2881 |
| Vessel 01 | 123.750 1.197 | 180 |
| Vessel 01 | 123.742 1.196 | 1077 |
+-----------+---------------+----------------+

Answer

You can do this using a difference in row numbers approach:

select vesselId, latLong, sum(lengthOfRecord)
from (select t.*,
             row_number() over (partition by vesselId order by rowNumber) as seqnum,
             row_number() over (partition by vesselId, latlong order by rowNumber) as seqnum_latlong
      from table t
     ) t
group by (seqnum  - seqnum_latlong), latLong, vesselId;

The difference of row number approach is a bit tricky to explain. It identifies adjacent rows with the same values. If you run the subquery, you will see how the calculation works.