Tony Tony - 2 months ago 7
SQL Question

Select n first elements such that accumulated sum reaches a given value

Starting from the PostgreSQL query

SELECT filename FROM files_storage ORDER BY date;


I would like to reduce the output table to the first n rows such that the accumulated sum from row 1 to n of a column called size reaches at least max_value.

Example:

date        filename  size
2016-09-01  /a/aaa/   20
2016-09-02  /a/bbb/   70
2016-09-03  /a/ccc/   20
2016-09-04  /a/ddd/   30
2016-09-05  /a/eee/   50


If max_value is 100, I want to return the first three rows because 20 + 70 + 20 >= 100.

I have seen answers here to similar questions, but nothing in PostgreSQL.

Answer

Use the cumulative sum functionality:

SELECT fs.*
FROM (SELECT fs.*, SUM(size) OVER (ORDER BY date) AS running_sum
      FROM files_storage fs
     ) fs
WHERE running_sum >= 100 AND running_sum - size < 100;

Oh, that gets only the one row at which the running sum crosses the boundary.

You want every row up to and including that one, so drop the first condition instead:

SELECT fs.*
FROM (SELECT fs.*, SUM(size) OVER (ORDER BY date) AS running_sum
      FROM files_storage fs
     ) fs
WHERE running_sum - size < 100;
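
As a quick check against the sample data from the question, here is a minimal sketch that inlines the five rows as a CTE named files_storage. The CTE only stands in for the real table and is an assumption for illustration, as is the literal 100 used for max_value. It also selects running_sum so the intermediate values are visible:

```sql
-- Sample rows from the question, inlined so the query runs without the real table.
WITH files_storage(date, filename, size) AS (
    VALUES (DATE '2016-09-01', '/a/aaa/', 20),
           (DATE '2016-09-02', '/a/bbb/', 70),
           (DATE '2016-09-03', '/a/ccc/', 20),
           (DATE '2016-09-04', '/a/ddd/', 30),
           (DATE '2016-09-05', '/a/eee/', 50)
)
SELECT fs.date, fs.filename, fs.size, fs.running_sum
FROM (SELECT fs.*, SUM(size) OVER (ORDER BY date) AS running_sum
      FROM files_storage fs
     ) fs
WHERE running_sum - size < 100   -- 100 plays the role of max_value
ORDER BY fs.date;
-- running_sum per row:  20, 90, 110, 140, 190
-- running_sum - size:    0, 20,  90, 110, 140
-- Only the first three rows satisfy the filter.
```

The condition running_sum - size < 100 keeps a row only while the sum of the rows before it is still below the limit, which is why the row that crosses the boundary is included. The outer ORDER BY is added because the ORDER BY inside the window does not by itself guarantee the order of the result rows.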

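If dates can be duplicated, the default window frame gives every row with the same date the same running sum. To accumulate row by row instead, with ties between same-date rows broken arbitrarily, specify a ROWS frame: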

SELECT fs.*
FROM (SELECT fs.*,
             SUM(size) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_sum
      FROM files_storage fs
     ) fs
WHERE running_sum - size < 100;
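
If an arbitrary choice among same-date rows is not acceptable, one option is to add a tie-breaker to the window ordering. The sketch below assumes that filename is unique within a date, which the question does not state, and uses made-up sample rows with a duplicate date to show the effect:

```sql
-- Hypothetical rows: two files share the date 2016-09-01.
WITH files_storage(date, filename, size) AS (
    VALUES (DATE '2016-09-01', '/a/aaa/', 50),
           (DATE '2016-09-01', '/a/bbb/', 120),
           (DATE '2016-09-02', '/a/ccc/', 20)
)
SELECT fs.date, fs.filename, fs.size, fs.running_sum
FROM (SELECT fs.*,
             -- filename breaks ties, so same-date rows accumulate in a fixed order
             SUM(size) OVER (ORDER BY date, filename
                             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_sum
      FROM files_storage fs
     ) fs
WHERE running_sum - size < 100
ORDER BY fs.date, fs.filename;
-- running_sum per row: 50, 170, 190
-- Returns /a/aaa/ and /a/bbb/.
```

With plain ORDER BY date and the default frame, both 2016-09-01 rows would get a running sum of 170, so /a/aaa/ would be filtered out (170 - 50 = 120) while /a/bbb/ would pass (170 - 120 = 50), and the result would no longer be a prefix of the ordered rows.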