bobmarksie bobmarksie - 1 month ago 11
SQL Question

PostgreSQL Group By Sum

I've been scratching my head on this problem in PostgreSQL. I have a table

test
with 2 columns: -
id
and
content
. e.g.

create table test (id integer,
content varchar(1024));

insert into test (id, content) values
(1, 'Lorem Ipsum is simply dummy text of the printing and typesetting industry.'),
(2, 'Lorem Ipsum has been the industrys standard dummy text '),
(3, 'ever since the 1500s, when an unknown printer took a galley of type and scrambled it to'),
(4, 'make a type specimen book.'),
(5, 'It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.'),
(6, 'It was popularised in the 1960s with the release of Letraset sheets containing Lorem '),
(7, 'Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker'),
(8, ' including versions of Lorem Ipsum.');


If I run the following query ...

select id, length(content) as characters from test order by id


... then I get: -

id | characters
---+-----------
1 | 74
2 | 55
3 | 87
4 | 26
5 | 120
6 | 85
7 | 87
8 | 35


What I want to do is group the
id
into rows where the sum of the content goes over a threshold. For example, if that threshold is
100
then the desired result would look like the following: -

ids | characters
----+-----------
1,2 | 129
3,4 | 113
5 | 120
6,7 | 172
8 | 35


NOTE (1): - The query doesn't need to generate a
characters
column - just the
ids
- they are here to communicate that they are all over
100
- except for the last row which is
35
.


NOTE (2): -
ids
could be a comma-delimited string or a PostgreSQL array - the type is less important than the values


Can I use a window function to do this or do I need something more complex like a
lateral join
?

Answer

Using stored functions allows to avoid (sometime) the head-breaking queries.

create or replace function fn_foo(ids out int[], characters out int) returns setof record language plpgsql as $$
declare
  r record;
  threshold int := 100;
begin
  ids := '{}'; characters := 0;
  for r in select id, length(content) as lng from test order by id loop
    characters := characters + r.lng;
    ids := ids || r.id;
    if characters > threshold then
      return next;
      ids := '{}'; characters := 0;
    end if;
  end loop;
  if ids <> '{}' then
    return next;
  end if;
end $$;

select * from fn_foo();

╔═══════╤════════════╗
║  ids  │ characters ║
╠═══════╪════════════╣
║ {1,2} │        129 ║
║ {3,4} │        113 ║
║ {5}   │        120 ║
║ {6,7} │        172 ║
║ {8}   │         35 ║
╚═══════╧════════════╝
(5 rows)