Luke Schoen Luke Schoen - 1 month ago 11
SQL Question

Unable to calculate difference between CTE subquery outputs for use in larger PostgreSQL query output column

Using PostgreSQL v9.4.5 from Shell I created a database called moments in

psql
by running
create database moments
. I then created a moments table:

CREATE TABLE moments
(
id SERIAL4 PRIMARY KEY,
moment_type BIGINT NOT NULL,
flag BIGINT NOT NULL,
time TIMESTAMP NOT NULL,
UNIQUE(moment_type, time)
);
INSERT INTO moments (moment_type, flag, time) VALUES (1, 7, '2016-10-29 12:00:00');
INSERT INTO moments (moment_type, flag, time) VALUES (1, -30, '2016-10-29 13:00:00');
INSERT INTO moments (moment_type, flag, time) VALUES (3, 5, '2016-10-29 14:00:00');
INSERT INTO moments (moment_type, flag, time) VALUES (2, 9, '2016-10-29 18:00:00');
INSERT INTO moments (moment_type, flag, time) VALUES (2, -20, '2016-10-29 17:00:00');
INSERT INTO moments (moment_type, flag, time) VALUES (3, 10, '2016-10-29 16:00:00');


I run
select * from moments
to view the table:

Moments Table

id | moment_type | flag | time
----+-------------+------+---------------------
1 | 1 | 7 | 2016-10-29 12:00:00
2 | 1 | -30 | 2016-10-29 13:00:00
3 | 3 | 5 | 2016-10-29 14:00:00
4 | 2 | 9 | 2016-10-29 18:00:00
5 | 2 | -20 | 2016-10-29 17:00:00
6 | 3 | 10 | 2016-10-29 16:00:00


I then try to write an SQL query that produces the following output, whereby for each pair of duplicate moment_type values it returns the difference between the flag value of the moment_type having the most recent timestamp value, and the flag value of the second most recent timestamp value, and lists the results in ascending order by moment_type.

Expected SQL Query Output

moment_type | flag |
------------+------+
1 | -37 | (i.e. -30 - 7)
2 | 29 | (i.e. 9 - -20)
3 | 5 | (i.e. 10 - 5)


The SQL query that I came up with is as follows, which uses the WITH query to write multiple Common Table Expressions (CET) subqueries for use as temporary tables in the larger SELECT query at the end. I also use an SQL function to calculate the difference between two of the subquery outputs (alternatively I think I could have just used DIFFERENCE
DIFFERENCE(most_recent_flag, second_most_recent_flag) AS flag
instead of the function):

CREATE FUNCTION difference(most_recent_flag, second_most_recent_flag) RETURNS numeric AS $$
SELECT $1 - $2;
$$ LANGUAGE SQL;

-- get two flags that have the most recent timestamps
WITH two_most_recent_flags AS (
SELECT moments.flag
FROM moments
ORDER BY moments.time DESC
LIMIT 2
),
-- get one flag that has the most recent timestamp
most_recent_flag AS (
SELECT *
FROM two_most_recent_flags
ORDER BY flag DESC
LIMIT 1
),
-- get one flag that has the second most recent timestamp
second_most_recent_flag AS (
SELECT *
FROM two_most_recent_flags
ORDER BY flag ASC
LIMIT 1
)
SELECT DISTINCT ON (moments.moment_type)
moments.moment_type,
difference(most_recent_flag, second_most_recent_flag) AS flag
FROM moments
ORDER BY moment_type ASC
LIMIT 2;


But when I run the above SQL query in PostgreSQL, it returns the following error:

ERROR: column "most_recent_flag" does not exist
LINE 21: difference(most_recent_flag, second_most_recent_flag) AS fla...


Question

What techniques can I use and how may I apply them to overcome this error, and calculate and display the differences in the
flag
column to achieve the Expected SQL Query Output?

Note: Perhaps the Window Function may be used somehow as it performs calculations across table rows

Answer

Use the lag() window function:

select moment_type, difference
from (
    select *, flag- lag(flag) over w difference
    from moments
    window w as (partition by moment_type order by time)
    ) s
where difference is not null
order by moment_type

 moment_type | difference 
-------------+------------
           1 |        -37
           2 |         29
           3 |          5
(3 rows)