John John - 2 months ago 14
SQL Question

PERCENTILE_DISC() in PostgreSQL as a window function

We are porting our system from SQL Server to PostgreSQL. In that we calculate Median Daily Turnovers for all companies on all dates for past 3 months. Below is the simplified query for the same

SELECT B.Company, B.Dt, B.Turnover, (Select distinct
PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY Turnover) OVER (PARTITION
BY B.Company, B.Dt) from Example_Tbl AS G where G.Company = B.Company
and G.Dt <= B.Dt and G.Dt > DateAdd(dd, -92, B.Dt)) as
Med_3m_Turnover FROM Example_Tbl B;


The problem is that PostgreSQL doesn't support the use of
percentile_disc()
as a window function. The error message is:


ERROR: OVER is not supported for ordered-set aggregate percentile_disc


Is there any way I can implement the same functionality using something else in PostgreSQL.

edit: Here is example input data in Example_Tbl

Company Dt Turnover
x 1 10
x 2 45
x 3 20
y 1 300
y 2 100
y 3 200


And the ouptut should be as below. Please note, we are ignoring 3 Months right now and just have 3 rows per company

Company Dt Turnover Med_3m_Turnover
x 1 10 10
x 2 45 10 or 45 depending on percentile_desc
x 3 20 20
y 1 300 300
y 2 100 300 or 100 depending on percentile_desc
y 3 50 100

Answer

Your partition by clause (PARTITION BY B.Company, B.Dt) is using values from the outer query (alias B), not the subquery (alias G), which wasn't immediately obvious to me at first. Because the values of B.company and B.Dt are constant for each execution of the subquery, then your partition clause is really no different than simply writing it like this:

... over (partition by 1)

You can test it in SQL Server if you want, but you'll find that the results are the same. Now, I don't know if using B.Company, B.Dt was intentional or not, but in effect, it means that the partition by clause is not actually partitioning anything.

So, as a result, the good news for you is that to write the equivalent query in PostgreSQL, you simply need to omit the OVER (PARTITION BY B.Company, B.Dt) clause entirely, and the behavior will be the same as in SQL Server.