mountainclimber mountainclimber - 6 months ago 16
SQL Question

Add percentile (or quartile) calculation to existing SQL query

I have the following query:

SELECT "Cc EV PS" AS factor,
"GICS Sector/" & t1.[GICS Sector],
#8/14/2015# AS calcdate,
(Sum((t1.[Cc EV PS]-z.TheAvg)^4)/Count(t1.[Cc EV PS]))/(Sum((t1.[Cc EV PS]-z.TheAvg)^2)/Count(t1.[Cc EV PS]))^2 AS Kurtosis
FROM tbl_DatedModel_2015_0929_0 AS t1
INNER JOIN
(SELECT t2.[GICS Sector], Avg(t2.[Cc EV PS]) AS TheAvg
FROM tbl_DatedModel_2015_0929_0 AS t2
GROUP BY t2.[GICS Sector]) AS z ON t1.[GICS Sector] = z.[GICS Sector]
GROUP BY t1.[GICS Sector]
HAVING Count(t1.[Cc EV PS]) > 0;


That I am attempting to add 25Percentile to the SQL above. Here is the SQL for 25Prercentile:

SELECT TOP 1 0.75*( SELECT Max(tp2.[Cc EV PS])
FROM tbl_DatedModel_2015_0929_0 AS tp2
WHERE tp2.[Cc EV PS] IN
(SELECT TOP 25 PERCENT tp3.[Cc EV PS]
FROM tbl_DatedModel_2015_0929_0 AS tp3
WHERE tp3.[Cc EV PS] Is Not Null
ORDER BY tp3.[Cc EV PS])) + 0.25*
(SELECT Min(tp4.[Cc EV PS])
FROM tbl_DatedModel_2015_0929_0 AS tp4
WHERE tp4.[Cc EV PS] IN
(SELECT TOP 75 PERCENT tp5.[Cc EV PS]
FROM tbl_DatedModel_2015_0929_0 AS tp5
WHERE tp5.[Cc EV PS] Is Not Null
ORDER BY tp5.[Cc EV PS] DESC)) AS 25Percentile
FROM tbl_DatedModel_2015_0929_0 AS tp1;


In the end the 25Percentile should produce a percentile calculation for subset created in the first query similar to the way Kurtosis is calculated in the first query (i.e. on a subset of the data). The SQL I provided above for 25Percentile just calculates on all of the data.

This has to work in MS Access 2013. I have a VBA answer that is super slow, and thus would prefer pure SQL answers.

The first query works exactly how I want it to to work, except I need to include a 25Percentile calc.

EDIT:
In case you are confused, the first query is used in this way, but I didn't include all of that because I thought it would just distract:

SELECT "Cc EV PS" AS factor,
"GICS Sector/" & t1.[GICS Sector],
#8/14/2015# AS calcdate,
(Sum((t1.[Cc EV PS]-z.TheAvg)^4)/Count(t1.[Cc EV PS]))/(Sum((t1.[Cc EV PS]-z.TheAvg)^2)/Count(t1.[Cc EV PS]))^2 AS Kurtosis
FROM tbl_DatedModel_2015_0929_0 AS t1
INNER JOIN
(SELECT t2.[GICS Sector], Avg(t2.[Cc EV PS]) AS TheAvg
FROM tbl_DatedModel_2015_0929_0 AS t2
GROUP BY t2.[GICS Sector]) AS z ON t1.[GICS Sector] = z.[GICS Sector]
GROUP BY t1.[GICS Sector]
HAVING Count(t1.[Cc EV PS]) > 0
UNION ALL
SELECT "USD Market Cap" AS factor,
"GICS Sector/" & t1.[GICS Sector],
#8/14/2015# AS calcdate,
(Sum((t1.[USD Market Cap]-z.TheAvg)^4)/Count(t1.[USD Market Cap]))/(Sum((t1.[USD Market Cap]-z.TheAvg)^2)/Count(t1.[USD Market Cap]))^2 AS Kurtosis
FROM tbl_DatedModel_2015_0929_0 AS t1
INNER JOIN
(SELECT t2.[GICS Sector], Avg(t2.[USD Market Cap]) AS TheAvg
FROM tbl_DatedModel_2015_0929_0 AS t2
GROUP BY t2.[GICS Sector]) AS z ON t1.[GICS Sector] = z.[GICS Sector]
GROUP BY t1.[GICS Sector]
HAVING Count(t1.[USD Market Cap]) > 0
UNION ALL SELECT "IU Mkt Cap" AS factor,
"GICS Sector/" & t1.[GICS Sector],
#8/14/2015# AS calcdate,
(Sum((t1.[IU Mkt Cap]-z.TheAvg)^4)/Count(t1.[IU Mkt Cap]))/(Sum((t1.[IU Mkt Cap]-z.TheAvg)^2)/Count(t1.[IU Mkt Cap]))^2 AS Kurtosis
FROM tbl_DatedModel_2015_0929_0 AS t1
INNER JOIN
(SELECT t2.[GICS Sector], Avg(t2.[IU Mkt Cap]) AS TheAvg
FROM tbl_DatedModel_2015_0929_0 AS t2
GROUP BY t2.[GICS Sector]) AS z ON t1.[GICS Sector] = z.[GICS Sector]
GROUP BY t1.[GICS Sector]
HAVING Count(t1.[IU Mkt Cap]) > 0;


EDIT: This includes the 25Percentile (1st quartile), but it isn't aggregating on the GICS Sector like Kurtosis is, which produces the same value for all GICS Sectors. It should produce one answer for reach subset (GICS Sector):

SELECT "Cc EV PS" AS factor,
"GICS Sector/" & t1.[GICS Sector],
#8/14/2015# AS calcdate,
(Sum((t1.[Cc EV PS]-z.TheAvg)^4)/Count(t1.[Cc EV PS]))/(Sum((t1.[Cc EV PS]-z.TheAvg)^2)/Count(t1.[Cc EV PS]))^2 AS Kurtosis,
0.75*(SELECT Max(tp2.[Cc EV PS])
FROM tbl_DatedModel_2015_0929_0 AS tp2
WHERE tp2.[Cc EV PS] IN
(SELECT TOP 25 PERCENT tp3.[Cc EV PS]
FROM tbl_DatedModel_2015_0929_0 AS tp3
WHERE tp3.[Cc EV PS] Is Not Null
ORDER BY tp3.[Cc EV PS])) + 0.25*
(SELECT Min(tp4.[Cc EV PS])
FROM tbl_DatedModel_2015_0929_0 AS tp4
WHERE tp4.[Cc EV PS] IN
(SELECT TOP 75 PERCENT tp5.[Cc EV PS]
FROM tbl_DatedModel_2015_0929_0 AS tp5
WHERE tp5.[Cc EV PS] Is Not Null
ORDER BY tp5.[Cc EV PS] DESC)) AS 25Percentile
FROM tbl_DatedModel_2015_0929_0 AS t1
INNER JOIN
(SELECT t2.[GICS Sector], Avg(t2.[Cc EV PS]) AS TheAvg
FROM tbl_DatedModel_2015_0929_0 AS t2
GROUP BY t2.[GICS Sector]) AS z ON t1.[GICS Sector] = z.[GICS Sector]
GROUP BY t1.[GICS Sector]
HAVING Count(t1.[Cc EV PS]) > 0


EDIT (fastest!): This will get the quartile/percentile and Kurtosis twice as fast as (I figured it after I received the correct answer for my previous method):

SELECT "Cc EV PS" AS factor, "GICS Sector/" & t1.[GICS Sector] AS Expr1,
#8/14/2015# AS calcdate,
(Sum((t1.[Cc EV PS]-z.TheAvg)^4)/Count(t1.[Cc EV PS]))/(Sum((t1.[Cc EV PS]-z.TheAvg)^2)/Count(t1.[Cc EV PS]))^2 AS Kurtosis,
(select max([Cc EV PS]) from tbl_DatedModel_2015_0929_0 where tbl_DatedModel_2015_0929_0.[Cc EV PS] in
(select top 25 percent [Cc EV PS] from tbl_DatedModel_2015_0929_0
where t1.[GICS Sector] = tbl_DatedModel_2015_0929_0.[GICS Sector] order by [Cc EV PS])) AS 25Percentile
FROM tbl_DatedModel_2015_0929_0 AS t1 INNER JOIN (SELECT t2.[GICS Sector], Avg(t2.[Cc EV PS]) AS TheAvg
FROM tbl_DatedModel_2015_0929_0 AS t2
GROUP BY t2.[GICS Sector]) AS z ON t1.[GICS Sector] = z.[GICS Sector]
GROUP BY t1.[GICS Sector]
HAVING (((Count(t1.[Cc EV PS]))>0));

Answer

I am not sure about MS Access, but those replacements should do the trick:

WHERE tp5.[Cc EV PS] Is Not Null ->
WHERE tp5.[Cc EV PS] Is Not Null and tp5.[GICS Sector] = t1.[GICS Sector]


WHERE tp3.[Cc EV PS] Is Not Null ->
WHERE tp3.[Cc EV PS] Is Not Null and tp3.[GICS Sector] = t1.[GICS Sector]
Comments