Noah Noah - 5 months ago 7
SQL Question

Can you pass information between two nested Group BY statements in SQL?

I'm trying to find a way to pass data from a nested GROUP BY statement. Below are my two tables that I have created specifically for this example

TABLE: GROCERIES TABLE: MAP

NAME FRUIT WEIGHT NAME LOCATION
----- ----- ------ ----- --------
AMY APPLE 76 Amy Mondale, CA
BOB APPLE 90 Bob Oakmont, CA
AMY APPLE 86 Chuck Phoenix, AZ
CHUCK APPLE 80 Derrick Taos, NM
CHUCK APPLE 86
AMY ORANGE 125
BOB ORANGE 135
CHUCK ORANGE 115
DERRICK ORANGE 136
DERRICK ORANGE 114
BOB ORANGE 150
DERRICK ORANGE 143
BOB BANANA 119
BOB BANANA 118
CHUCK BANANA 123
CHUCK BANANA 110
CHUCK BANANA 130
AMY BANANA 100


My goal was to find the average weight of each fruit that each person has, and then calculate who has the highest average among each fruit. Below is WORKING code that demonstrates this

SELECT FRUIT, MAX(avg_weight) as WEIGHT
FROM (SELECT NAME, FRUIT, ROUND(AVG(WEIGHT)) AS avg_weight
FROM GROCERIES GROUP BY NAME, FRUIT) tmp
GROUP BY FRUIT


 

FRUIT WEIGHT
--------- ----------
Orange 143
Apple 90
Banana 121


But the problem is that I do not know how to "pass" the Name column to the front so that it can be connected to the Map Table, and thus, print out the location of each person who received the highest amount.

What I Want To Happen

FRUIT WEIGHT NAME LOCATION
--------- ---------- ------- -----------
Orange 143 Bob Oakmont, CA
Apple 90 Bob Oakmont, CA
Banana 121 Chuck Phoenix, AZ


If I call Name in the outer SELECT statement, then I would also have to GROUP BY that same column (so they match) and it would skew my results

SELECT NAME, FRUIT, MAX(avg_weight) as WEIGHT
FROM (SELECT NAME, FRUIT, ROUND(AVG(WEIGHT)) AS avg_weight
FROM GROCERIES GROUP BY NAME, FRUIT)
tmp GROUP BY NAME, FRUIT


NAME FRUIT WEIGHT
-------- --------- ----------
Amy Apple 81
Chuck Orange 115
Amy Orange 125
Derrick Orange 131
Bob Banana 119
Chuck Banana 121
Bob Apple 90
Chuck Apple 83
Bob Orange 143
Amy Banana 100

10 rows selected

Answer

This is probably the most straightforward way to accomplish what you want. Be aware that in the event of ties it would return all people/locations in the tie for first place.

WITH AVG_PER_FRUIT_PER_PERSON AS (
    SELECT NAME, FRUIT, ROUND(AVG(WEIGHT)) AS avg_weight
    FROM GROCERIES
    GROUP BY NAME, FRUIT
), MAX_APFPP_PER_FRUIT AS (
    SELECT FRUIT, MAX(avg_weight) AS max_weight
    FROM AVG_PER_FRUIT_PER_PERSON
    GROUP BY FRUIT
)
SELECT a.FRUIT, a.avg_weight AS WEIGHT, a.NAME, MAP.LOCATION
FROM
    MAX_APFPP_PER_FRUIT m INNER JOIN
    AVG_PER_FRUIT_PER_PERSON a
        ON a.avg_weight = m.max_weight AND a.FRUIT = m.FRUIT INNER JOIN
    MAP
        ON MAP.Name = a.Name;

Using analytic functions you might instead prefer:

WITH AVG_PER_FRUIT_PER_PERSON AS (
    SELECT NAME, FRUIT, ROUND(AVG(WEIGHT)) AS avg_weight
    FROM GROCERIES
    GROUP BY NAME, FRUIT
), RANKED_APFPP_PER_FRUIT AS (
    SELECT *, DENSE_RANK() OVER (PARTITION BY FRUIT ORDER BY avg_weight DESC) as DR
    FROM AVG_PER_FRUIT_PER_PERSON
)
SELECT r.FRUIT, r.avg_weight AS WEIGHT, r.NAME, MAP.LOCATION
FROM RANKED_APFPP_PER_FRUIT r INNER JOIN MAP ON MAP.NAME = r.NAME
WHERE DR = 1;