dedpo dedpo - 7 months ago 6
SQL Question

Group by in HIVE not working like i want

Hi so i am trying to output the city with playerid with the most AB(runs).

Output the birth city of the player who had the most at bats (AB) in

his career.

Now i get what i want Cinncinati, sander01, 14432, this is correct. But it shows up in 3's like this. That too for every city and player and runs, like the 2nd most. I only need 1 entry, the other 2 are redundant. I think there something i did wrong with group by, any help? plz

Cinncinati, sander01, 14432
Cinncinati, sander01, 14432
Cinncinati, sander01, 14432
Chicago, dere90, 12324
Chicago, dere90, 12324
Chicago, dere90, 12324



SELECT a.bcity,a.id, b.ab FROM master a
JOIN
(SELECT id, SUM(ab) as ab FROM batting
GROUP by id) b
ON a.id = b.id
ORDER by b.ab DESC
limit 30;

Answer

Refer to DISTINCT for getting distinct result set.Now coming to your question,join master table with the top row from the result set b.

select a.bcity,b.id,b.ab from master a 
join 
(select id,sum(ab) as ab from batting
 group by id
 order by ab desc
 limit 1
) b
on a.id = b.id    

You can change the LIMIT 30 to LIMIT 1 and get the same result.

SELECT a.bcity,a.id, b.ab FROM master a 
JOIN
(SELECT id, SUM(ab) as ab FROM batting
 GROUP by id
) b
ON a.id = b.id
ORDER by b.ab DESC
 limit 1;

Note: if there are multiple players with the same most runs then LIMIT 1 will not give the correct answer.

Comments