enaJ enaJ - 4 years ago 185
SQL Question

Find the movies with the highest average rating using SQL max()

This is a question from Stanford online database course exercise.
Find the movie(s) with the highest average rating. Return these movie title(s) and their average rating. Use SQLite.

I've seen solutions kindly suggested by others, e.g,


  1. fetch the row with max values.

  2. get top entries.



But what I hope to understand here is where and why my current solution present here went wrong.




The movie rating table:

rID mID stars ratingDate
201 101 2 2011-01-22
201 101 4 2011-01-27
203 103 2 2011-01-20
203 108 4 2011-01-12
203 108 2 2011-01-30
204 101 3 2011-01-09
205 103 3 2011-01-27
205 104 2 2011-01-22
...


Note: mID represents movie ID, rID represents reviewer ID, stars represent the scores ranked by reviewers.

My first thought is to get the average scores for each movie, using code as following:

Select mID, avg(stars) AS avg_stars
From Rating
Group by mID


The resulting summary table is

mID avg_stars
101 3.0
103 2.5
104 2.5
106 4.5
107 4.0
108 3.3





Then I want to select the max values of the scores column and the assciated mIDs

Select mID, max(avg_stars) AS Max_score
From (
Select mID, avg(stars) AS avg_stars
From Rating
Group by mID) T


I expected to get:

mID Max_score
106 4.5


But instead, I get:

mID Max_score
108 4.5

Answer Source

You seem to use MySQL as a DBMS, which allows non-Standard syntax:

When you return mID without adding it to GROUP BY MySQL returns a single row with the maximum(average) and a random value for mID.

This is a generic Standard SQL solution:

Select mID, avg(stars) AS avg_stars
From Rating
Group by mID
having avg(stars) =
 ( select max(avg_stars) -- find the max of the averages
   from 
     ( Select mID, avg(stars) AS avg_stars
       From Rating
       Group by mID
     ) T
 )

This might be very inefficient, that's why there are several proprietary syntax extensions. Most DBMSes (but not MySQL) support Standard SQL's Windowed Aggregate Functions:

select *
from
 ( 
   Select mID, avg(stars) AS avg_stars,
      max(avg(stars)) as max_avg
   From Rating
   Group by mID
 ) T
where avg_stars = max_avg
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download