jmelvin jmelvin - 6 months ago 7
SQL Question

Consolidate 2 queries with similar where clauses but different aggregate function targets

I have 2 queries that work fine separately. Given they are similar, I'd like to consolidate them into one performant query. Seems straightforward as the where clauses are similar. But the sum, count, and min functions all apply to different rows and get in the way.

Context:


  • Users can score (or rate) a location and get points

  • User A can refer User B and get referral points when User B first submits a score

  • Points expire after a certain date

  • Goal is to build a leaderboard of users and their total points for scoring and referring for a particular location (area/country)

  • Positional parameters are filled in with hard values for 'Massachusetts', 'United States', and the scoreDateTime expiration date and are unfortunately duplicated in both select subqueries.



Question:

How can the query below be reorganized to combine constraints? There must be a way to start with a list of scores from a specific location after a certain date. The only complication is to get User B's first score date and only offer referral points to User A if it is after the expiration date.

select scoring.userId, scoring.points + referring.points as leaderPoints
from (
select userId, sum(ratingPoints) as points
from scores s, locations l
where s.locationId = l.locationId and
l.locationArea = 'Massachusetts' and
l.locationCountry = 'United States' and
s.scoreDateTime > '2016-04-16 18:50:53.154' and
s.userId != 0
group by s.userId
) as scoring

join (
select u1.userId, count(*) * 20 as points
from users u0
join users u1 on u0.userId = u1.userId
join users u2 on u2.referredByEmail = u1.emailAddress
join scores s on u2.userId = s.userId
join locations l on s.locationId = l.locationId
where l.locationArea = 'Massachusetts' and
l.locationCountry = 'United States' and
scoreDateTime = (
select min(scoreDateTime)
from scores
where userId = u2.userId
) and
scoreDateTime >= '2016-04-16 18:50:53.154'
group by u1.userId
) as referring on scoring.userId = referring.userId
order by leaderPoints desc
limit 10;

Answer

Thanks Stan Shaw but I was unable to get your query to work on MySQL to test the results. However, I did notice a special case that was not covered by my original query. A user can get refer points from areas in which they themselves have not submitted scores. As long as the new user scores in that area, they get refer points there.

Here is the final query I'm using. I was not able to consolidate the duplicate where clauses in a way that appeared performant.

select userId, sum(points) as leaderPoints
from (

    select   s.userId, sum(s.ratingPoints) as points
    from     scores s, locations l
    where    s.locationId = l.locationId and  
             l.locationArea = 'Georgia' and  
             l.locationCountry = 'United States' and  
             s.scoreDateTime >= '2016-04-05 03:00:00.000' and  
             s.userId != 1
    group by userId

    union

    select   u1.userId, 20 as points
    from     users u0, users u1, users u2, scores s, locations l
    where    u0.userId = u1.userId and
             u2.referredByEmail = u1.emailAddress and
             u2.userId = s.userId and
             s.locationId = l.locationId and
             l.locationArea = 'Georgia' and  
             l.locationCountry = 'United States' and  
             scoreDateTime >= '2016-04-05 03:00:00.000' and
             scoreDateTime = (
                 select min(scoreDateTime) 
                 from   scores  
                 where  userId = u2.userId  
             )

) as pointsEarned
group by userId
order by leaderPoints desc
limit 10
order by leaderPoints desc  
limit 100;
Comments