hitbid hitbid - 6 months ago 16
SQL Question

Trying to add the overall Average into this SQL query.

I have two valid queries, and still trying to understand subqueries better, but hitting a wall right now.

The first code works fine, and does what I want. I only get the plays that are 1 -4 down plays, and I can get the average yards to go with AVG(togo).

Select OffenseTeam, Avg(Togo) as YdsToGo,Down as Downtogo, Down = Case
when Down =1 Then Count(Down)
when Down =2 Then Count(Down)
when Down =3 Then Count(Down)
when Down =4 Then Count(Down) End
From temp_NFL2015
WHere ToGo <> 0
Group By OffenseTeam, Down
Order By OffenseTeam ASC, Downtogo ASC


Next query gives me the OVERALL average of the entire league, four clean lines are the result of this query.

Select Down, Avg(Togo) as YTG
From Temp_NFL2015
Where Togo<>0
Group By Down
Order By Down ASC


Now when I try to combine them, I am failing. I'm following along with book examples, and this seems to be the exact format used, but it fails

Select OffenseTeam, Avg(Togo) as YTG,Down as Downtogo, Down = Case
when Down =1 Then Count(Down)
when Down =2 Then Count(Down)
when Down =3 Then Count(Down)
when Down =4 Then Count(Down) End
(Select Down, Avg(Togo) as YTG
From Temp_NFL2015
Where Togo<>0
Group By Down
Order By Down ASC) as LGAvg
From temp_NFL2015
WHere ToGo <> 0
Group By OffenseTeam, Down
Order By OffenseTeam ASC, Downtogo ASC

Answer

Right now you're selecting two columns in your subquery and trying to put them into one column (Down, Avg(Togo) -> LGAvg)

While I would like to suggest a JOIN, you have a couple options here that will still let you use a subquery since you're learning those.

One, you can use a temp table and join:

 Select   Down, Avg(Togo) as YTG
 Into     #Temp
 From     Temp_NFL2015
 Where    Togo<>0
 Group By Down

Then the main query:

Select OffenseTeam, Avg(Togo) as YTG,Down as Downtogo, Down = Case 
when Down =1 Then Count(Down)
when Down =2 Then Count(Down)
when Down =3 Then Count(Down)
when Down =4 Then Count(Down) End,
      (Select YTG
      From #TEMP t
      Where t.down = nfl.down) as LGAvg
 From temp_NFL2015 nfl
 Where ToGo <> 0
 Group By OffenseTeam, Down
 Order By OffenseTeam ASC, Downtogo ASC

Two, you can use a nested select in your subquery:

(SELECT YTG 
 FROM (Select Down, Avg(Togo) as YTG
       From Temp_NFL2015
       Where Togo<>0
       Group By Down) avg 
 WHERE avg.Down = nfl.Down) as LGAvg

Whole query:

 Select OffenseTeam, Avg(Togo) as YTG,Down as Downtogo, Down = Case 
    when Down =1 Then Count(Down)
    when Down =2 Then Count(Down)
    when Down =3 Then Count(Down)
    when Down =4 Then Count(Down) End,
   (SELECT YTG 
    FROM (Select Down, Avg(Togo) as YTG
          From Temp_NFL2015
          Where Togo<>0
          Group By Down) avg 
    WHERE avg.Down = nfl.Down) as LGAvg
 From temp_NFL2015 nfl
 Where ToGo <> 0
 Group By OffenseTeam, Down
 Order By OffenseTeam ASC, Downtogo ASC

And then there is option 3, which is to put the nested select logic from option 2 into a JOIN, and simply select LGAvg in your main query. Looks like Tudor has you covered there.

EDIT: Additional explanation

Let's take the output of your first two queries (simplified) and analyze them.

Your first query outputs something like this:

Team  |  Down  |  Count
  A        1        18
  A        2        15
  A        3        13
  A        4        11
  B        1        19
  B        2        16
  B        3        13
  B        4        10

Your second outputs this:

Down  |  LGAvg
  1       18
  2       13
  3       11
  4        9

You want to add the LGAvg to the first query. Since we want to use a subquery, we can start off by simply copy-pasting query 2 into the SELECT statement from query 1. This gets us to where you were:

Select OffenseTeam, Avg(Togo) as YTG,Down as Downtogo, Down = Case 
when Down =1 Then Count(Down)
when Down =2 Then Count(Down)
when Down =3 Then Count(Down)
when Down =4 Then Count(Down) End,
      (Select Down, Avg(Togo) as YTG
      From Temp_NFL2015
      Where Togo<>0
      Group By Down
      Order By Down ASC) as LGAvg
 From temp_NFL2015
 WHere ToGo <> 0
 Group By OffenseTeam, Down
 Order By OffenseTeam ASC, Downtogo ASC

But now we have a few problems:

  • We are selecting 2 columns inside our subquery (Down,YTG). We can't do this, as our subquery result is represented by a single column in the outer query (LGAvg)
  • We still have our ORDER BY in the subquery, which isn't allowed and won't accomplish anything anyway
  • We need to provide a correlation that will define how we want the inner query to connect to the outer query. This is the same concept as the ON clause in a JOIN

First, let's remove the ORDER BY:

(Select Down, Avg(Togo) as YTG
From Temp_NFL2015
Where Togo<>0
Group By Down) as LGAvg

Next, lets get down to selecting one column, YTG. We will have to make our current query a derived table, in order to then choose only YTG:

(SELECT YTG FROM ( --Begin derived table
(Select Down, Avg(Togo) as YTG
From Temp_NFL2015
Where Togo<>0
Group By Down)) avgY --End derived table with an alias

Quick note here: Using avg as the alias is bad practice and I shouldn't have done it, for the exact reason as why it confused you. It looks like the AVG() function, and makes the query harder to understand to a reader. I changed it above to avgY for 'average yards', but you can alias it however you would like.

Now we're only missing our correlation. We need to add WHERE avgY.Down = nfl.down since that's how we want the rows to match up. We make this correlation outside of the derived table, which leaves us with this:

(SELECT YTG 
 FROM (Select Down, Avg(Togo) as YTG
       From Temp_NFL2015
       Where Togo<>0
       Group By Down) avgY
 WHERE avgY.Down = nfl.Down) as LGAvg --avgY and nfl are aliases for inner and outer queries

Hopefully that clears it up. Keep in mind that if we put the guts of this into a temp table (see top of this answer) then we could simply reference the temp table, which is much easier to understand:

  (Select YTG
  From #TEMP avgY
  Where avgY.down = nfl.down) as LGAvg

All we're doing instead is including the guts of the subquery in the main query. To prove this, check out what happens if you copy paste the guts over the word #TEMP above!

Guts:

(Select Down, Avg(Togo) as YTG
 From Temp_NFL2015
 Where Togo<>0
 Group By Down)