Richard Richard - 5 months ago 30
SQL Question

SQL: two queries for different time periods, and calculate percentage change?

I am using BigQuery and have a table with three columsn:

district
(string),
price
(float) and
timestamp
). My table looks like this:

district price date
OOX 200 2015-01-01
00Y 213 2015-01-01
00X 215 2015-01-01


I want to calculate the mean price by district in 2005, the mean price by district in 2015, and the percentage difference between them. In other words, I'd like output that looks like this:

district price_2005 price_2015 percent_change
00X 125 205 0.64
00Y 116 200 0.72


The format of the
percent_change
column isn't critical - it could also be a percentage number, etc.

How can I use BigQuery to do this? I've got this far:

SELECT district, AVG(price) AS price
FROM mytable
WHERE date BETWEEN TIMESTAMP('2005-01-01') AND TIMESTAMP('2015-12-31')
GROUP BY district


But I don't know how to get the remaining two columns without doing separate queries. Do I need a subquery?

Answer

You can do this using conditional aggregation. Using the legacy SQL interface:

select district,
       avg(case when year(date) = 2005 then price end) as price_2005,
       avg(case when year(date) = 2015 then price end) as price_2015,
       ((avg(case when year(date) = 2015 then price end) /
         avg(case when year(date) = 2005 then price end)
        ) - 1) as change
from t
group by district;

The idea is the same for the updated SQL interface, but the function for extracting the year is different.