user1048676 user1048676 - 4 months ago 12
SQL Question

mySQL is not selecting the values of the row with the MAX date

I have the following data:

SYMBOL | TRADE_DATE | LAST_10_DAYS
----------------------------------
ADI | 2016-01-08 | NULL
ADI | 2016-01-07 | NULL
ADI | 2016-01-06 | NULL
ADI | 2016-01-05 | NULL
ADI | 2016-01-04 | NULL
ADI | 2015-07-06 | 4.5


I then wrote the following query to try and get the value in the LAST_10_DAYS column for the MAX date:

SELECT SYMBOL, MAX(TRADE_DATE) as MAX_DATE, LAST_10_DAYS
FROM FF_HISTORICAL_STOCK_PRICE
WHERE SYMBOL='ADI'
GROUP BY SYMBOL


When I do this I would expect the following output:

SYMBOL | TRADE_DATE | LAST_10_DAYS
----------------------------------
ADI | 2016-01-08 | NULL


However, I get the following output:

SYMBOL | TRADE_DATE | LAST_10_DAYS
----------------------------------
ADI | 2016-01-08 | 4.5


Why would this be happening?

EDIT: This is within a JOIN but that part wasn't working. Here is the full query:

SELECT AR_SQ.SYMBOL, AR_SQ.NAME, AR_SQ.LAST, FF_PS.SECTOR, FF_PS.SECTOR_TAGS, RS.LAST_10_DAYS, RS.YTD_PERF, RS.MaxDate
FROM AR_STOCK_QUOTE AR_SQ
INNER JOIN (
SELECT HP.SYMBOL, max(TRADE_DATE) as MaxDate, HP.LAST_10_DAYS, HP.YTD_PERF
FROM FF_HISTORICAL_STOCK_PRICE HP
GROUP BY HP.SYMBOL
) RS on RS.SYMBOL = AR_SQ.SYMBOL
JOIN FF_PRIMARY_STOCK_TABLE FF_PS on FF_PS.SYMBOL = AR_SQ.SYMBOL
WHERE RS.SYMBOL = 'ADI'

Answer

Aggregate functions don't work how you are thinking they work. MAX() doesn't select the row where the max value occurs. It returns the greatest value from the named column. This value might occur on every row!

Consider the following query:

SELECT SYMBOL, 
  MAX(TRADE_DATE) as MAX_DATE, 
  MIN(TRADE_DATE) as MIN_DATE, 
  AVG(TRADE_DATE) as AVG_DATE, 
  LAST_10_DAYS 
FROM FF_HISTORICAL_STOCK_PRICE 
WHERE SYMBOL='ADI' 
GROUP BY SYMBOL

From which row in each group should MySQL take the value of LAST_10_DAYS? The row that has the greatest date? The row that has the lowest date? What if more than one row is tied for the greatest date, i.e. two or more rows have the same date, but different values for LAST_10_DAYS?

What about the AVG_DATE? Perhaps the average date doesn't occur on any row in the table. What then do you expect to be the value of LAST_10_DAYS?

What you are trying to get is not only the max value, but the row where that value occurs.

SELECT AR_SQ.SYMBOL, AR_SQ.NAME, AR_SQ.LAST, FF_PS.SECTOR,
  FF_PS.SECTOR_TAGS, RS.LAST_10_DAYS, RS.YTD_PERF, RS.TRADE_DATE 
FROM AR_STOCK_QUOTE AR_SQ 
INNER JOIN FF_HISTORICAL_STOCK_PRICE RS
  ON RS.SYMBOL = AR_SQ.SYMBOL
INNER JOIN (
  SYMBOL, MAX(TRADE_DATE) AS TRADE_DATE
  FROM FF_HISTORICAL_STOCK_PRICE
  GROUP BY SYMBOL 
) MAXRS 
  ON RS.SYMBOL = MAXRS.SYMBOL AND RS.TRADE_DATE = MAXRS.TRADE_DATE
JOIN FF_PRIMARY_STOCK_TABLE FF_PS on FF_PS.SYMBOL = AR_SQ.SYMBOL 
WHERE RS.SYMBOL = 'ADI'

The subquery returns the symbols, and the max trade date for each symbol. Then you need to join that value to the stock price table again, to find the row where that date occurs. Then you can get other columns from that row.