jtheman jtheman - 1 month ago 10
SQL Question

Find aggregated MAX row in SQL Server

Can't seem to figure this out in SQL Server.

Two tables:

statuses:

id proj status changedate
-------------------------------
1 1 1 2015-01-01
2 1 3 2016-01-01
3 1 2 2016-05-01
4 2 1 2016-01-01
... etc


projects:

id title active
----------------------------
1 My project 1
2 No more 0


Now I want to list all active projects with its most recent status id. I have tried various kinds of queries but end up with errors.

Tried this among other solution incl HAVING, but no luck:

SELECT projects.id, projects.title, statuses.status
FROM projects
INNER JOIN statuses ON statuses.proj = projects.id
WHERE projects.active = 1
AND statuses.changedate = MAX(statuses.changedate)


Any SQL Server wizard out there?

Answer

With the path that you are on, you need a subquery:

SELECT p.id, p.title, s.status
FROM projects p INNER JOIN
     statuses s
     ON s.proj = p.id  
WHERE p.active = 1 AND
      s.changedate = (SELECT MAX(s2.changedate)   
                      FROM statuses s2
                      WHERE s2.proj = p.id
                     );

A more typical solution uses ROW_NUMBER():

SELECT p.id, p.title, s.status
FROM projects p INNER JOIN
     (SELECT s.*,
             ROW_NUMBER() OVER (PARTITION BY s.proj ORDER BY s.changedate DESC) as seqnum
      FROM statuses s
     ) s
     ON s.proj = p.id AND seqnum = 1
WHERE p.active = 1;

And a more inscrutable way uses APPLY:

SELECT p.*, s.status
FROM projects p OUTER APPLY
     (SELECT TOP 1 s.*
      FROM status s
      WHERE s.proj = p.id
      ORDER BY s.changedate DESC
     ) s
WHERE p.active = 1;

Well, this is only inscrutable is you've never seen APPLY before.