Tay Tay - 7 months ago 13
SQL Question

Find the most recent date in a result set

I'm working on a query where I need to look at patient vitals (specifically blood pressure) that were entered when a patient visited a clinic. I'm pulling results for the entire year of 2015, and of course there are certain patients who visited multiple times, and I need to only see the vitals that were entered at the most recent visit. Another slight twist is that systolic and diastolic pressures are entered separately, so I end up with results like:

Patient ID Name DOB Test Results Date
---------------------------------------------------------------------------------
1000 John Smith 1/1/1955 BP - Diastolic 120 2/10/2015
1000 John Smith 1/1/1955 BP - Systolic 70 2/10/2015
1000 John Smith 1/1/1955 BP - Diastolic 128 7/12/2015
1000 John Smith 1/1/1955 BP - Systolic 75 7/12/2015
1000 John Smith 1/1/1955 BP - Diastolic 130 10/22/2015
1000 John Smith 1/1/1955 BP - Systolic 76 10/22/2015
9999 Jane Doe 5/4/1970 BP - Diastolic 130 4/2/2015
9999 Jane Doe 5/4/1970 BP - Systolic 60 4/2/2015
9999 Jane Doe 5/4/1970 BP - Diastolic 127 11/20/2015
9999 Jane Doe 5/4/1970 BP - Systolic 65 11/20/2015


There are 26,000+ results so obviously I don't want to go through every single patient and see when their most recent results were. I'd like my results to look like this:

Patient ID Name DOB Test Results Date
---------------------------------------------------------------------------------
1000 John Smith 1/1/1955 BP - Diastolic 130 10/22/2015
1000 John Smith 1/1/1955 BP - Systolic 76 10/22/2015
9999 Jane Doe 5/4/1970 BP - Diastolic 127 11/20/2015
9999 Jane Doe 5/4/1970 BP - Systolic 65 11/20/2015


I know the name and date of birth and whatnot would get repeated, but I'm mainly focused on the results column.

Here's my query:

SELECT DISTINCT
pd.PatientID as [Patient ID],
pd.PatientName as Name,
pd.DateOfBirth as DOB,
v.Test as Test,
v.Results as Results,
v.TestDate as Date

FROM PatientDemographic pd JOIN Vitals v ON pd.PatientID = v.PatientID

WHERE v.TestDate BETWEEN '01/01/2015' AND '12/31/2015'
AND v.Test LIKE 'BP%'

ORDER BY pd.PatientID, v.TestDate


After looking around for other answers, I tried doing a
GROUP BY
and the
MAX()
aggregate function for the
v.TestDate
column in the
SELECT
statement (I was specifically referencing this link, though it's for Oracle and I'm using SQL Server so I'm not entirely sure if the syntax would be the same). My query then looked like:

SELECT DISTINCT
pd.PatientID as [Patient ID],
pd.PatientName as Name,
pd.DateOfBirth as DOB,
v.Test as Test,
v.Results as Results,
MAX(v.TestDate) as Date

FROM PatientDemographic pd JOIN Vitals v ON pd.PatientID = v.PatientID

WHERE v.TestDate BETWEEN '01/01/2015' AND '12/31/2015'
AND v.Test LIKE 'BP%'

GROUP BY pd.PatientID


Admittedly, I've always struggled a little with using
GROUP BY
. In this particular case, I get an error stating that I need to add the Patient Name column to the
GROUP BY
clause as well, so I do, and then it asks for DOB. Then the Test name. Basically, it wants me to add everything from my
SELECT
statement to the
GROUP BY
.

What's the best way to proceed and get my most recent patient visits?

Answer

One simple method uses ROW_NUMBER() to find the more recent record for each test:

SELECT pd.PatientID as [Patient ID], pd.PatientName as Name, pd.DateOfBirth as DOB,
       v.Test as Test, v.Results as Results, v.TestDate as Date
FROM PatientDemographic pd JOIN
     (SELECT v.*,
             ROW_NUMBER() OVER (PARTITION BY PatientId, Test ORDER BY TestDate DESC) as seqnum
      FROM Vitals v
      WHERE v.TestDate BETWEEN '2015-01-01' AND '2015-12-31' AND
            v.Test LIKE 'BP%'
     ) v
     ON pd.PatientID = v.PatientID 
WHERE seqnum = 1
ORDER BY pd.PatientID, v.TestDate;