user3209752 user3209752 - 7 months ago 94
SQL Question

MySQL: how to select record with latest date before a certain date

Assume a simple table as follows containing status events for two users. A status_id of 1 makes them 'active', anything else makes them de-active.
I need to find out all those users that became inactive within one year of, for example, 2015-05-01 (not including that date).

CREATE TABLE user_status(
user_id INT,
status_id INT,
date_assigned VARCHAR(10) );

INSERT INTO user_status( user_id, status_id, date_assigned)
VALUES
(1234, 1, '2015-01-01'), -- 1234 becomes active (status id = 1)
(1234, 2, '2015-07-01'), -- 1234 de-activated for reason 2

(5678, 1, '2015-02-01'), -- 5678 becomes active (status id = 1)
(5678, 3, '2015-04-01'), -- 5678 de-activated for reason 3
(5678, 5, '2015-06-01'); -- 5678 de-activated for reason 5


Using the query

SELECT t1.*
FROM user_status t1
WHERE t1.date_assigned = (SELECT MIN(t2.date_assigned) -- the first occurrence
FROM user_status t2
WHERE t2.user_id = t1.user_id -- for this user
AND t2.status_id <> 1 -- where status not active
AND t2.date_assigned BETWEEN -- within 1 yr of given date
'2015-05-01' + INTERVAL 1 DAY -- (not including that date)
AND
'2015-05-01' + INTERVAL 1 YEAR
)


I can get the result

user_id status_id date_assigned
1234 2 2015-07-01
5678 5 2015-06-01


This is sort of right but user 5678 should not be there because although they had an inactive event within the date range, they were already inactive before the desired date range began and so did not become inactive within that range.

I need to add a bit to my query along the lines of 'only show me those users who had an inactive event and where the previous status_id for that user was 1, ie they were active at the time the inactive event happened.

Can anyone help me to get the syntax correct?

See SQL fiddle

Answer

You can add NOT EXISTS to your query:

SELECT t1.*
FROM user_status t1
WHERE t1.date_assigned = (SELECT MIN(t2.date_assigned)       -- the first occurance
                          FROM  user_status t2
                          WHERE t2.user_id =  t1.user_id     -- for this user
                          AND t2.status_id <> 1              -- where status not active
                          AND t2.date_assigned BETWEEN       -- within 1 yr of given date
                               '2015-05-01' + INTERVAL 1 DAY -- (not including that date)
                          AND 
                               '2015-05-01' + INTERVAL 1 YEAR 
                          )
      AND NOT EXISTS (SELECT 1                         -- such a record should not exist
                      FROM user_status t3
                      WHERE t3.user_id =  t1.user_id   -- for this user
                             AND t3.status_id <> 1     -- where status is not active
                             AND t3.date_assigned <    -- before the examined period
                               '2015-05-01' + INTERVAL 1 DAY )

Demo here

Edit:

You can use the following query that also considers the case of having multiple activation dates:

SELECT *
FROM user_status
WHERE (user_id, date_assigned) IN (
  -- get last de-activation date   
  SELECT t1.user_id, MAX(t1.date_assigned)
  FROM user_status AS t1
  JOIN (
    -- get last activation date 
    SELECT user_id, MAX(date_assigned) AS activation_date
    FROM user_status 
    WHERE status_id = 1
    GROUP BY user_id
  ) AS t2 ON t1.user_id = t2.user_id AND t1.date_assigned > t2.activation_date
  GROUP BY user_id
  HAVING MAX(date_assigned) BETWEEN '2015-05-01' + INTERVAL 1 DAY AND '2015-05-01' + INTERVAL 1 YEAR AND
         MIN(date_assigned) > '2015-05-01' + INTERVAL 1 DAY)