Dulock Dulock - 20 days ago 5
MySQL Question

MySQL Multi-Row Sub Query fix

I am trying to return all data, from all users, before a date/time which is unique to each user.

I can return all the data for the exact date for each user, no problem, but I don't know how to expand my query to return dates less than or equal to my key date.

I have a 2 table SQL database.

Table 1 - "sales" has a confirmation record of all transactions.

  • user_id (date, and then various purchase information)

Table 2 - "user_data" has a record of all browsing data

  • user_id, record_date, purchase_made (purchase_made is misnamed, in that they triggered the purchase popup, but may not have actually made a purchase -- and then a whole bunch of user action data)

I need to get all the user_data records of users who have made purchases, prior to and including their initial purchase.

In order to get the exact purchase records I can use the following:

FROM user_data
WHERE user_id IN (SELECT user_id FROM sales)
AND record_date IN (SELECT min(record_date) FROM user_data WHERE purchase_made = 1 GROUP BY user_id);

Now all I really want is to be able to change that second "IN" to "<=" but I can't do that with a Subquery that returns more than 1 row.

Disclaimer -- I didn't write the database, I don't have permission to change the database (nor would I be the right person to do so)

Disclaimer 2 -- I know that using purchase_made will foul up the results a bit because not all of these will be purchases. But because of the reporting, the date from the sales table can be off by enough time that it is actually LESS reliable than using the timestamp associated with purchase_made (proven through extensive query comparisons run on the actual data set). But the mandate from above is that it doesn't matter, they'll be happy with "mostly" accurate.


"Now all I really want is to be able to change that second "IN" to "<="

so make it a correlated subqery by coorlating the records we eliminate the need for the "IN" and now can use <=

FROM user_data UD
WHERE user_id IN (SELECT user_id FROM sales) 
AND record_date <=(SELECT min(record_date) 
                   FROM user_data UD2 
                   WHERE UD2.purchase_made = 1 
                     and UD2.User_ID = UD.User_ID);

Another approach would be to add user_Id to the subquery and instead of a subquery make it a inline view that you join to based on the user_ID and then on the join criteria compare the dates.

FROM user_data UD
INNER JOIN (SELECT min(record_date) mrd, User_ID 
            FROM user_data 
            WHERE purchase_made = 1) UD2
  ON UD.user_Id = UD2.user_Id
 AND UD.record_date <= UD2.RecordDate
WHERE user_id IN (SELECT user_id FROM sales)