aampudia aampudia - 5 months ago 21
SQL Question

MYSQL - JOIN to the last record on right table PREVIOUS to record from left table

I've seen many questions similar to this one, but not quite like this.
What i need to do is to LEFT JOIN records, to the immediately previous record on the right table. for example:

LEFT TABLE 'PROD':
idMachine |production | prod.date
1 |10 | 2016-05-02
2 |11 | 2016-05-15
3 |14 | 2016-06-01
3 |13 | 2016-06-15

RIGHT TABLE 'CONFIG':
idMachine | sett | config.date
1 | 30 | 2016-04-01
2 | 25 | 2016-04-01
3 | 20 | 2016-04-01
1 | 40 | 2016-05-01
3 | 45 | 2016-06-10
1 | 50 | 2016-06-15

and the result should be:
idMachine |production | sett | prod.date
1 |10 | 40 | 2016-05-02
2 |11 | 25 | 2016-05-15
3 |14 | 20 | 2016-06-01
3 |13 | 45 | 2016-06-15


the system works like this, i record all production runs on machines, and i record every change on the config of each machine, what i need to know, is what settings it had for that production run.
If you notice, in the result, for machines 1 and 3, the join is not done to the LATEST record of the config, but to the LAST record before the production happened, in that way we could know what we produced with what settings.

I've tried joins with
JOIN CONFIGS ON prod.date > config.date
but it joins to EVERY config previous, not only the last. and if I use LIMIT it doesnt give me all the records i need.
I've tried some JOINS to subqueries using MAX, but the subquery returns only the latest config, and not the previous configs, so "old" runs dont have configs...

Any help is very appreciated!!!

Answer

The subquery also needs to be a JOIN of the two tables to allow you to get the preceding date in CONFIG for each PROD. Then include that subquery in a join with the tables again to find the sett value.

Something like this:

SELECT p2.idMachine, p2.production, c2.sett, p2.date
FROM PROD AS p2
INNER JOIN
   (SELECT p.idMachine, p.`date` AS prodDate, MAX(c.`date`) AS prev
    FROM PROD AS p 
    INNER JOIN CONFIG AS c 
       ON p.idMachine = c.idMachine AND p.`date` >= c.`date`
    GROUP BY p.idMachine, prodDate
) AS prevs
   ON p2.idMachine = prevs.idMachine AND p2.`date` = prevs.prodDate
INNER JOIN CONFIG AS c2
   ON prevs.idMachine = c2.idMachine AND prevs.prev = c2.`date`

If you have a WHERE clause to filter the PROD records you care about, you'll probably want to put it in both the subquery and main query... wait a sec...


Depending on how your data is "shaped" this version with one less join may be better.

SELECT p2.idMachine, p2.production, c2.sett, p2.date
FROM (SELECT p.idMachine, p.production, p.`date`, MAX(c.`date`) AS prevConfigDate
    FROM PROD AS p 
    INNER JOIN CONFIG AS c 
       ON p.idMachine = c.idMachine AND p.`date` >= c.`date`
    GROUP BY p.idMachine, p.production, p.`date`
) AS p2
INNER JOIN CONFIG AS c2
   ON p2.idMachine = c2.idMachine AND p2.prevConfigDate = c2.`date`

However, the subquery here calculates the MAX for every PROD, instead of just every (idMachine, date); so it may cost more than the extra join did. In this case though, the redundant WHERE clauses are not needed (just one copy in the subquery).


And on I go... there is also a correlated subquery version, though I am not fond of them...

SELECT p.*
    , (SELECT sett 
       FROM CONFIG AS c 
       WHERE c.idMachine = p.idMachine 
          AND c.date <= p.date 
       ORDER BY c.date DESC LIMIT 1
      ) AS sett
FROM PROD AS p
;

Also note that in all these I compared the dates with >= or <=, not < or >; you may need to adjust according to your actual data/logic.

Comments