user2146441 - 1 month ago
MySQL Question

Combine and flatten tables with a FULL OUTER JOIN in MySQL

I have two tables of price updates from two sources (A and B) in the following format. These are price updates for Source A to the millisecond:

index,timestamp,origin,product,bid,ask,nexttimestamp
A1,2016-10-18 20:39:25.595,A,Dow,7005.5,7007.5,2016-10-18 20:39:34.701
A2,2016-10-18 20:39:34.701,A,Dow,7005.8,7007.8,2016-10-18 20:39:35.703
A3,2016-10-18 20:39:35.703,A,Dow,7005.5,7007.5,2016-10-18 20:39:38.432
A4,2016-10-18 20:39:38.432,A,Dow,7005.8,7007.8,2016-10-18 20:39:44.055
A5,2016-10-18 20:39:44.055,A,Dow,7005.3,7007.3,2016-10-18 20:39:45.146


These are price updates for Source B to the millisecond:

index,timestamp,origin,product,bid,ask,nexttimestamp
B1,2016-10-18 20:39:21.694,B,Dow,7005.8,7006.8,2016-10-18 20:39:27.987
B2,2016-10-18 20:39:27.987,B,Dow,7005.9,7006.9,2016-10-18 20:39:28.058
B3,2016-10-18 20:39:28.058,B,Dow,7006.0,7007.0,2016-10-18 20:39:39.792
B4,2016-10-18 20:39:39.792,B,Dow,7006.2,7007.2,2016-10-18 20:39:39.868
B5,2016-10-18 20:39:39.868,B,Dow,7006.3,7007.3,2016-10-18 20:39:45.845


I've marked an index (A1, B1, etc.) in the first column. I need to combine and flatten these tables so that I can see the prevailing price for each source as updates for the other arrive. The fields 'timestamp' and 'nexttimestamp' are the start time and end time of each price update's validity. The field 'nexttimestamp' is created by finding the next price update for the same source in the table.
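
For reference, here is a minimal sketch of how a nexttimestamp column like this could be derived, assuming MySQL 8.0 or later (for WITH and window functions). The inline price_updates data set is a hypothetical extract of the rows above, not my real table:

```sql
-- Hypothetical four-row extract of the sample data (MySQL 8.0+).
WITH price_updates AS (
  SELECT 'A1' AS `index`,
         CAST('2016-10-18 20:39:25.595' AS DATETIME(3)) AS `timestamp`,
         'A' AS origin, 'Dow' AS product
  UNION ALL SELECT 'A2', CAST('2016-10-18 20:39:34.701' AS DATETIME(3)), 'A', 'Dow'
  UNION ALL SELECT 'B1', CAST('2016-10-18 20:39:21.694' AS DATETIME(3)), 'B', 'Dow'
  UNION ALL SELECT 'B2', CAST('2016-10-18 20:39:27.987' AS DATETIME(3)), 'B', 'Dow'
)
-- nexttimestamp = timestamp of the following update from the same source and product.
SELECT p.*,
       LEAD(p.`timestamp`) OVER (
         PARTITION BY p.origin, p.product
         ORDER BY p.`timestamp`
       ) AS nexttimestamp
FROM price_updates AS p
ORDER BY p.origin, p.`timestamp`;
-- nexttimestamp per row:
--   A1 -> 2016-10-18 20:39:34.701
--   A2 -> NULL (no later update from source A in this extract)
--   B1 -> 2016-10-18 20:39:27.987
--   B2 -> NULL
```

The last update of each source has no successor, so its nexttimestamp comes out as NULL in this sketch.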

I need to join tables A and B where the timestamp in B is greater than the timestamp in A and less than the 'nexttimestamp' in A, i.e. where the price update in A is still valid when the update in B arrives. I need a join that produces something like this:

NULL, B1,2016-10-18 20:39:21.694,B,Dow,7005.8,7006.8,2016-10-18 20:39:27.987
A1,2016-10-18 20:39:25.595,A,Dow,7005.5,7007.5,2016-10-18 20:39:34.701,B1,2016-10-18 20:39:21.694,B,Dow,7005.8,7006.8,2016-10-18 20:39:27.987
A1,2016-10-18 20:39:25.595,A,Dow,7005.5,7007.5,2016-10-18 20:39:34.701,B2,2016-10-18 20:39:27.987,B,Dow,7005.9,7006.9,2016-10-18 20:39:28.058
A1,2016-10-18 20:39:25.595,A,Dow,7005.5,7007.5,2016-10-18 20:39:34.701,B3,2016-10-18 20:39:28.058,B,Dow,7006.0,7007.0,2016-10-18 20:39:39.792
A2,2016-10-18 20:39:34.701,A,Dow,7005.8,7007.8,2016-10-18 20:39:35.703,B3,2016-10-18 20:39:28.058,B,Dow,7006.0,7007.0,2016-10-18 20:39:39.792
A3,2016-10-18 20:39:35.703,A,Dow,7005.5,7007.5,2016-10-18 20:39:38.432,B3,2016-10-18 20:39:28.058,B,Dow,7006.0,7007.0,2016-10-18 20:39:39.792
A4,2016-10-18 20:39:38.432,A,Dow,7005.8,7007.8,2016-10-18 20:39:44.055,B3,2016-10-18 20:39:28.058,B,Dow,7006.0,7007.0,2016-10-18 20:39:39.792
A4,2016-10-18 20:39:38.432,A,Dow,7005.8,7007.8,2016-10-18 20:39:44.055,B4,2016-10-18 20:39:39.792,B,Dow,7006.2,7007.2,2016-10-18 20:39:39.868
A4,2016-10-18 20:39:38.432,A,Dow,7005.8,7007.8,2016-10-18 20:39:44.055,B5,2016-10-18 20:39:39.868,B,Dow,7006.3,7007.3,2016-10-18 20:39:45.845
A5,2016-10-18 20:39:44.055,A,Dow,7005.3,7007.3,2016-10-18 20:39:45.146,NULL


I have been trying queries such as the following, but to no avail.

select main.*, sub.*
from test as main
left join test as sub
  on  sub.timestamp > main.timestamp
  and sub.timestamp < main.nexttimestamp
  and sub.origin <> main.origin
  and sub.product = main.product
order by main.timestamp;

Answer

MySQL has no FULL OUTER JOIN, but you can emulate one by combining a LEFT JOIN and a RIGHT JOIN with UNION ALL:

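-- Part 1: every row of A, with the matching rows of B (or NULLs if none match)
select a.*, b.*
from table_a a
left join table_b b
  on  b.timestamp > a.timestamp
  and b.timestamp < a.nexttimestamp

union all

-- Part 2: only the rows of B that have no match in A
select a.*, b.*
from table_a a
right join table_b b
  on  b.timestamp > a.timestamp
  and b.timestamp < a.nexttimestamp
where a.index is null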

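The second part (RIGHT JOIN) needs the a.index IS NULL condition so that it returns only the rows of B that have no match in A. All matched rows are already returned by the first part (LEFT JOIN), and UNION ALL does not remove duplicates.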
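
To see what the filter does, here is a small sketch that runs only the RIGHT JOIN half on a cut-down copy of the sample data. It assumes MySQL 8.0 or later (for WITH); the inline table_a and table_b data sets and the a_index/b_index aliases are illustrative assumptions:

```sql
-- Hypothetical extract: two rows of A, three rows of B (MySQL 8.0+).
WITH table_a AS (
  SELECT 'A1' AS `index`,
         CAST('2016-10-18 20:39:25.595' AS DATETIME(3)) AS `timestamp`,
         CAST('2016-10-18 20:39:34.701' AS DATETIME(3)) AS nexttimestamp
  UNION ALL
  SELECT 'A2',
         CAST('2016-10-18 20:39:34.701' AS DATETIME(3)),
         CAST('2016-10-18 20:39:35.703' AS DATETIME(3))
),
table_b AS (
  SELECT 'B1' AS `index`,
         CAST('2016-10-18 20:39:21.694' AS DATETIME(3)) AS `timestamp`
  UNION ALL SELECT 'B2', CAST('2016-10-18 20:39:27.987' AS DATETIME(3))
  UNION ALL SELECT 'B3', CAST('2016-10-18 20:39:28.058' AS DATETIME(3))
)
SELECT a.`index` AS a_index, b.`index` AS b_index
FROM table_a AS a
RIGHT JOIN table_b AS b
  ON  b.`timestamp` > a.`timestamp`
  AND b.`timestamp` < a.nexttimestamp
-- WHERE a.`index` IS NULL   -- uncomment to keep only the unmatched B rows
ORDER BY b.`index`;
-- Without the WHERE: (NULL, B1), (A1, B2), (A1, B3)
-- With the WHERE:    (NULL, B1)
```

On the same data the LEFT JOIN half already returns (A1, B2) and (A1, B3), so without the filter those two pairs would appear twice in the combined result.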

You can add more conditions to the ON clause if needed (like b.product = a.product). Add them to both parts of the union so that the two halves stay consistent.
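
Putting it together, here is a minimal runnable sketch with the product condition and a sort order. It assumes MySQL 5.6.4 or later (for DATETIME(3)) and a scratch database. The column types, the column aliases, and the sort_ts helper column are assumptions added for illustration, not something taken from the question:

```sql
-- Assumed schema; run in a scratch database.
CREATE TABLE table_a (
  `index`       VARCHAR(10)   NOT NULL,
  `timestamp`   DATETIME(3)   NOT NULL,  -- millisecond precision
  origin        CHAR(1)       NOT NULL,
  product       VARCHAR(20)   NOT NULL,
  bid           DECIMAL(10,1) NOT NULL,
  ask           DECIMAL(10,1) NOT NULL,
  nexttimestamp DATETIME(3)   NULL
);
CREATE TABLE table_b LIKE table_a;

INSERT INTO table_a VALUES
  ('A1', '2016-10-18 20:39:25.595', 'A', 'Dow', 7005.5, 7007.5, '2016-10-18 20:39:34.701'),
  ('A2', '2016-10-18 20:39:34.701', 'A', 'Dow', 7005.8, 7007.8, '2016-10-18 20:39:35.703'),
  ('A3', '2016-10-18 20:39:35.703', 'A', 'Dow', 7005.5, 7007.5, '2016-10-18 20:39:38.432'),
  ('A4', '2016-10-18 20:39:38.432', 'A', 'Dow', 7005.8, 7007.8, '2016-10-18 20:39:44.055'),
  ('A5', '2016-10-18 20:39:44.055', 'A', 'Dow', 7005.3, 7007.3, '2016-10-18 20:39:45.146');

INSERT INTO table_b VALUES
  ('B1', '2016-10-18 20:39:21.694', 'B', 'Dow', 7005.8, 7006.8, '2016-10-18 20:39:27.987'),
  ('B2', '2016-10-18 20:39:27.987', 'B', 'Dow', 7005.9, 7006.9, '2016-10-18 20:39:28.058'),
  ('B3', '2016-10-18 20:39:28.058', 'B', 'Dow', 7006.0, 7007.0, '2016-10-18 20:39:39.792'),
  ('B4', '2016-10-18 20:39:39.792', 'B', 'Dow', 7006.2, 7007.2, '2016-10-18 20:39:39.868'),
  ('B5', '2016-10-18 20:39:39.868', 'B', 'Dow', 7006.3, 7007.3, '2016-10-18 20:39:45.845');

-- Part 1: every A row with the B updates that start inside its validity window.
(SELECT a.`timestamp` AS sort_ts,
        a.`index` AS a_index, a.bid AS a_bid, a.ask AS a_ask,
        b.`index` AS b_index, b.`timestamp` AS b_timestamp,
        b.bid AS b_bid, b.ask AS b_ask
 FROM table_a AS a
 LEFT JOIN table_b AS b
   ON  b.`timestamp` > a.`timestamp`
   AND b.`timestamp` < a.nexttimestamp
   AND b.product = a.product)
UNION ALL
-- Part 2: B rows with no matching A row (all a.* columns are NULL here).
(SELECT b.`timestamp`,
        a.`index`, a.bid, a.ask,
        b.`index`, b.`timestamp`, b.bid, b.ask
 FROM table_a AS a
 RIGHT JOIN table_b AS b
   ON  b.`timestamp` > a.`timestamp`
   AND b.`timestamp` < a.nexttimestamp
   AND b.product = a.product
 WHERE a.`index` IS NULL)
ORDER BY sort_ts, b_timestamp;
-- (a_index, b_index) pairs returned, in order:
-- (NULL, B1), (A1, B2), (A1, B3), (A2, NULL), (A3, NULL), (A4, B4), (A4, B5), (A5, NULL)
```

With these strict conditions a B update is paired with an A row only if it starts inside that row's validity window, so B1 has no A partner and A2, A3 and A5 have no B partner. The ORDER BY refers to the aliases of the first SELECT because table-qualified names cannot be used to sort a whole UNION result.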
