C. Leo C. Leo - 7 days ago 6
MySQL Question

BigQuery - Combine three tables based on matching value or timestamp

Similar to question BigQuery combine tables based on closest timerstamp and matching value

I have three tables and for each row of table numberTwo I need to get the hint in table numberOne that has the same cod value and between those, the one that has the closest time when comparing time1 and time2. If cod is not presented in table numberOne it tries to get the hint matching the cod in table numberThree

To make it easier to understand what I need to do is:

Table numberOne:

| id | cod | hint | time1 |
---------------------------------------------------
| 1 | ABC | V | 2016-11-03 18:00:00 UTC |
| 2 | ABC | W | 2016-11-03 12:00:00 UTC |
| 3 | CDE | X | 2016-11-03 19:00:00 UTC |
| 4 | CDE | Y | 2016-11-03 19:30:00 UTC |
| 5 | EFG | Z | 2016-11-03 18:00:00 UTC |


Table numberTwo

| id | cod | value | time2 |
----------------------------------------------------
| 1 | ABC | xyz2 | 2016-11-03 18:20:00 UTC |
| 2 | FHK | h323 | 2016-11-03 11:30:00 UTC |
| 3 | ABC | rewq | 2016-11-03 09:00:00 UTC |
| 4 | IJK | abce | 2016-11-03 19:10:00 UTC |


Table numberThree

| id | cod | hint |
--------------------------
| 1 | FHK | tes1 |
| 2 | IJK | tes2 |
| 3 | MNK | tes3 |
| 4 | MOP | tes4 |


So, for row #1 of table numberTwo I would get all rows in table numberOne with cod: ABC

| 1 | ABC | V | 2016-11-03 18:00:00 UTC |
| 2 | ABC | W | 2016-11-03 12:00:00 UTC |


And between those I would get the one with the closest timestamp from time2:

| 1 | ABC | V | 2016-11-03 18:00:00 UTC |


If the cod is not presented in table numberOne it matches against table numberThree. The codes in numberOne and numberThree are unique. So there is no case where the same code will be presented in both tables. So it could try to match table numberThree first.

After processing every row I would have a table like this:

Desired Table

| id | cod | hint | value | time2 |
--------------------------------------------------------------
| 1 | ABC | V | xyz2 | 2016-11-03 18:20:00 UTC |
| 2 | FHK | tes1 | h323 | |
| 3 | ABC | W | rewq | 2016-11-03 09:00:00 UTC |
| 4 | IJK | tes2 | abce | |

Answer

try below

WITH 
/*
TableNumberOne AS (
  SELECT 1 AS id, 'ABC' AS cod, 'V' AS hint, TIMESTAMP '2016-11-03 18:00:00 UTC' AS time1 UNION ALL
  SELECT 2 AS id, 'ABC' AS cod, 'W' AS hint, TIMESTAMP '2016-11-03 12:00:00 UTC' AS time1 UNION ALL
  SELECT 3 AS id, 'CDE' AS cod, 'X' AS hint, TIMESTAMP '2016-11-03 19:00:00 UTC' AS time1 UNION ALL
  SELECT 4 AS id, 'CDE' AS cod, 'Y' AS hint, TIMESTAMP '2016-11-03 19:30:00 UTC' AS time1 UNION ALL
  SELECT 5 AS id, 'EFG' AS cod, 'Z' AS hint, TIMESTAMP '2016-11-03 18:00:00 UTC' AS time1 
),
TableNumberTwo AS (
  SELECT 1 AS id, 'ABC' AS cod, 'xyz2' AS value, TIMESTAMP '2016-11-03 18:20:00 UTC' AS time2 UNION ALL
  SELECT 2 AS id, 'FHK' AS cod, 'h323' AS value, TIMESTAMP '2016-11-03 11:30:00 UTC' AS time2 UNION ALL
  SELECT 3 AS id, 'ABC' AS cod, 'rewq' AS value, TIMESTAMP '2016-11-03 09:00:00 UTC' AS time2 UNION ALL
  SELECT 4 AS id, 'IJK' AS cod, 'abce' AS value, TIMESTAMP '2016-11-03 19:10:00 UTC' AS time2 
),
TableNumberThree AS (
  SELECT 1 AS id, 'FHK' AS cod, 'test1' AS hint UNION ALL
  SELECT 2 AS id, 'IJK' AS cod, 'test2' AS hint UNION ALL
  SELECT 3 AS id, 'MNK' AS cod, 'test3' AS hint UNION ALL
  SELECT 4 AS id, 'MOP' AS cod, 'test4' AS hint 
),
*/
tempTable AS (
  SELECT 
    t2.id, t2.cod, t2.value, t2.time2, t1.hint, 
    ROW_NUMBER() OVER(PARTITION BY t2.id, t2.cod, t2.value 
                      ORDER BY ABS(TIMESTAMP_DIFF(t2.time2, t1.time1, SECOND))) AS win
  FROM TableNumberTwo AS t2
  LEFT JOIN TableNumberOne AS t1
  ON t1.cod = t2.cod
)
SELECT 
  t1.id, t1.cod, IFNULL(t1.hint, t2.hint) AS hint, value, 
  IF(t1.hint IS NULL, NULL, time2) as time2
FROM tempTable AS t1
LEFT JOIN TableNumberThree AS t2
ON t1.cod = t2.cod AND t1.hint IS NULL
WHERE win = 1