JoshG - 1 year ago 107
MySQL Question

# Using IS NULL or IS NOT NULL on join conditions - Theory question

Theory question here:

Why does specifying table.field IS NULL or table.field IS NOT NULL not work on a join condition (left or right join for instance) but only in the where condition?

Non working Example:

-this should return all shipments with any returns (non null values) filtered out. However, this returns all shipments regardless if anything meets the [r.id is null] statement.

``````SELECT
*
FROM
shipments s
LEFT OUTER JOIN returns r
ON s.id = r.id
AND r.id is null
WHERE
s.day >= CURDATE() - INTERVAL 10 DAY
``````

Working example:

-This returns the correct amount of rows which is total shipments, less any related to a returns (non null values).

``````SELECT
*
FROM
shipments s
LEFT OUTER JOIN returns r
ON s.id = r.id
WHERE
s.day >= CURDATE() - INTERVAL 10 DAY
AND r.id is null
``````

Why is this the case? All other filter conditions between two tables being joined work just fine, but for some reason IS NULL and IS NOT NULL filters do not work unless in the where statement.

What is the reason for this?

Example with tables A and B:

`````` A (parent)       B (child)
============    =============
id | name        pid | name
------------    -------------
1 | Alex         1  | Kate
2 | Bill         1  | Lia
3 | Cath         3  | Mary
4 | Dale       NULL | Pan
5 | Evan
``````

If you want to find parents and their kids, you do an `INNER JOIN`:

``````SELECT id,  parent.name AS parent
, pid, child.name  AS child

FROM
parent  INNER JOIN  child
ON   parent.id     =    child.pid
``````

Result is that every match of a `parent`'s `id` from the left table and a `child`'s `pid` from the second table will show as a row in the result:

``````+----+--------+------+-------+
| id | parent | pid  | child |
+----+--------+------+-------+
|  1 | Alex   |   1  | Kate  |
|  1 | Alex   |   1  | Lia   |
|  3 | Cath   |   3  | Mary  |
+----+--------+------+-------+
``````

Now, the above does not show parents without kids (because their ids do not have a match in child's ids, so what do you do? You do an outer join instead. There are three types of outer joins, the left, the right and the full outer join. We need the left one as we want the "extra" rows from the left table (parent):

``````SELECT id,  parent.name AS parent
, pid, child.name  AS child

FROM
parent  LEFT JOIN  child
ON   parent.id    =    child.pid
``````

Result is that besides previous matches, all parents that do not have a match (read: do not have a kid) are shown too:

``````+----+--------+------+-------+
| id | parent | pid  | child |
+----+--------+------+-------+
|  1 | Alex   |   1  | Kate  |
|  1 | Alex   |   1  | Lia   |
|  3 | Cath   |   3  | Mary  |
|  2 | Bill   | NULL | NULL  |
|  4 | Dale   | NULL | NULL  |
|  5 | Evan   | NULL | NULL  |
+----+--------+------+-------+
``````

Where did all those `NULL` come from? Well, MySQL (or any other RDBMS you may use) will not know what to put there as these parents have no match (kid), so there is no `pid` nor `child.name` to match with those parents. So, it puts this special non-value called `NULL`.

My point is that these `NULLs` are created (in the result set) during the `LEFT OUTER JOIN`.

So, if we want to show only the parents that do NOT have a kid, we can add a `WHERE child.pid IS NULL` to the `LEFT JOIN` above. The `WHERE` clause is evaluated (checked) after the `JOIN` is done. So, it's clear from the above result that only the last three rows where the `pid` is NULL will be shown:

``````SELECT id,  parent.name AS parent
, pid, child.name  AS child

FROM
parent  LEFT JOIN  child
ON   parent.id    =    child.pid

WHERE child.pid IS NULL
``````

Result:

``````+----+--------+------+-------+
| id | parent | pid  | child |
+----+--------+------+-------+
|  2 | Bill   | NULL | NULL  |
|  4 | Dale   | NULL | NULL  |
|  5 | Evan   | NULL | NULL  |
+----+--------+------+-------+
``````

Now, what happens if we move that `IS NULL` check from the `WHERE` to the joining `ON` clause?

``````SELECT id,  parent.name AS parent
, pid, child.name  AS child

FROM
parent  LEFT JOIN  child
ON   parent.id    =    child.pid
AND  child.pid IS NULL
``````

In this case the database tries to find rows from the two tables that match these conditions. That is, rows where `parent.id = child.pid` AND `child.pid IN NULL`. But it can find no such match because no `child.pid` can be equal to something (1, 2, 3, 4 or 5) and be NULL at the same time!

So, the condition:

``````ON   parent.id    =    child.pid
AND  child.pid IS NULL
``````

is equivalent to:

``````ON   1 = 0
``````

which is always `False`.

So, why does it return ALL rows from the left table? Because it's a LEFT JOIN! And left joins return rows that match (none in this case) and also rows from the left table that do not match the check (all in this case):

``````+----+--------+------+-------+
| id | parent | pid  | child |
+----+--------+------+-------+
|  1 | Alex   | NULL | NULL  |
|  2 | Bill   | NULL | NULL  |
|  3 | Cath   | NULL | NULL  |
|  4 | Dale   | NULL | NULL  |
|  5 | Evan   | NULL | NULL  |
+----+--------+------+-------+
``````

I hope the above explanation is clear.

Sidenote (not directly related to your question): Why on earth doesn't `Pan` show up in none of our JOINs? Because his `pid` is `NULL` and NULL in the (not common) logic of SQL is not equal to anything so it can't match with any of the parent ids (which are 1,2,3,4 and 5). Even if there was a NULL there, it still wouldn't match because `NULL` does not equal anything, not even `NULL` itself (it's a very strange logic, indeed!). That's why we use the special check `IS NULL` and not a `= NULL` check.

So, will `Pan` show up if we do a `RIGHT JOIN` ? Yes, it will! Because a RIGHT JOIN will show all results that match (the first INNER JOIN we did) plus all rows from the RIGHT table that don't match (which in our case is one, the `(NULL, 'Pan')` row.

``````SELECT id,  parent.name AS parent
, pid, child.name  AS child

FROM
parent  RIGHT JOIN  child
ON   parent.id     =    child.pid
``````

Result:

``````+------+--------+------+-------+
| id   | parent | pid  | child |
+---------------+------+-------+
|   1  | Alex   |   1  | Kate  |
|   1  | Alex   |   1  | Lia   |
|   3  | Cath   |   3  | Mary  |
| NULL | NULL   | NULL | Pan   |
+------+--------+------+-------+
``````

Unfortunately, MySQL does not have `FULL JOIN`. You can try it in other RDBMSs, and it will show:

``````+------+--------+------+-------+
|  id  | parent | pid  | child |
+------+--------+------+-------+
|   1  | Alex   |   1  | Kate  |
|   1  | Alex   |   1  | Lia   |
|   3  | Cath   |   3  | Mary  |
|   2  | Bill   | NULL | NULL  |
|   4  | Dale   | NULL | NULL  |
|   5  | Evan   | NULL | NULL  |
| NULL | NULL   | NULL | Pan   |
+------+--------+------+-------+
``````
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download