Kevin Wu Kevin Wu - 9 days ago 7
SQL Question

How does a SQL Recursive Query not run into an endless loop?

Could someone please explain why the query below will not run into an endless loop? Perhaps, this example from the PostgreSQL website will help:

WITH RECURSIVE t(n) AS (
VALUES (1)
UNION
SELECT n+1 FROM t WHERE n < 100
)
SELECT * FROM t;


In SQL,
from
is evaluated by
where
. So will we not continuously enter t without ever evaluating the
where
clause?

Answer

There is nothing "recursive" in "recursive" queries.
It should have been called "iterative".

There are some differences between vendors but the basic concept is the same:

  1. The anchor part (the one that doesn't refer to the "recursive" query name) creates the initial set.

  2. The iterative part (the one that refers to the "recursive" query name) is using the last set to creates a new set that now becomes the last set, and so on.
    It stops when it gets to an empty set.


And here is an endless query:

with recursive t as (select 1 union all select 1 from t) 
select count(*) from t

Explanation for the OP example

Initial set created by the anchor part, `VALUES (1)`: 
1 record, n=1

Sets created by the iterative part, `SELECT n+1 FROM t WHERE n < 100`:
1 record, n=2 (the initial set has 1 record with n=1 so `SELECT n+1 from t` returns 2)
1 record, n=3 
1 record, n=4
1 record, n=5
.
.
.
1 record, n=99
1 record, n=100 

When n=100 the WHERE condition `WHERE n < 100` causes an empty set to be created 
and the iteration stops.

One way to think on iterative queries:

with        t0 as (select ...)
           ,t1 as (select ... t0 ...)
           ,t2 as (select ... t1 ...)
           ,t3 as (select ... t2 ...)
            .
            .
            .

            select * from t0
union all   select * from t1
union all   select * from t2
union all   select * from t3
union all   ...

t0 is a CTE with no dependencies in other CTE.
t1 is a CTE with dependency in t0.
t2 is a CTE with dependency in t1 (and only t1!).
t3 is a CTE with dependency in t2 (and only t2!).
etc.

t1, t2, t3 etc.. are declared with identical queries, different only in their dependencies.