Flying Gambit Flying Gambit - 6 months ago 8
SQL Question

How do correlated sub-queries work?

Consider a schema like

create table world
(
name varchar(255),
continent varchar(255),
area varchar(255)
);

insert into world values("Germany", "Europe", 20);
insert into world values("France", "Europe", 10);
insert into world values("Russia", "Europe", 30);
insert into world values("China", "Asia", 10);
insert into world values("Japan", "Asia", 20);


Then executing this query

select name, continent, area
from world t1
where area >= all (select area
from world t2
where t1.continent = t2.continent);


will list the largest country by area for each continent.

Here is the sql fiddle http://sqlfiddle.com/#!9/a66aee/4/0

What I understand from this article is that it processes each of the outer row and then proceeds to inner query.

If my understanding is right, then

for 1st outer row,

name = Germany, continent= Europe, area = 20


the inner-query would then look like

select area from world t2 where t2.continent = "Europe";


which would return area like this:

20
10
30


The outer query would then check

if t1.area >= 20, 10 and 30 .....(t1.area = 20)


which should return
20 and 30
.

If that's the case, then should it not return the rows:

"Germany", "Europe", 20
"Russia", "Europe", 30


If this is not right then how does SQL process a correlated query?

Answer

Well, you understood the basics, but not the ending.

Yes, the is how the query will be processed, except this part:

if t1.area >= 20, 10 and 30 .....(t1.area = 20) 

Your query checks if the currect t1.area which is now 20 is larger then ALL the values provided in the sub query, which for that row are (10,20,30) . Since 20 is smaller then 30 , then this condition is FALSE, so this row from the outer query will be filtered.

For Europe only area 30 should be returned, since its the only one bigger then all of the sub query results 30>= 10 && 30>= 20 && 30>= 30

Comments