Johntk Johntk - 21 days ago 8
SQL Question

Most efficient SQL Statement: Exists vs IN

I have three tables that I need to

JOIN
to get values from two columns.
These columns are
GRN_STATUS
and
STATUS
I have written some SQL that achives the desired result but I've been advised that using
IN
is very inefficient and that I should use
EXISTS
instead.

I'm just wondering is this true in my situation? and what would a solution using
EXISTS
instead of
IN
look like?

SQL:

SELECT c.GRN_STATUS, a.STATUS
FROM
TableA a
INNER JOIN
TableB b
ON a.ORD_NO = b.ORD_NO
AND a.COMPANY_ID = b.COMPANY_ID
INNER JOIN
TableC c
ON b.GRN_NO = c.GRN_NO
AND b.COMPANY_ID = c.COMPANY_ID
AND a.STATUS IN ( 'B', 'C', 'D', 'E' )
AND c.GRN_STATUS = 'A';

Answer

In general, it depends on the implementation in the DBMS.

EXISTS mostly stops and returns at the first match so it COULD be more efficient, but it makes no sense when you have a list of constants.

Since SQL is a declarative language, you can't tell the DBMS the how, just the what. You describe the expected result and it is up to the server to try to find the most efficient way to fulfill your request.

The way the DBMS finds the efficient algorithm is based on several things including the amount and the distribution of the data, the actual statistics, the expected resources needed, etc.

So EXISTS may perform better on a huge table, while has no effect on smaller ones (or vica versa).

Your best bet is to actually check the estimated query plans or try them out.

My personal view is to use EXISTS when no data is required and JOIN when data is required. IN is for constant lists.