KeithS KeithS - 7 months ago 10
SQL Question

Which is faster - NOT IN or NOT EXISTS?

I have an insert-select statement that needs to only insert rows where a particular identifier of the row does not exist in either of two other tables. Which of the following would be faster?

INSERT INTO Table1 (...)
SELECT (...) FROM Table2 t2
WHERE ...
AND NOT EXISTS (SELECT 'Y' from Table3 t3 where t2.SomeFK = t3.RefToSameFK)
AND NOT EXISTS (SELECT 'Y' from Table4 t4 where t2.SomeFK = t4.RefToSameFK AND ...)


... or...

INSERT INTO Table1 (...)
SELECT (...) FROM Table2 t2
WHERE ...
AND t2.SomeFK NOT IN (SELECT RefToSameFK from Table3)
AND t2.SomeFK NOT IN (SELECT RefToSameFK from Table4 WHERE ...)


... or do they perform about the same? Additionally, is there any other way to structure this query that would be preferable? I generally dislike subqueries as they add another "dimension" to the query that increases runtime by polynomial factors.

Answer

Usually it does not matter if NOT IN is slower / faster than NOT EXISTS, because they are NOT equivalent in presence of NULL. Read:

NOT IN vs NOT EXISTS

In these cases you almost always want NOT EXISTS, because it has the usually expected behaviour.

If they are equivalent, it is likely that your database already has figured that out and will generate the same execution plan for both.

In the few cases where both options are aquivalent and your database is not able to figure that out, it is better to analyze both execution plans and choose the best options for your specific case.