Etienne Faucher Etienne Faucher - 3 months ago 12
SQL Question

Order in multiple "on" conditions in INNER Join

My question is : is there any order of execution in

ON
clauses in a typical
INNER JOIN
?

My tables are filled in about a million entries each table, I need to join them easily and I want to optimize my query.

Here's my example:

SELECT *

FROM TableA
INNER JOIN TableB ON

(
[costly validations]
)
AND

TableB.Date1 > '2016-01-01' AND
TableA.Date2 > '2016-01-01' AND
TableB.Date1 = TableA.Date2 AND
TableA.Field1 = TableB.Field1 AND
TableA.Field2 = TableB.Field2 AND
TableA.Field3 = TableB.Field3 AND
TableA.Field4 = TableB.Field18


where the costly validations are:

(
CAST(
SUBSTRING(
Convert(nvarchar(50), TableA.MoneyField1),
2,
(len(Convert(nvarchar(50), TableA.MoneyField1)) - 1)
) as money
) = TableB.MoneyField2
AND
len(
Convert(
nvarchar(50),
CAST(
SUBSTRING(
Convert(nvarchar(50), TableA.MoneyField1),
2,
(len(Convert(nvarchar(50), TableA.MoneyField1)) - 1)
) as money
)
)
) = len(TableB.MoneyField2)
)



Validations ideas : We were removing a dollar sign ($) in our very old applications from some strings received by a third party application before converting the string into money. The third party application removed the $ sign and now we are removing a digit from our numbers (oops). The applications are fixed but our data is corrupted.


Question is: is the order in the
ON
clause has any effect on the performance or optimization of the query? For example, if I put the costly validation at the end of the
ON
clause, is there any changes (I don't see any on my side, I tried).

My assumption is
No
since T-SQL is smarter than me.

so this:

ON
(
[costly validations]
)
AND

TableB.Date1 > '2016-01-01' AND
TableA.Date2 > '2016-01-01' AND
TableB.Date1 = TableA.Date2 AND
TableA.Field1 = TableB.Field1 AND
TableA.Field2 = TableB.Field2 AND
TableA.Field3 = TableB.Field3 AND
TableA.Field4 = TableB.Field18


versus this :

ON
TableB.Date1 > '2016-01-01' AND
TableA.Date2 > '2016-01-01' AND
TableB.Date1 = TableA.Date2 AND
TableA.Field1 = TableB.Field1 AND
TableA.Field2 = TableB.Field2 AND
TableA.Field3 = TableB.Field3 AND
TableA.Field4 = TableB.Field18 AND
(
[costly validations]
)

Answer

There is no guarantee on the order of evaluation. Both within a single ON clause and across multiple ON clauses and the WHERE clause.1

All that is required by the standard is that certain clauses are evaluated in a particular logical order, with database systems free to re-order evaluation provided they produce results consistent with the logical order. SQL goes even further beyond that, in some cases though - leading to the situation that it's very difficult to write code with guaranteed guards (e.g. if you want to do a/b and want to avoid the division by zero error, you have to jump through hoops to ensure you will not ever see that error. You can't just write b!=0 and a/b = <something>)


1This is unlike a lot of other languages in which their boolean AND and OR operators are defined to perform short-circuiting, and only evaluate their right-hand expression if the left-hand expression is insufficient to determine the entire result.