I'm dealing with a situation at work where someone set up some tables without a primary key (arghhh). now I'm trying to compare dev data to prod data and the best I can tell is that theres a slight difference in the number of rows.
I figured the best way I could compare data is to do a join on every column, but I'm getting unexpected results.
to test this out I just joined the same table to itself. This table has 1309 rows.. but when I join on every column i get 1014. Thats less rows. If anything i'd expect to get more. What gives?
from `default_tire_classifications` tc1
join `default_tire_classifications` tc2
on tc1.`marketing_tread_name` = tc2.`marketing_tread_name`
AND tc1.`size` = tc2.`size`
AND tc1.product_category = tc2.product_category
AND tc1.application = tc2.application
AND tc1.vehicle_type = tc2.vehicle_type
AND tc1.oem_part = tc2.oem_part
AND tc1.position = tc2.position
AND tc1.size = tc2.size
AND tc1.sect_wdth = tc2.sect_wdth
AND tc1.aspect_ratio = tc2.aspect_ratio
AND tc1.rim_size = tc2.rim_size
AND tc1.speed_rating = tc2.speed_rating
AND tc1.load_index = tc2.load_index
I suspect some of the columns contain NULL values. An equality comparison to a NULL value yields NULL. (SQL tri-valued Boolean logic.)
To do a comparison that yields TRUE when both sides are NULL, you could do something like this
( tc1.col = tc2.col OR ( tc1.col IS NULL AND tc2.col IS NULL ) )
MySQL also provides a non-standard "null-safe" equality comparison operator
<=> (spaceship) that does the same thing.
tc1.col <=> tc2.col
returns either TRUE or FALSE, and will return TRUE when the values on both sides are NULL.
So, replacing the
= (equality comparison) operator with
<=> operator should resolve the problem with comparing NULL values.
(This isn't to say that NULL values is a problem, or is the only problem.)