Lex Lex - 12 days ago 6
MySQL Question

Why would this query run so slow?

I have two MySQL tables say A and B. A contains just one varchar column (lets call that one A1) with about 23000 records. Table B (70000 records) has some more columns, one of the corresponding with A1 from table A (lets call that one B1). I want to know which values in A are not in the corresponding column in B, so I use:

SELECT A1
FROM A
LEFT JOIN B
ON A1 = B1
WHERE B1 IS NULL


Both columns A1 and B1 have indices defined on them. Still this query runs very slow. I've run explain, this is the output:

id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE A index \N PRIMARY 767 \N 23269 Using index
1 SIMPLE B ALL \N \N \N \N 70041 Using where; Not exists


UPDATE:
SHOW CREATE TABLE
for both tables (changed the original names);

CREATE TABLE `A` (
`A1` varchar(255) NOT NULL,
PRIMARY KEY (`A1`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8


CREATE TABLE `B` (
`col1` int(10) unsigned NOT NULL auto_increment,
`col2` datetime NOT NULL,
`col3` datetime default NULL,
`col4` datetime NOT NULL,
`col5` varchar(30) NOT NULL,
`col6` int(10) default NULL,
`col7` int(11) default NULL,
`col8` varchar(20) NOT NULL,
`B1` varchar(255) default NULL,
`col10` tinyint(1) NOT NULL,
`col11` varchar(255) default NULL,
PRIMARY KEY (`col1`),
KEY `NewIndex1` (`B1`)
) ENGINE=MyISAM AUTO_INCREMENT=70764 DEFAULT CHARSET=latin1


'nother edit:
data_length
and
index_length
from
SHOW TABLE STATUS


table data_length index_length
A 465380 435200
B 5177996 1344512

Answer

The character sets of the two columns that you are comparing in an OUTER JOIN differ. I am not sure if this is the cause so I tested and got these results:

SELECT A1
FROM A
LEFT JOIN B ON A1 = B1
WHERE B1 IS NULL

-- Table A..: 23258 rows, collation = utf8_general_ci
-- Table B..: 70041 rows, collation = latin1_swedish_ci
-- Time ....: I CANCELLED THE QUERY AFTER 20 MINUTES

-- Table A..: 23258 rows, collation = latin1_swedish_ci
-- Table B..: 70041 rows, collation = latin1_swedish_ci
-- Time ....: 0.187 sec

-- Table A..: 23258 rows, collation = utf8_general_ci
-- Table B..: 70041 rows, collation = utf8_general_ci
-- Time ....: 0.344 sec

Solution: make the character sets of the two tables (or the two columns atleast) same.