neokio neokio - 3 months ago 5
MySQL Question

How do I improve performance on a MySQL query with NULL?

I have several million records in the following table:

CREATE TABLE `customers` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`store_id` int(10) unsigned DEFAULT NULL,
`first_name` varchar(64) DEFAULT NULL,
`middle_name` varchar(64) DEFAULT NULL,
`last_name` varchar(64) DEFAULT NULL,
`email` varchar(128) DEFAULT NULL,
`phone` varchar(20) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_store` (`store_id`),
KEY `index_store_email` (`store_id`,`email`),
KEY `index_store_phone` (`store_id`,`phone`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;


Query #1 takes ~800ms:

SELECT COUNT(*) FROM `customers` WHERE `store_id` = 1;


Query #2 takes ~1.5ms:

SELECT COUNT(*) FROM `customers` WHERE `store_id` = 1 AND `email` IS NULL;


Query #3 takes a whopping 5 seconds:

SELECT COUNT(*) FROM `customers` WHERE `store_id` = 1 AND `email` IS NOT NULL;


Notes:


  • I've simplified the table to ask the question, but the query is identical.

  • Yes, my table is optimized.

  • Yes, both fields are indexed, see the create syntax above.

  • There are only a few
    store_id
    s, but every record has one.

  • There are very few customers with
    email
    set to
    null
    .



I find a few things strange here:


  1. Query #1 is simplest! There are only a few possible INT values. Shouldn't it be fastest?

  2. Why is Query #3 so slow? I could cut the time in half by doing the other two queries, and subtracting #1 from #2, but I shouldn't have to.



Any thoughts on this seemingly basic question? Feel like I'm missing something simple. Did I sleep through a class in db school?

Answer

At times the MySQL query parser guesses wrong when it decides which indices to use. For cases like these the index hints can be useful.

To force the use of an index:

SELECT * FROM table1 USE INDEX (col1_index,col2_index)
  WHERE col1=1 AND col2=2 AND col3=3;

To force the use of an index including replacing table scans:

SELECT * FROM table1 FORCE INDEX (col1_index,col2_index)
  WHERE col1=1 AND col2=2 AND col3=3;

To ignore a certain index:

SELECT * FROM table1 IGNORE INDEX (col3_index)
  WHERE col1=1 AND col2=2 AND col3=3;
Comments