elplatt elplatt - 1 month ago 15
MySQL Question

Why MySQL not using index for join on single column (no where or group by)?

Here's my query:

mysql> explain
select * from
(select * from ratings where project_id=1) as r
left join article_name_id as a
force index for join (idx_article_name)
using(article_name);


Here's the output of EXPLAIN. Why won't it use the index for the join?

+----+-------------+------------+------+------------------+-------------+---------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+------------------+-------------+---------+------+----------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 1725 | |
| 1 | PRIMARY | a | ALL | idx_article_name | NULL | NULL | NULL | 20441326 | |
| 2 | DERIVED | ratings | ref | idx_project | idx_project | 5 | | 1724 | Using where |
+----+-------------+------------+------+------------------+-------------+---------+------+----------+-------------+


Edit:
Here's an updated query/explain based on suggestions so far.
idx_article_name_id
is an index on
article_name_id (article_name, article_id)
.

mysql> explain
select r.*, a.article_id from
ratings as r
left join article_name_id as a
force index for join (idx_article_name_id)
using (article_name)
where project_id=1;

+----+-------------+-------+------+---------------------+-------------+---------+-------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------------+-------------+---------+-------+----------+-------------+
| 1 | SIMPLE | r | ref | idx_project | idx_project | 5 | const | 1724 | Using where |
| 1 | SIMPLE | a | ALL | idx_article_name_id | NULL | NULL | NULL | 20441326 | |
+----+-------------+-------+------+---------------------+-------------+---------+-------+----------+-------------+


And here's the schema

CREATE TABLE `article_name_id` (
`row_id` int(11) NOT NULL AUTO_INCREMENT,
`article_name` varchar(256) DEFAULT NULL,
`article_id` int(11) DEFAULT NULL,
`from_ts` datetime DEFAULT NULL,
`to_ts` datetime DEFAULT NULL,
PRIMARY KEY (`row_id`),
KEY `idx_article_name` (`article_name`(191)),
KEY `idx_article_name_id` (`article_name`(191),`article_id`)
) ENGINE=InnoDB AUTO_INCREMENT=20268652 DEFAULT CHARSET=utf8mb4

Answer

The most likely explanation is that the optimizer is estimating the cost of the full table scan is less than the cost of using the index.

The FORCE keyword doesn't actually force the optimizer to use an index. It only tells the optimizer that the cost of a full table scan is very expensive.

Assuming that the specified index is not a covering index, the * in the SELECT list means MySQL is going to have to visit the pages in the underlying table to get the values of all the columns. Likely, the optimizer is estimating the number of rows that will be retrieved is a significant percentage of the rows in the table. The cost of using the index would be lower only if the query is retrieving a small subset of the rows. Otherwise, a full scan is going to be more efficient.

I suspect the derived table has an influence on the plan, MySQL doesn't know the distribution of values in the article_name column of that derived table.

If you are attempting to improve performance, adding an index hint is probably not the right solution.

Comments