Martin AJ Martin AJ - 1 month ago 19
MySQL Question

The order of columns in composite indexes

I have a table like this:

// posts
+----+-------------+-----------------------------+
| id | title | body |
+----+-------------+-----------------------------+
| 1 | First Post | The content of first post |
| 2 | Second Post | The content of second post |
+----+-------------+-----------------------------+


Now I need to search into both
title
and
body
columns. Noted that I want to search into those two columns as full-text.

My question: Do I need to make either one composite index on those two columns or two single indexes on them separately?

In other word, which one?


  • index:
    tb(title,body)
    | query:
    WHERE MATCH(title,body) AGAINST(?,?)

  • index:
    t(title), b(body)
    | query:
    WHERE MATCH(title) AGAINST(?) OR MATCH(body) AGAINST(?)



Honestly I cannot understand the different of these ^.




Also yeah I know, the title of my question isn't matched with the content.

Answer

Your two indexing options will give you (except for very rare cases involving myisam tables) the same rows, but with some slight differences:

  • the combined fulltext index will be at least 50% faster, because it will have to only look up one index and doesn't have to combine the two results
  • the combined fulltext index will not allow you to search only in one of the two columns, if you want to do so in a different query. Though you can of course create all three indexes to keep that possibility open.
  • the resulting order will be different: the natural language search mode (which is the default mode) will, unless you specify a different order in your query, automatically order by relevance (which in your second case has no direct meaning). In most cases, it will not make sense to do such a search without ordering by the relevance, so you will have to specify an order, e.g. order by (MATCH(title) AGAINST(?) + MATCH(body) AGAINST(?)). This will slow your query down a bit more.
  • because relevance is calculated/normalized with respect to the content of all other rows, and the content per index is different for the two cases, order by (MATCH(title) AGAINST(?) + MATCH(body) AGAINST(?)) will yield a slightly different order than order by MATCH(title, body) AGAINST(?,?) (which is the default order in natural mode) would. But to emphasize again, you will get the same rows, just in a slightly different order.
  • if you use one index, you cannot make the first column more important than your second one, they are treated as one column. If you have seperate indexes, you can calculate a personal relevance, e.g. order by (MATCH(title) AGAINST(?) + MATCH(body) AGAINST(?) * 0.5) to make a result in the topic more important than a result in the body

So, to summarize:

  • one combined fulltext index: faster, but no control over relative weights of the columns. If you don't need the weights, use this.
  • two fulltext indexes: slower, but control over column weights (including the possibility to mess it up if used incorrectly). If you need the weights, use this.