David542 - 1 year ago
MySQL Question

Comparing performance of query for year in date

How would the following three queries compare in terms of performance? I'm trying to get all records with year=2017:

Using EXTRACT:

SELECT count(*), completed_by_id FROM table
WHERE EXTRACT(YEAR FROM completed_on)=2017
GROUP BY completed_by_id
# Took 11.8s


Using YEAR:

SELECT count(*), completed_by_id FROM table
WHERE YEAR(completed_on)=2017
GROUP BY completed_by_id
# Took 5.15s


Using LIKE 'YEAR%':

SELECT count(*), completed_by_id FROM table
WHERE completed_on LIKE '2017%'
GROUP BY completed_by_id
# Took 6.61s


Note: In my own testing I found YEAR() to be the fastest, LIKE to be the second fastest, and EXTRACT() to be the slowest.

There are about 5M rows in the table, and completed_on is a DATETIME field that has been indexed.

Answer Source

You haven't described your table or indexes, so all advice about query performance is guesswork.

If your completed_on column is a DATETIME, DATE, or TIMESTAMP type and it is indexed, this query will radically outperform all the ones you have shown, and maintain its performance as your table grows.

SELECT count(*), completed_by_id
  FROM table 
 WHERE completed_on >= '2017-01-01'
   AND completed_on <  '2017-01-01' + INTERVAL 1 YEAR
 GROUP BY completed_by_id

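Why? It can do a range scan on the index rather than evaluating a non-sargable function call on each row's value. Wrapping the column in YEAR() or EXTRACT(), or comparing it as a string with LIKE, hides the raw column value from the index, so every row has to be examined.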
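One way to check this on your own data is to compare the plans with EXPLAIN. This is only a sketch: the table name completions is a hypothetical stand-in for the placeholder `table` in the question, and the plan you actually get depends on your indexes, data distribution, and MySQL version.

```sql
-- Hypothetical table name: completions (stands in for the question's `table`).

-- Function applied to the column: the index on completed_on cannot be
-- used for a range lookup, so expect a full table or full index scan.
EXPLAIN
SELECT COUNT(*), completed_by_id
  FROM completions
 WHERE YEAR(completed_on) = 2017
 GROUP BY completed_by_id;

-- Bare column compared against constants: eligible for a range scan.
EXPLAIN
SELECT COUNT(*), completed_by_id
  FROM completions
 WHERE completed_on >= '2017-01-01'
   AND completed_on <  '2017-01-01' + INTERVAL 1 YEAR
 GROUP BY completed_by_id;
```

With an index on completed_on, the second plan would typically report type = range. Note that the optimizer may still prefer a full scan if the date range covers a large share of the table.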

Notice the use of >= at the beginning of the date range and < at the end. We want to include all rows from the first moment of New Year's Day 2017 up to, but not including, the first moment of New Year's Day 2018. BETWEEN can't do this, because it uses <= rather than < at the end of its range.
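To make the difference concrete, here is a sketch of the two forms side by side, again assuming a hypothetical table named completions:

```sql
-- Hypothetical table name: completions (stands in for the question's `table`).

-- Closed range: BETWEEN is inclusive at both ends, so this also counts
-- rows stamped exactly 2018-01-01 00:00:00.
SELECT COUNT(*)
  FROM completions
 WHERE completed_on BETWEEN '2017-01-01' AND '2018-01-01';

-- Half-open range: stops just before 2018-01-01 00:00:00.
SELECT COUNT(*)
  FROM completions
 WHERE completed_on >= '2017-01-01'
   AND completed_on <  '2018-01-01';
```

Ending the BETWEEN at '2017-12-31 23:59:59' is a common workaround, but it can miss rows in the last second of the year if the column stores fractional seconds. The half-open form avoids that problem.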

If an index is in place, both BETWEEN and the syntax I have shown use a range scan, and perform about the same.

For the best speedup on this query, use a compound index on (completed_on, completed_by_id).
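A minimal sketch of creating that index follows. The table name completions and the index name idx_completed_on_by are hypothetical; substitute your own.

```sql
-- Hypothetical names: table completions, index idx_completed_on_by.
-- completed_on comes first so the date range can be scanned in index order;
-- completed_by_id is included so the query can be answered from the index alone.
CREATE INDEX idx_completed_on_by
    ON completions (completed_on, completed_by_id);
```

Because the index contains both columns the query touches, MySQL can usually satisfy it without reading the table rows at all. EXPLAIN typically shows "Using index" in the Extra column when that happens.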
