Andreas Andreas - 1 month ago 4
MySQL Question

MySQL Performance - string vs integer

I have realized a very strange behavior of MySQL that I have no explanation for.

This is a not overly complex query:

SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = 2
OR enr.Enricher4State = 2
OR enr.Enricher5State = 2
OR enr.Enricher9State = 2
);


The columns Enricher3State, Enricher4State, Enricher5State, Enricher9State do have an index and are of the datatype int(11).

Now I have tried to change these Enricher[x]State to a string:

SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = '2'
OR enr.Enricher4State = '2'
OR enr.Enricher5State = '2'
OR enr.Enricher9State = '2'
);


Every common sense would say that the string variant should perform the same, or slower, because the datatype of the column is integer!

But apparently this is not the case!

Query with integer notation (first one): 7.23048825s

Query with string notation (last one): 5.22188450s

As you can see, there is a huge performance difference, even though the query cost is the same in both cases.

I have absolutely no clue how this difference could happen - and if this means, I should change all queries in my project using the string notation...

I am using MySQL version 5.7.10




According to your comments I have deactivated all services that write or read to the database and repeated the experiment.

A) The integer notation:

SET profiling=0;
SET profiling=1;

SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = 2
OR enr.Enricher4State = 2
OR enr.Enricher5State = 2
OR enr.Enricher9State = 2
);

SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = 2
OR enr.Enricher4State = 2
OR enr.Enricher5State = 2
OR enr.Enricher9State = 2
);

SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = 2
OR enr.Enricher4State = 2
OR enr.Enricher5State = 2
OR enr.Enricher9State = 2
);

SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = 2
OR enr.Enricher4State = 2
OR enr.Enricher5State = 2
OR enr.Enricher9State = 2
);

SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = 2
OR enr.Enricher4State = 2
OR enr.Enricher5State = 2
OR enr.Enricher9State = 2
);

SHOW PROFILES;


Execution time of each query:


  • 6.42429325

  • 5.95059900

  • 6.34392825

  • 6.53041775

  • 6.69593450



B) The string notation:

SET profiling=0;
SET profiling=1;

SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = '2'
OR enr.Enricher4State = '2'
OR enr.Enricher5State = '2'
OR enr.Enricher9State = '2'
);

SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = '2'
OR enr.Enricher4State = '2'
OR enr.Enricher5State = '2'
OR enr.Enricher9State = '2'
);

SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = '2'
OR enr.Enricher4State = '2'
OR enr.Enricher5State = '2'
OR enr.Enricher9State = '2'
);

SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = '2'
OR enr.Enricher4State = '2'
OR enr.Enricher5State = '2'
OR enr.Enricher9State = '2'
);

SELECT
COUNT(*)
FROM
incidents.incidents AS inc,
incidents.enrichment AS enr
WHERE
inc.Id <= 606734
AND inc.Id >= 1
AND inc.Id = enr.ParentTableId
AND (
enr.Enricher3State = '2'
OR enr.Enricher4State = '2'
OR enr.Enricher5State = '2'
OR enr.Enricher9State = '2'
);

SHOW PROFILES;


Execution time:


  • 5.07188875

  • 4.90356250

  • 4.86164300

  • 4.48403375

  • 5.06533725



As you can clearly see, the string notation is still faster!

The same behavior was also detected by other developers of my team, so I could exclude temporary stupidity of myself…

Answer

Since the fields are indexed and you have OR condition and query does have integer constant as the condition, MySQL may spend time on cross index join calculation and then do table scan and with string constant MySQL does not do index considerations and just does table scan.

It is the case when having indices on many fields that are used in OR condition is not advantage but extra work for MySQL.

The OR condition does not grant required indexing of participating fields, quite often having indices on the "1,2,3,4" fields is bad for the table. These fields should be taken out to a separate table.

Added: Run EXPLAIN and if you see indices "1,2,3,4" fields listed for considered keys, that is what MySQL is spending the time on.