Dallas Caley Dallas Caley - 2 months ago 6
MySQL Question

MySQL strange behavior with comma separated list of numbers

I had a very complicated problem, but i narrowed it down to this, First, let me give you some test data:

Run this:

CREATE TABLE `test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`value` text NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=1 DEFAULT CHARSET=latin1;

INSERT INTO test (value) VALUES
(1),
('1'),
('1,2'),
('3');


Now run this query:

SELECT * FROM test WHERE value = 1;


I would expect in this case to get only the first two rows, where the value is either entered as a numeric 1 or a '1' char, but for some reason this is what i get:

1, 1
2, 1
3, 1,2


My question is, why do i get the third row?

Note: This is my version of mysql: 5.6.28-0ubuntu0.14.04.1

Also, I already solved my original problem by using FIND_IN_SET and I am aware that it's not a very good idea to have this comma separated list type structure, ie, it should probably have been done with a join table in the first place. Unfortunately I'm working within a system that is very large and making that change is not practical at this time.

I'm just interested in why this specific behavior happens.

Answer

The reason you get the third row is implicit datatype conversion performed by MySQL. Your query has a predicate (condition) in the WHERE clause

  WHERE value = 1

On the right side of the equality comparison operator (the equal sign), we have a numeric literal. On the left side, we have a column that is datatype TEXT.

It's not possible for MySQL to do a comparison of those two different datatypes.

So, MySQL converts one side or the other to a type that is compatible, so a comparison can be performed. In this case, MySQL is converting the value from the column to be numeric, so it compare to the numeric literal.

As a demonstration of what that looks like, we can add a zero (forcing MySQL to do a conversion), and exhibit the results in a SELECT.

 SELECT t.value, t.value + 0 FROM test t

 t.value  t.value + 0
 -------  -----------
 1                  1
 1                  1
 1,2                1
 3                  3

It's documented in the MySQL Reference Manual somewhere, how MySQL does the conversion. At a risk of misstating what the manual says: MySQL reads the string character by character from left to right, until it encounters a character where it can no longer convert to numeric.

In the case if the string '1,2', that happens to be the comma character. That's where MySQL stops. So the conversion returns a numeric value of 1. You would be right to point out that other databases would throw an error attempting to do a conversion of that string to numeric. But MySQL doesn't throw an error or warning.

Reference: Type Conversion in Expression Evaluation http://dev.mysql.com/doc/refman/5.7/en/type-conversion.html

Basically, the predicate in your query is equivalent to specifying:

   WHERE value + 0 = 1

Which forces a conversion of the contents of the column value to numeric, and then a comparison to the numeric literal.

That's why the third row is being returned.

To get a different result, consider comparing to a string literal

   WHERE value = '1'