SST SST - 7 months ago 14
SQL Question

How to filter a column with Non-accented characters using select query

I have a MySQL table (test) with utf-8 charset encoding. There are three entries, two entries with normal characters and another name with accent characters.

CREATE TABLE test (
id Integer,
name VARCHAR(50),
PRIMARY KEY (id)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;

INSERT INTO `test` (`id`, `name`) VALUES (1, 'aaaa');
INSERT INTO `test` (`id`, `name`) VALUES (2, 'AAAA');
INSERT INTO `test` (`id`, `name`) VALUES (3, 'áááá');


If I run the following select query, it returns all the 3 entries

Actual Result:-

select * from test where name like '%aa%';

id | name
----|----
1 | aaaa
2 | AAAA
3 | áááá


Instead of that, it should be return last entry with id=3.

I don't want to use 'BINARY' OR 'COLLATE utf8_bin' because it returns only case sensitive search.

I need normal search with string like query, e.g:-

Expected Result:-

select * from test where name like '%aa%';

id | name
---|-----
1 | aaaa
2 | AAAA

Answer

The utf8_bin collation is what you need for your requirement to handle accents

I don't want to use 'BINARY' OR 'COLLATE utf8_bin' because it returns only case sensitive search.

This is easier (and more performant) to solve with utf8_bin than solving the accent issue with another collation

SELECT * FROM test WHERE LOWER(name) like '%aa%' COLLATE utf8_bin

-> added after comments

The query above assumes that the query parameters are minuscule but if you cant modify the params to always be minuscules then you can also use this variation

SELECT * FROM test WHERE LOWER(name) like LOWER('%ÚÙ%') COLLATE utf8_bin