Raymon Raymon - 1 month ago 26
MySQL Question

What is the difference between utf8mb4 and utf8 charsets in mysql?


I already know about the ASCII, UTF-8, UTF-16 and UTF-32 encodings,
but I'm curious how the 'utf8mb4' group of encodings differs from the other encoding types defined in MySQL server.

Are there any special benefits or purposes in using utf8mb4 rather than utf8?

Answer

Documentation:

The character set named utf8 uses a maximum of three bytes per character and contains only BMP characters. As of MySQL 5.5.3, the utf8mb4 character set uses a maximum of four bytes per character and supports supplementary characters:

  • For a BMP character, utf8 and utf8mb4 have identical storage characteristics: same code values, same encoding, same length.

  • For a supplementary character, utf8 cannot store the character at all, while utf8mb4 requires four bytes to store it. Since utf8 cannot store the character at all, you do not have any supplementary characters in utf8 columns and you need not worry about converting characters or losing data when upgrading utf8 data from older versions of MySQL.
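The byte counts described above can be checked outside MySQL. The following is a minimal Python sketch, not MySQL code: it only shows how many bytes standard UTF-8 needs per character, which is what decides whether a character fits within utf8's three-byte limit. The helper name `utf8_len` and the sample characters are illustrative assumptions.

```python
def utf8_len(ch: str) -> int:
    """Return the number of bytes standard UTF-8 uses for one character."""
    return len(ch.encode("utf-8"))


# Sample characters (chosen for illustration), written as escapes so the
# script does not depend on the source file's own encoding.
samples = [
    ("A", "U+0041, ASCII"),
    ("\u00e9", "U+00E9, BMP"),
    ("\u20ac", "U+20AC, BMP"),
    ("\U0001F600", "U+1F600, supplementary"),
]

for ch, label in samples:
    print(f"{label}: {utf8_len(ch)} byte(s)")
```

The three BMP samples need one, two and three bytes, so they fit in either character set. The supplementary sample needs four bytes, which only utf8mb4 can store.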

So utf8mb4 exists to store characters that lie outside the Basic Multilingual Plane, that is, code points above U+FFFF such as emoji. See also Comparison of Unicode encodings.
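As a rough illustration of that boundary, an application could check whether a string contains anything outside the BMP before writing it to a utf8 column. This is only a sketch under stated assumptions: `fits_in_mysql_utf8` is a hypothetical helper, not part of MySQL or any connector, and it tests code points only (it does not reject lone surrogates or consult the server's actual column charset).

```python
def fits_in_mysql_utf8(text: str) -> bool:
    """Hypothetical helper: True if every code point is in the BMP.

    BMP code points are U+0000..U+FFFF, the range the three-byte
    utf8 character set can represent.
    """
    return all(ord(ch) <= 0xFFFF for ch in text)


print(fits_in_mysql_utf8("caf\u00e9"))        # BMP characters only
print(fits_in_mysql_utf8("hi \U0001F600"))    # contains a supplementary character
```

A string that fails this check would need a utf8mb4 column to be stored intact.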