lonewarrior556 lonewarrior556 - 6 months ago 17
SQL Question

Dangers of Changing encoding in mysql

I have an app that collected a sizeable amount of data over the last couple of years, the data is stored in many mysql tables; However all the table were set to

Encoding cp1252 West European (latin1)
Collation: latin1_swedish_ci


There are special chars that do not get displayed properly; I would like to change to

Encoding: UTF-8 Unicode (utf8mb4)
Collation: utf8mb4_general_ci


I know how to alter the tables but, I do have many tables with ~a million rows of data, what are the dangers in changing all the tables' encodings?

Aside from backing up everything, can I do anything to minimize the risk?

Answer

If you don't have any character columns as foreign keys, the conversion is safe, as all characters from the first one are present in the second, but with utf8mb4_general_ci you might have changes in ordering when sorting rows by character columns. If that's a problem for you, try using utf8mb4_unicode_ci instead.

If you have foreign keys, you should disable foreign_key_checks, and only enable it back after converting all tables.

Also, the alter table will probably lock the table and create a copy, so if you're doing this on a production server, be prepared for some downtime.

Comments