Scott Crooks Scott Crooks - 6 months ago 12
SQL Question

MySQL convert to UTF8 without structure change

I have a rather large database that I am trying to convert from charset and collation

latin1/latin1_swedish_ci
to
utf8mb4/utf8mb4_unicode_ci
. I am hoping to setup replication to a slave, run the conversion, and then promote the slave when finished as to avoid downtime.

I noticed that when running the query...

ALTER TABLE `sometable` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;


...MySQL automatically converts
text
to
mediumtext
or
mediumtext
to
longtext
, etc.

Is there a way to turn this feature off? It is nice that MySQL has this feature, but the problem is that it breaks replication because the structure of the tables on the slave is different from master.

Answer

As documented under ALTER TABLE Syntax:

For a column that has a data type of VARCHAR or one of the TEXT types, CONVERT TO CHARACTER SET will change the data type as necessary to ensure that the new column is long enough to store as many characters as the original column. For example, a TEXT column has two length bytes, which store the byte-length of values in the column, up to a maximum of 65,535. For a latin1 TEXT column, each character requires a single byte, so the column can store up to 65,535 characters. If the column is converted to utf8, each character might require up to three bytes, for a maximum possible length of 3 × 65,535 = 196,605 bytes. That length will not fit in a TEXT column's length bytes, so MySQL will convert the data type to MEDIUMTEXT, which is the smallest string type for which the length bytes can record a value of 196,605. Similarly, a VARCHAR column might be converted to MEDIUMTEXT.

To avoid data type changes of the type just described, do not use CONVERT TO CHARACTER SET. Instead, use MODIFY to change individual columns. For example:

ALTER TABLE t MODIFY latin1_text_col TEXT CHARACTER SET utf8;
ALTER TABLE t MODIFY latin1_varchar_col VARCHAR(M) CHARACTER SET utf8;
Comments