Dave Jarvis Dave Jarvis - 4 months ago 9
SQL Question

Migrate from MySQL to PostgreSQL on Linux (Kubuntu)

A long time ago on a system far, far away...



Trying to migrate a database from MySQL to PostgreSQL. All the documentation I have read covers, in great detail, how to migrate the structure. I have found very little documentation on migrating the data. The schema has 13 tables (which have been migrated successfully) and 9 GB of data.

MySQL version: 5.1.x

PostgreSQL version: 8.4.x

I want to use the R programming language to analyze the data using SQL select statements; PostgreSQL has PL/R, but MySQL has nothing (as far as I can tell).

A New Hope



Create the database location (
/var
has insufficient space; also dislike having the PostgreSQL version number everywhere -- upgrading would break scripts!):


  1. sudo mkdir -p /home/postgres/main

  2. sudo cp -Rp /var/lib/postgresql/8.4/main /home/postgres

  3. sudo chown -R postgres.postgres /home/postgres

  4. sudo chmod -R 700 /home/postgres

  5. sudo usermod -d /home/postgres/ postgres



All good to here. Next, restart the server and configure the database using these installation instructions:


  1. sudo apt-get install postgresql pgadmin3

  2. sudo /etc/init.d/postgresql-8.4 stop

  3. sudo vi /etc/postgresql/8.4/main/postgresql.conf

  4. Change
    data_directory
    to
    /home/postgres/main

  5. sudo /etc/init.d/postgresql-8.4 start

  6. sudo -u postgres psql postgres

  7. \password postgres

  8. sudo -u postgres createdb climate

  9. pgadmin3



Use
pgadmin3
to configure the database and create a schema.

The episode continues in a remote shell known as
bash
, with both databases running, and the installation of a set of tools with a rather unusual logo: SQL Fairy.


  1. perl Makefile.PL

  2. sudo make install

  3. sudo apt-get install perl-doc
    (strangely, it is not called
    perldoc
    )

  4. perldoc SQL::Translator::Manual



Extract a PostgreSQL-friendly DDL and all the
MySQL
data:


  1. sqlt -f DBI --dsn dbi:mysql:climate --db-user user --db-password password -t PostgreSQL > climate-pg-ddl.sql

  2. Edit
    climate-pg-ddl.sql
    and convert the identifiers to lowercase, and insert the schema reference (using VIM):

    • :%s/"\([A-Z_]*\)"/\L\1/g

    • :%s/ TABLE / TABLE climate./g

    • :%s/ on / on climate./g


  3. mysqldump --skip-add-locks --complete-insert --no-create-db --no-create-info --quick --result-file="climate-my.sql" --databases climate --skip-comments -u root -p



It might be worthwhile to simply rename the tables and columns in MySQL to lowercase:


  1. select concat( 'RENAME TABLE climate.', TABLE_NAME, ' to climate.', lower(TABLE_NAME), ';' ) from INFORMATION_SCHEMA.TABLES where TABLE_SCHEMA='climate';

  2. Execute the commands from the previous step.

  3. There is probably a way to do the same for columns; I changed them manually because it was faster than figuring out how to write the query.



The Database Strikes Back



Recreate the structure in PostgreSQL as follows:


  1. pgadmin3
    (switch to it)

  2. Click the Execute arbitrary SQL queries icon

  3. Open
    climate-pg-ddl.sql

  4. Search for
    TABLE "
    replace with
    TABLE climate."
    (insert the schema name
    climate
    )

  5. Search for
    on "
    replace with
    on climate."
    (insert the schema name
    climate
    )

  6. Press
    F5
    to execute



This results in:

Query returned successfully with no result in 122 ms.


Replies of the Jedi



At this point I am stumped.


  • Where do I go from here (what are the steps) to convert
    climate-my.sql
    to
    climate-pg.sql
    so that they can be executed against PostgreSQL?

  • How to I make sure the indexes are copied over correctly (to maintain referential integrity; I don't have constraints at the moment to ease the transition)?

  • How do I ensure that adding new rows in PostgreSQL will start enumerating from the index of the last row inserted (and not conflict with an existing primary key from the sequence)?

  • How do you ensure the schema name comes through when transforming the data from MySQL to PostgreSQL inserts?



Resources



A fair bit of information was needed to get this far:



Thank you!

Answer

What I usually do for such migrations is two-fold:

  • Extract the whole database definition from MySQL and adapt it to PostgreSQL syntax.
  • Go over the database definition and transform it to take advantage of functionality in PostgreSQL that doesn't exist in MySQL.

Then do the conversion, and write a program in whatever language you are most comfortable with that accomplishes the following:

  • Reads the data from the MySQL database.
  • Performs whatever transformation is necessary on the data to be stored in the PostgreSQL database.
  • Saves the now-transformed data in the PostgreSQL database.

Redesign the tables for PostgreSQL to take advantage of its features.

If you just do something like use a sed script to convert the SQL dump from one format to the next, all you are doing is putting a MySQL database in a PostgreSQL server. You can do that, and there will still be some benefit from doing so, but if you're going to migrate, migrate fully.

It will involve a little bit more up-front time spent, but I have yet to come across a situation where it isn't worth it.

Comments