Arek S Arek S - 28 days ago 10
Python Question

How do I execute inserts and updates in an Alembic upgrade script?

I need to alter data during an Alembic upgrade.

I currently have a 'players' table in a first revision:

def upgrade():
op.create_table('player',
sa.Column('id', sa.Integer(), nullable=False),
sa.Column('name', sa.Unicode(length=200), nullable=False),
sa.Column('position', sa.Unicode(length=200), nullable=True),
sa.Column('team', sa.Unicode(length=100), nullable=True)
sa.PrimaryKeyConstraint('id')
)


I want to introduce a 'teams' table. I've created a second revision:

def upgrade():
op.create_table('teams',
sa.Column('id', sa.Integer(), nullable=False),
sa.Column('name', sa.String(length=80), nullable=False)
)
op.add_column('players', sa.Column('team_id', sa.Integer(), nullable=False))


I would like the second migration to also add the following data:


  1. Populate teams table:

    INSERT INTO teams (name) SELECT DISTINCT team FROM players;

  2. Update players.team_id based on players.team name:

    UPDATE players AS p JOIN teams AS t SET p.team_id = t.id WHERE p.team = t.name;



How do I execute inserts and updates inside the upgrade script?

Answer

What you are asking for is a data migration, as opposed to the schema migration that is most prevalent in the Alembic docs.

This answer assumes you are using declarative (as opposed to class-Mapper-Table or core) to define your models. It should be relatively straightforward to adapt this to the other forms.

Note that Alembic provides some basic data functions: op.bulk_insert() and op.execute(). If the operations are fairly minimal, use those. If the migration requires relationships or other complex interactions, I prefer to use the full power of models and sessions as described below.

The following is an example migration script that sets up some declarative models that will be used to manipulate data in a session. The key points are:

  1. Define the basic models you need, with the columns you'll need. You don't need every column, just the primary key and the ones you'll be using.
  2. Within the upgrade function, use op.get_bind() to get the current connection, and make a session with it.
  3. Use the models and session as you normally would in your application.

"""create teams table

Revision ID: 169ad57156f0
Revises: 29b4c2bfce6d
Create Date: 2014-06-25 09:00:06.784170
"""

revision = '169ad57156f0'
down_revision = '29b4c2bfce6d'

from alembic import op
from flask_sqlalchemy import _SessionSignalEvents
import sqlalchemy as sa
from sqlalchemy import event
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker, Session as BaseSession, relationship

Session = sessionmaker()

event.remove(BaseSession, 'before_commit', _SessionSignalEvents.session_signal_before_commit)
event.remove(BaseSession, 'after_commit', _SessionSignalEvents.session_signal_after_commit)
event.remove(BaseSession, 'after_rollback', _SessionSignalEvents.session_signal_after_rollback)

Base = declarative_base()


class Player(Base):
    __tablename__ = 'players'

    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String, nullable=False)
    team_name = sa.Column('team', sa.String, nullable=False)
    team_id = sa.Column(sa.Integer, sa.ForeignKey('teams.id'), nullable=False)

    team = relationship('Team', backref='players')


class Team(Base):
    __tablename__ = 'teams'

    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String, nullable=False, unique=True)


def upgrade():
    bind = op.get_bind()
    session = Session(bind=bind)

    # create the teams table and the players.team_id column
    Team.__table__.create(bind)
    op.add_column('players', sa.Column('team_id', sa.ForeignKey('teams.id'), nullable=False)

    # create teams for each team name
    teams = {name: Team(name=name) for name in session.query(Player.team).distinct()}
    session.add_all(teams.values())

    # set player team based on team name
    for player in session.query(Player):
        player.team = teams[player.team_name]

    session.commit()

    # don't need team name now that team relationship is set
    op.drop_column('players', 'team')


def downgrade():
    bind = op.get_bind()
    session = Session(bind=bind)

    # re-add the players.team column
    op.add_column('players', sa.Column('team', sa.String, nullable=False)

    # set players.team based on team relationship
    for player in session.query(Player):
        player.team_name = player.team.name

    session.commit()

    op.drop_column('players', 'team_id')
    op.drop_table('teams')

The event.remove lines are to deal with Flask-SQLAlchemy. The extension adds some events to the session, but since the migration happens outside of the app context, these events produce exceptions. If you are not using that extension or are using a version >=2.0 (not released yet), you don't need those lines (or the related imports).

Comments