Chris Giddings - 1 month ago
Ruby Question

Migrating legacy data to a new model in Rails

I'm getting started with Rails again and am running into a conundrum I find intimidating at the moment. I'm somewhat of a noob when it comes to working with databases, so please forgive me if this is fairly basic.

I have an older Rails app with a data model I no longer wish to conform to. The model should be deprecated in favor of a lighter, less complex one.

The older app is also very monolithic, so I'm trying to break it up into smaller service components.

This leads me to my question: since it's generally frowned upon to use multiple databases from a single model, what would be the best method for translating data stored in the old model to my new model, one service at a time?

For example, let us suppose I have a user model in both the old and the new. In the old model, the user has many columns, not all of which should make it to the new model.

For instance, the old model limits a user to a single address, while the new model should allow a one-to-many relationship: addresses are split off into their own model and simply referenced by a foreign key.

EDIT 1:

The goal ultimately is to siphon the data from the legacy model's database into the new model's database as easily as possible, one dataset at a time.

EDIT 2:

Originally posted from my mobile. Here are a couple of examples which may help with suggestions.

OLD MODEL

create_table "brands", force: :cascade do |t|
  t.string "name"
  t.string "url"
  t.string "logo"
  t.boolean "verified"
  t.datetime "created_at", null: false
  t.datetime "updated_at", null: false
  t.boolean "hidden", default: false
  t.string "facebook_url"
  t.string "twitter_handle"
  t.string "pinterest_handle"
  t.string "google_plus_url"
  t.string "address_street1"
  t.string "address_street2"
  t.string "address_street3"
  t.string "address_city"
  t.string "address_state"
  t.string "address_zip"
  t.string "address_country"
  t.string "email"
  t.string "phone"
  t.string "story_title"
  t.text "story_text"
  t.string "story_photo"
end


NEW MODEL

create_table "companies", force: :cascade do |t|
  t.string "companyName", null: false
  t.string "companyURL", null: false
  t.boolean "companyIsActive", default: true, null: false
  t.boolean "companyDataIsVerified", default: false, null: false
  t.string "companyLogoFileURL"
  t.datetime "companyFoundedOnDate"
  t.integer "companyHQLocationID"
  t.integer "companyParentCompanyID"
  t.integer "companyFirstSuggestedByID"
  t.string "companyFacebookURL"
  t.string "companyGooglePlusURL"
  t.string "companyInstagramURL"
  t.string "companyPinterestURL"
  t.string "companyTwitterURL"
  t.string "companyStoryTitle"
  t.text "companyStoryContent"
  t.string "companyStoryImageFileURL"
  t.boolean "companyIsHiddenFromIndex", default: false, null: false
  t.integer "companyDataScraperID"
  t.datetime "created_at", null: false
  t.datetime "updated_at", null: false
end


So, basically, I want to take data from the old model, say the brands table's "name" column, and siphon its values into the new model, so that they end up in the companies table's "companyName" column in a totally different PostgreSQL instance.
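
As an illustration of the translation described above, the renames can be written down as a lookup table and applied to one row at a time. This is only a sketch in plain Ruby: apart from "name" to "companyName", the pairings are assumed from the column names in the two schemas, and COLUMN_MAP and translate_row are hypothetical names, not part of either app.

```ruby
# Assumed pairing of legacy brands columns to new companies columns.
# Only "name" => "companyName" is stated in the question; the rest are
# guesses based on the column names in the two schemas.
COLUMN_MAP = {
  "name"            => "companyName",
  "url"             => "companyURL",
  "logo"            => "companyLogoFileURL",
  "verified"        => "companyDataIsVerified",
  "hidden"          => "companyIsHiddenFromIndex",
  "facebook_url"    => "companyFacebookURL",
  "google_plus_url" => "companyGooglePlusURL",
  "story_title"     => "companyStoryTitle",
  "story_text"      => "companyStoryContent",
  "story_photo"     => "companyStoryImageFileURL"
}.freeze

# Builds a hash keyed by the new column names from a string-keyed hash of
# old column values. Old columns that are not in the map are dropped.
def translate_row(old_row, column_map = COLUMN_MAP)
  column_map.each_with_object({}) do |(old_name, new_name), out|
    out[new_name] = old_row[old_name] if old_row.key?(old_name)
  end
end
```

Columns such as "email" and "phone" have no obvious counterpart in the new table, so this sketch simply leaves them out.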

Answer

Having done this multiple times, I can tell you that the easiest thing to do is to create a simple rake task that iterates over the old collection and creates the corresponding records in the new one.

There is no need to use anything like DataMapper. You already have ActiveRecord and can simply define which DB connection to use for each model.

In your config/database.yml:

brand_database:
  adapter: postgresql
  host: brand_host
  username: brand_user
  password: brand_pass
  database: brand_db

company_database:
  adapter: postgresql
  host: company_host
  username: company_user
  password: company_pass
  database: company_db
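
Before running anything against two databases, it can help to confirm that both entries are present and complete. The following is a minimal sketch in plain Ruby, assuming the file is plain YAML without ERB or aliases; REQUIRED_KEYS and missing_connection_keys are hypothetical names chosen for illustration.

```ruby
require "yaml"

# Keys each connection entry in the example config is expected to have.
REQUIRED_KEYS = %w[adapter host username password database].freeze

# Returns a hash of connection name => list of missing keys.
# The result is empty when every named entry is complete.
def missing_connection_keys(yaml_text, names)
  config = YAML.safe_load(yaml_text) || {}
  names.each_with_object({}) do |name, problems|
    missing = REQUIRED_KEYS - (config[name] || {}).keys
    problems[name] = missing unless missing.empty?
  end
end
```

For example, missing_connection_keys(File.read("config/database.yml"), %w[brand_database company_database]) should return an empty hash for the config shown above.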

In your models:

class Brand < ActiveRecord::Base
  establish_connection :brand_database
end

class Company < ActiveRecord::Base
  establish_connection :company_database
end

In your new rake task (lib/tasks/db.rake):

# lib/tasks/db.rake
namespace :db do
  desc "Migrate brand records to company records"
  task migrate_brands_to_companies: :environment do
    # find_each loads brands in batches rather than all at once
    Brand.find_each do |brand|
      company = Company.find_or_initialize_by(companyName: brand.name)
      # Skip brands that were already migrated, so the task can be re-run safely
      next if company.persisted?

      puts "\n\tCreating Company record for #{brand.name}"
      company.companyURL               = brand.url
      company.companyLogoFileURL       = brand.logo
      # Only build the URL when a handle exists; otherwise it would be a bare "https://twitter.com/"
      if brand.twitter_handle.present?
        company.companyTwitterURL = "https://twitter.com/#{brand.twitter_handle}"
      end
      company.companyIsHiddenFromIndex = brand.hidden
      company.created_at               = brand.created_at
      company.updated_at               = brand.updated_at
      company.save!
    end
  end
end
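
The task above copies only a handful of columns and leaves the address_* columns behind. If those are meant to move into a separate table referenced by companyHQLocationID, as the question's one-to-many example suggests, the extraction step might look roughly like this. It is a sketch in plain Ruby: the target column names ("street1", "city", and so on) and the helper location_attributes_for are hypothetical, since the new schema does not show a locations table.

```ruby
# Hypothetical mapping from legacy brands.address_* columns to columns on a
# separate locations table. The target names are assumptions.
ADDRESS_COLUMNS = {
  "address_street1" => "street1",
  "address_street2" => "street2",
  "address_street3" => "street3",
  "address_city"    => "city",
  "address_state"   => "state",
  "address_zip"     => "zip",
  "address_country" => "country"
}.freeze

# Returns a hash of location attributes built from a string-keyed brand row,
# skipping nil and blank values. Returns nil when the brand has no address
# data at all, so the caller can avoid creating an empty location.
def location_attributes_for(brand_row)
  attrs = ADDRESS_COLUMNS.each_with_object({}) do |(old_name, new_name), out|
    value = brand_row[old_name]
    out[new_name] = value unless value.nil? || value.to_s.strip.empty?
  end
  attrs.empty? ? nil : attrs
end
```

Inside the task, brand.attributes returns a string-keyed hash that could be passed to this helper; the resulting location would be created first and its id assigned to companyHQLocationID before saving the company.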

Finally, run the rake task:

$ rake db:migrate_brands_to_companies

I need to say this: Rails is built on solid conventions. Failing to adhere to them causes problems and additional expense, every time. I have seen this many, many times: whenever someone deviates from those conventions, they run into far more trouble than they expected and break a lot of the "Rails magic".