Migrating in two dimensions

Posted by Scott Laird Mon, 31 Oct 2005 16:32:42 GMT

This seems to be the season for talking about Rails migrations. A lot of people are finally discovering them and finding that they’re very useful for maintaining your database schema over time. I’m a big fan of Rails migrations; we’ve been using them with Typo since the middle of July, when they were all new and shiny. We’re currently up to 24 migrations in the Typo source tree. We’re even using migrations to create our initial database, via my Schema Generator. I haven’t done a formal survey, but I suspect that Typo is the biggest open-source user of migrations, and may actually be the biggest user overall.

The big problem is that we’ve been using migrations wrong the whole time, and we just realized it.

There are probably a dozen bugs in Typo’s bug tracker that boil down to “I fell behind the trunk and now rake migrate throws exceptions and I can’t upgrade anymore.” The problem is that migrations are designed to run against an earlier version of your database, but they use the current version of your code. The first time that this caused problems was with the migration from Typo 2.0 to 2.5–we’d added two new fields to articles. Migration number 7 added the permalink field and a before_save hook to make sure that all saved articles have permalinks. Then migration number 9 added GUIDs and a second before_save hook to fill the guid field. Both migrations did Articles.find(:all).each { |a| a.save } to update each Article and populate the new fields.

This worked great for developers who frequently upgraded. A few days after the GUID migration went in, though, we started getting weird bug reports–users who tried to do both upgrades at the same time found that migration number 7 was dying. What was happening was that migration number 7 added the new permalink field to articles, but when it went to run the save loop both before_save hooks ran, and Typo tried to add a GUID to each article. However, the guid field didn’t exist yet, so the migration threw a bunch of exceptions and died.

This caused a bunch of grumbling on the Typo IRC channel. We threw around a bunch of possible fixes. Our favorite was separating migrations into two parts–a schema change part and a data change part. First we’d run all of the schema changes, and then update all of the data. As a work-around, we added a hack that checked the current schema version and disabled specific before_save filters for older versions.

We managed to keep this little bandaid working until a couple weeks ago, when a huge set of new migrations went it; they renamed the articles table and merged several other tables into the new contents table using STI. And, again, we found that older migrations broke when users tried to upgrade from Typo 2.5.6 to the current dev tree. Unlink the permalink/guid case, this time there was no simple workaround. We couldn’t just add a couple if statements in a filter and make it all go away.

The fundamental problem is that we were using the wrong mental model for migrations. I saw migrations as a one-dimensional thing–a list of steps for migrating old data into the new format. In this view, the migration for going from schema version 6 to schema version 7 is constant–once it’s been written, the only reason to change it is if a bug turns up in the logic for that migration. Otherwise, the migration code should remain unchanged over time.

And that’s the problem–migrations aren’t one-dimensional. They are (and need to be) two dimensional–the schema version is one dimension and the code version is the other. Individual migrations exist to migrate from a specific old schema version to the current version, using the current code. Each migration should change over time to adapt to the changes in the code. So, the right fix for the permalink migration that caused so many problems wasn’t to add a bunch of logic to before_save. Instead, we should have deleted the entire save loop from the migration, and trusted the GUID migration to update both fields. If that wasn’t good enough, then we should have added a new migration at the end to do permalink cleanup after the GUIDs were added.

Once I came to grips with this, the migration changes needed to allow 2.5.x users to upgrade to the current trunk were pretty simple, and took about 5 minutes to write and test.

Or was I the only person in the Rails universe who thought about migrations this way?

Tags , , , ,  | 5 comments

Auto-generating schema from Rails migrations

Posted by Scott Laird Thu, 01 Sep 2005 08:14:00 GMT

One of the things that has really bugged me with recent Typo development is the pain of maintaining 3 different database schema files (PostgreSQL, MySQL, SQLite) along with a set of Rails DB migration scripts. Every time we add a new table, we have to edit 4 different files, even though all of the information that we need is available in the migration file. Unfortunately, without the static schemas, new users would be adrift, so we’re stuck having to hand-modify each of the static schema files. This violates the DRY principle, causes errors, and irritates developers.

So I figured I’d fix it by writing some code that can take a set of Rails DB migrations, fold, spindle, and mutilate Rails itself, and then spit out a database-specific schema file showing all of the tables, indexes, and seed data provided by the migration files. This includes handling cases where a table is added in migration #4, two new fields are added in migration #6, and one field is deleted in migration #9. There are some corner cases that just can’t be handled, mostly relating to seed data that needs to be migrated to be correct with more recent schemas, but I think I can come close enough to make Typo happy, and probably a lot of other open-source Rails projects.

This turned out to be easier then I expected. I’ve put about 4 hours into it so far, and I can take this migration:

# This is db/migrate/4_test4.rb
class Test4 < ActiveRecord::Migration
  def self.up
    create_table :sidebars do |t|
      t.column :controller, :string
      t.column :active_position, :integer
      t.column :active_config, :text
      t.column :staged_position, :integer
      t.column :staged_config, :text
    end

    Sidebar.create(:active_position=>0, :controller=>'category')
    Sidebar.create(:active_position=>1, :controller=>'static')
    Sidebar.create(:active_position=>2, :controller=>'xml')
  end

  def self.down
    drop_table :sidebars
  end
end

And then do this:

$ irb
irb(main):001:0> require 'migrate'
=> true
irb(main):002:0> require 'db/migrate/4_test4' # the code above
=> true
irb(main):003:0> Test4.up
=> ...
irb(main):004:0> puts DBMigrator::Database.dump('postgresql')
CREATE TABLE sidebars (id serial primary key, controller character varying(255), active_position integer, active_config text, staged_position integer, staged_config text) ;
BEGIN;
INSERT INTO sidebars ("staged_position", "active_config", "active_position", "controller", "staged_config") VALUES(NULL, NULL, 0, 'category', NULL);
COMMIT;
BEGIN;
INSERT INTO sidebars ("staged_position", "active_config", "active_position", "controller", "staged_config") VALUES(NULL, NULL, 1, 'static', NULL);
COMMIT;
BEGIN;
INSERT INTO sidebars ("staged_position", "active_config", "active_position", "controller", "staged_config") VALUES(NULL, NULL, 2, 'xml', NULL);
COMMIT;

Postgres works now, at least with the 4 or 5 examples that I’ve swiped from Typo’s migrations. SQLite and MySQL are nearly working; I think I just need to fake out a couple classes each and they’ll be up and running. Once that’s done, I’ll bundle this all up into a Rails generator so people can do this:

$ ./script/generate schema postgresql
      create  db/schema.postgresql.sql
$ ./script/generate schema mysql
      create  db/schema.mysql.sql
$ ./script/generate schema sqlite
      create  db/schema.sqlite.sql

Tags , , , , , ,  | 8 comments

Rails 0.13

Posted by Scott Laird Thu, 07 Jul 2005 03:28:00 GMT

It looks like Rails 0.13 has been released. There’s a lot of nifty-looking stuff in there, but the one that really grabs my eye is the new migration facility. Rails now has the ability to make schema changes when you upgrade your app. So, when version 1.1 of your app comes along, and it needs to add an eye_color field to the user table, you can now handle it entirely within Rails:

class AddEyeColor < ActiveRecord::Migration
  def self.up
    add_column :users, :eye_color, :string
  end

  def self.down
    remove_column :users, :eye_color
  end
end

The nice thing is that this is completely database-independent. It’ll work on both MySQL and PostgreSQL now, and other databases as soon as their Rails drivers are updated.

One possible shortcoming with migrations: I don’t see any obvious way to enforce the correct order of migrations–if version 1.2 adds a pets table, and version 1.3 adds a nickname field to pets, then I don’t see how the upgrade process from 1.1 to 1.3 will know which order to apply the two migration scripts.

Also, the migration code seems to be a nice first start on database-independent schemas. I suspect that a near-future Rails revision will allow developers to specify the schema as a set of migrations, and then the migration mechanism will create all of the tables needed automatically. That’ll be another huge step for Rails usability.

Posted in  | Tags , , ,  | 2 comments