Auto-generating schema from Rails migrations

One of the things that has really bugged me with recent Typo development is the pain of maintaining 3 different database schema files (PostgreSQL, MySQL, SQLite) along with a set of Rails DB migration scripts. Every time we add a new table, we have to edit 4 different files, even though all of the information that we need is available in the migration file. Unfortunately, without the static schemas, new users would be adrift, so we’re stuck having to hand-modify each of the static schema files. This violates the DRY principle, causes errors, and irritates developers.

So I figured I’d fix it by writing some code that can take a set of Rails DB migrations, fold, spindle, and mutilate Rails itself, and then spit out a database-specific schema file showing all of the tables, indexes, and seed data provided by the migration files. This includes handling cases where a table is added in migration #4, two new fields are added in migration #6, and one field is deleted in migration #9. There are some corner cases that just can’t be handled, mostly relating to seed data that needs to be migrated to be correct with more recent schemas, but I think I can come close enough to make Typo happy, and probably a lot of other open-source Rails projects.

This turned out to be easier then I expected. I’ve put about 4 hours into it so far, and I can take this migration:

# This is db/migrate/4_test4.rb
class Test4 < ActiveRecord::Migration
  def self.up
    create_table :sidebars do |t|
      t.column :controller, :string
      t.column :active_position, :integer
      t.column :active_config, :text
      t.column :staged_position, :integer
      t.column :staged_config, :text
    end

    Sidebar.create(:active_position=>0, :controller=>'category')
    Sidebar.create(:active_position=>1, :controller=>'static')
    Sidebar.create(:active_position=>2, :controller=>'xml')
  end

  def self.down
    drop_table :sidebars
  end
end

And then do this:

$ irb
irb(main):001:0> require 'migrate'
=> true
irb(main):002:0> require 'db/migrate/4_test4' # the code above
=> true
irb(main):003:0> Test4.up
=> ...
irb(main):004:0> puts DBMigrator::Database.dump('postgresql')
CREATE TABLE sidebars (id serial primary key, controller character varying(255), active_position integer, active_config text, staged_position integer, staged_config text) ;
BEGIN;
INSERT INTO sidebars ("staged_position", "active_config", "active_position", "controller", "staged_config") VALUES(NULL, NULL, 0, 'category', NULL);
COMMIT;
BEGIN;
INSERT INTO sidebars ("staged_position", "active_config", "active_position", "controller", "staged_config") VALUES(NULL, NULL, 1, 'static', NULL);
COMMIT;
BEGIN;
INSERT INTO sidebars ("staged_position", "active_config", "active_position", "controller", "staged_config") VALUES(NULL, NULL, 2, 'xml', NULL);
COMMIT;

Postgres works now, at least with the 4 or 5 examples that I’ve swiped from Typo’s migrations. SQLite and MySQL are nearly working; I think I just need to fake out a couple classes each and they’ll be up and running. Once that’s done, I’ll bundle this all up into a Rails generator so people can do this:

$ ./script/generate schema postgresql
      create  db/schema.postgresql.sql
$ ./script/generate schema mysql
      create  db/schema.mysql.sql
$ ./script/generate schema sqlite
      create  db/schema.sqlite.sql

Posted by Scott Laird Thu, 01 Sep 2005 08:14:00 GMT


Comments

  1. Bob Aman about 10 hours later:

    Very nice. This is a great idea, not just for Typo, but for a lot of different things. I’ll probably try and incorporate something like this into FeedTools and TagTools.

  2. Nathaniel Talbott about 5 hours later:

    As I read this, it made me wonder: why have a schema at all? Why not just have migrations do the whole kit-n-caboodle, so they’ll take new users from no version to the current version, or upgrading users from their old version to the current version. The main concern I have with your current path is the corner cases and the fact that they may multiply over time.

    Just a thought.

  3. Scott Laird about 6 hours later:

    Working without a schema and letting the DB generate everything from migrations will work, but I really like having a single file that I can look at that will tell me about each of the classes that ActiveRecord builds for me–I’d rather look at a file then drop into the DB and ask it. And trying to reconstruct the current state of a table by looking through a couple dozen migrations really sucks.

    Now, since my schema generator won’t work unless your migrations already describe how to migrate from an empty DB to a fully populated one, I guess this ends up being sort of moot–you can either tell users to run ‘rake migrate’ to build the DB or use the packaged schema file.

    All except for Sqlite, that is–migrations don’t work right with SQLite and Rails 0.13.1, while my code seems to be able to generate working sqlite schemas just fine.

  4. topfunky about 7 hours later:

    This is really useful. When testing with a team, it would be really useful to be able to generate this so both a complete schema and incremental changes can be generated.

  5. Bob Aman about 7 hours later:

    Yeah. Originally, when I wrote my FeedTools library, I tried to avoid having db files because it was packaged as a gem and I didn’t want the user to have to worry about it, but I’m wondering now if that wasn’t a mistake.

  6. PhilThompson 3 months later:

    If you have a table whose data is unlikely to change (for example a states table where the data might be stopped,started, on hold,completed etc) what’s the best way to populate this table? I would say it makes sense to put it in with the table creation in the migration file but I’m unsure of the syntax.

  7. PhilThompson 3 months later:

    Sorry being a numpty. If I’d read this article properly I’d have seen how to do it.

  8. PhilThompson 3 months later:

    Ok, now I’ve hit an interesting problem (sorry if this is a little off-topic but as I’ve started…:). I want to test that the base data that got inserted in the migration is there but it seems all data is removed in the test environment. Well I knew this but I was hoping migrations would get around that but now I mention it I can see why not. This is for manipulating production data and the test only clone the dev schema and not the data. It poses an interesting (slighty) problem. It would be nice to run some tests on a production database to ensure the migration went as planned (obviously without removing the data).