Memory leak profiling with Rails

One of my long-running problems with Rails (and Ruby in general) is that it’s difficult to debug memory leaks. I’ve had a number of cases where I’ve stuck something into a long-lived array or hash and discovered much later that my Ruby process was eating over 100 MB of RAM. While ps makes it easy to see when Ruby’s using lots of RAM, actually figuring out where it went is a lot harder.

Several people have been working on memory leak debuggers for Rails, and for Typo in general, including Steve Longdo, but I didn’t have a lot of luck actually finding leaks with their tools. I asked the Seattle Ruby Group for help, and Ryan Davis gave me a quick little memory leak spotter that he uses. I made a few additions to it, and it helped me discover that my Typo development tree was leaking 1-3 strings per hit, but it didn’t help me figure out where the leak was happening. After playing with a few options, I settled on dumping all strings to a file once per memory profiler loop, and then I diffed the files, which showed my problem. It took about 15 seconds to discover a bug in my route cache code, and about 30 seconds more to fix it.

I’ll package this up as a Rails plugin eventually, but I thought it might be worth sharing here for now. Just load this code and then call MemoryProfiler.start. By default it logs a record of the 20 classes with the biggest changes over the last 10 seconds; you can change the cycle speed by adding :delay => 20 to the start command, and you can dump all strings to a file on each loop by adding :string_debug => true. Don’t leave string debugging on for too long; it’ll eat a ton of disk space.

class MemoryProfiler
  DEFAULTS = {:delay => 10, :string_debug => false}

  def self.start(opt={})
    opt = DEFAULTS.dup.merge(opt)

    Thread.new do
      prev = Hash.new(0)
      curr = Hash.new(0)
      curr_strings = []
      delta = Hash.new(0)

      file = File.open('log/memory_profiler.log','w')

      loop do
        begin
          GC.start
          curr.clear

          curr_strings = [] if opt[:string_debug]

          ObjectSpace.each_object do |o|
            curr[o.class] += 1 #Marshal.dump(o).size rescue 1
            if opt[:string_debug] and o.class == String
              curr_strings.push o
            end
          end

          if opt[:string_debug]
            File.open("log/memory_profiler_strings.log.#{Time.now.to_i}",'w') do |f|
              curr_strings.sort.each do |s|
                f.puts s
              end
            end
            curr_strings.clear
          end

          delta.clear
          # Include classes seen last cycle too, so ones that vanished show a negative delta.
          (curr.keys + prev.keys).uniq.each do |k|
            delta[k] = curr[k]-prev[k]
          end

          file.puts "Top 20"
          delta.sort_by { |k,v| -v.abs }[0..19].sort_by { |k,v| -v}.each do |k,v|
            file.printf "%+5d: %s (%d)\n", v, k.name, curr[k] unless v == 0
          end
          file.flush

          delta.clear
          prev.clear
          prev.update curr
          GC.start
        rescue Exception => err
          STDERR.puts "** memory_profiler error: #{err}"
        end
        sleep opt[:delay]
      end
    end
  end
end
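
For the string-diffing step, something along these lines is enough. It’s a minimal sketch rather than part of the profiler: new_strings is a name made up for illustration, and it assumes each dump holds one string per line, which breaks down for strings that contain newlines.

```ruby
# Hypothetical helper (not part of MemoryProfiler): compare two string dumps
# and report the strings that occur more often in the newer dump.
def new_strings(old_lines, new_lines)
  old_counts = Hash.new(0)
  old_lines.each { |line| old_counts[line.chomp] += 1 }

  new_counts = Hash.new(0)
  new_lines.each { |line| new_counts[line.chomp] += 1 }

  growth = {}
  new_counts.each do |string, count|
    extra = count - old_counts[string]
    growth[string] = extra if extra > 0
  end

  # Biggest increases first; these are the likeliest leak candidates.
  growth.sort_by { |string, extra| [-extra, string] }
end
```

Feed it File.readlines for two of the log/memory_profiler_strings.log.* dumps, older one first, and it returns [string, extra_copies] pairs.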

As usual, the good bits are Ryan’s, and the bad bits are mine.

Posted by Scott Laird Fri, 18 Aug 2006 04:57:38 GMT


Typo 4.1 begins to stir

I’m not quite done releasing Typo 4.0.x builds yet, but I’ve already started working on code for the next major Typo release. My big goals for 4.1 are speed and cleanliness. Typo still has some cruft buried in it from Rails 0.12.x, and there are a number of subsystems that have been partially refactored and rewritten several times. I want to clean all of that up and make Typo as fast as possible while reducing its memory footprint.

One of the problems that we have in Typo is text filters–our text filtering system includes several components that need to generate URLs that point back to the current blog using url_for, but url_for requires a request object to allow it to find the current base URL. Part of the run-up to Typo 4.0 included the addition of a canonical_server_url configuration field that is auto-populated on blog creation, but we weren’t really using it for anything yet.

Starting with 4.1, we’re going to be cheating and using canonical_server_url to generate most of our URLs. This has a lot of big advantages. First, we can get rid of the whole text-filters-are-controllers problem, because we can use this_blog.url_for to generate URLs. Second, I’ve added permalink_url methods to most models, using Blog.url_for. This has let me deprecate dozens of helpers and remove cruft from all over the tree. The third advantage is that generated URLs will be stable–it doesn’t matter if people go to scottstuff.net or www.scottstuff.net, all of the links will point to http://scottstuff.net either way. Finally, I’m actually caching calls to Blog.url_for–if you ask for the same page more than once during the lifetime of your dispatch process, then the second call should be nearly instant. Stefan Kaes keeps pointing out that routes are one of the slowest parts of Rails; hopefully this will help our performance.
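
The caching idea can be sketched roughly like this. This is not Typo’s actual code: CachedUrlBuilder, its misses counter, and the simplified /controller/action/id layout are all assumptions made for illustration, and real Rails routing is a lot more involved.

```ruby
# Hypothetical stand-in for a cached Blog.url_for; not Typo's actual code.
class CachedUrlBuilder
  attr_reader :misses

  def initialize(canonical_server_url)
    # Drop any trailing slashes so joined URLs never contain "//".
    @base = canonical_server_url.sub(%r{/+\z}, '')
    @cache = {}
    @misses = 0
  end

  # Build the URL on the first request for a set of options, then reuse it.
  def url_for(options)
    @cache.fetch(options) do
      @misses += 1
      @cache[options.dup.freeze] = build_url(options)
    end
  end

  private

  # Simplified "/controller/action/id" layout, assumed for illustration only.
  def build_url(options)
    parts = [options[:controller], options[:action], options[:id]].compact
    ([@base] + parts.map(&:to_s)).join('/')
  end
end
```

Because every URL is built from the one canonical base, the result is the same no matter which hostname the request came in on, which is what makes it safe to cache for the life of the process.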

I’ve also deprecated boatloads of helpers. I think we had 6 or 7 different ways of generating an article permalink. They’re all gone now, replaced by article.permalink_url. I created my own deprecation tool–just add a call to typo_deprecated at the top of each deprecated method and it’ll print a warning the first time it’s called in production or development mode. In test mode, deprecated methods will throw exceptions every time they’re called.
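
A warn-once, raise-in-test helper along those lines might look like the sketch below. It is not the real typo_deprecated: the DeprecationSketch module, its environment and output settings, and the sample Article class are hypothetical names used only to show the behavior.

```ruby
# Hypothetical sketch of a warn-once deprecation helper; not Typo's code.
module DeprecationSketch
  class DeprecatedCall < StandardError; end

  @warned = {}

  class << self
    attr_accessor :environment, :output
  end
  self.environment = 'development'
  self.output = $stderr

  # Call this at the top of a deprecated method.
  def self.deprecated(replacement = nil)
    # Name of the method that called us.
    name = caller_locations(1, 1).first.base_label
    message = "#{name} is deprecated"
    message += "; use #{replacement} instead" if replacement

    # In test mode every call fails loudly.
    raise DeprecatedCall, message if environment == 'test'

    # Elsewhere, warn only the first time each method is hit.
    return if @warned[name]
    @warned[name] = true
    output.puts "DEPRECATION WARNING: #{message}"
  end
end

class Article
  def permalink_url
    'http://example.com/articles/1'
  end

  def old_permalink
    DeprecationSketch.deprecated('permalink_url')
    permalink_url
  end
end
```

Raising in test mode means the test suite flags any internal caller of a deprecated method right away, while production and development logs get a single line per method instead of a flood.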

My current patch touches 117 files and has over 1000 lines of changes. I’ve made a lot of progress in cleaning up Typo, but there’s still a lot of work left to go.

I’m planning on releasing Typo 4.0.3 later this week. Once that’s out, I’m going to create a 4.0.x branch and start adding my new code to the trunk.

Posted by Scott Laird Wed, 16 Aug 2006 05:05:26 GMT


Easy Backups for Rails

I mentioned a few weeks ago that I was trying to add generic database backups to my installer, so I can do database backups before each upgrade. I got backups working a couple releases ago, and they’re useful enough that I’ve extracted them so everyone can use them.

The latest installer release includes a pair of command-line tools that make it easy to use the installer’s backup and restore code without using the rest of the installer. Run rails-backup and it’ll produce a .yml file containing all of your data. Then run rails-restore BACKUPFILENAME and it’ll restore the backup. The dump and restore formats are database-independent, so it should be possible to use these tools to migrate between database engines, but this hasn’t been heavily tested yet.

One warning about the restore–rails-restore doesn’t restore the schema, just the data. So you’ll need a way to build a database with the right schema revision. Look in the .yml file to see which schema version you need (search for schema_info), and then run rake migrate VERSION=xx to move your database to the right schema revision.

To use these, just install the rails-app-installer gem, and they should appear in your path automagically. Report bugs to the rails-app-installer bug tracker.

Posted by Scott Laird Wed, 16 Aug 2006 04:36:14 GMT


Rails Application Installer 0.2.0

I just released version 0.2.0 of my rails-app-installer tool.

This is the installer that I created for Typo, extracted into its own package so other Rails apps can use it as well. The installer lets users install Typo with only two commands:

  $ sudo gem install typo
  $ typo install /some/path

With a little bit of work, you can get your app to be just as easy to install. New in 0.2.0 is a rails-app-installer-setup command that will do most of the work for you. Just cd to your project and run rails-app-installer-setup my-app, and it’ll create a bin/my-app installer for you, along with some config files and a lib/tasks/release.rake file that knows how to build a .gem for you. Just follow the directions that rails-app-installer-setup gives you and you’ll have a .gem in no time at all.

Changes for version 0.2.0:

  • Added a rails-app-installer-setup command to help set up new apps.
  • Made the installer fetch its default version of Rails from the application .gem dependency, instead of making developers repeat themselves.
  • Fixed a restore bug that kept IDs from being restored correctly.
  • Added a command-line backup and restore tool that can be used with any Rails app. I’ll talk about it more in another post.

At this point, it’s nearly complete. It needs better documentation, and I’d like to get a bit of feedback from other apps before I call it 1.0, but my to-do list is getting pretty short.

Posted by Scott Laird Wed, 16 Aug 2006 04:19:28 GMT


How to install gems when you're not root

A number of people have had a hard time installing Typo from the .gem on hosting providers’ systems where they don’t have the ability to install things as root. So, I sat down with Jim Weirich at OSCON to talk over the best way to handle non-root installs. Apparently RubyGems has support for this, plus or minus a few bugs. Try this:

  $ export GEM_PATH=~/gems
  $ gem install -i ~/gems typo
  $ ~/gems/bin/typo install /some/path

RubyGems 0.9.0 will end up re-installing .gems that are already installed in the system gem directory, but that’s not fatal. It’s just a bit slower than it should be.

Hopefully the next RubyGems release will fix this and add a “you aren’t running as root, use -i” warning.

Posted by Scott Laird Fri, 28 Jul 2006 19:50:40 GMT


Rails Application Installer

I’ve mostly finished extracting Typo’s installer into its own Rails project and .gem. The installer makes it trivial to build an installable .gem for any Rails project, so the install process looks like this:

$ gem install my-project
$ my-project install /some/path

The installer source lives in Google Code. That includes a mailing list and bug tracker. I’ll upload my .gem to RubyForge later today, along with some documentation. For now, the rdoc is available.

Here’s what you’ll need to do to add the installer to your existing Rails app:

  1. Create a .gem that depends on rails-app-installer, rails, and all other .gems that you need to have installed.
  2. Add an executable entry to your gemspec. If your app is called my-app, then add executable = ['my-app'].
  3. Finally, create bin/my-app, using one of the examples in the rails-app-installer SVN tree as a model.

Here’s a short example bin/my-app:

  #!/usr/bin/env ruby

  require 'rubygems'
  require 'rails-installer'

  class AppInstaller < RailsInstaller
    application_name 'my_app'
    support_location 'my website'
    rails_version '1.1.4'
  end

  # Installer program
  directory = ARGV[1]

  app = AppInstaller.new(directory)
  app.message_proc = Proc.new do |msg|
    STDERR.puts " #{msg}"
  end
  app.execute_command(*ARGV)

That’s all that’s needed–as long as the installer gem is installed, this will give you a full installer that supports installs, upgrades, db backups and restores, and all of the other things that the Typo installer currently provides. Adding application-specific installer subcommands is easy. Here’s the sweep_cache implementation from Typo’s installer:

  class SweepCache < RailsInstaller::Command
    help "Sweep Typo's cache"

    def self.command(installer, *args)
      installer.sweep_cache
    end
  end

That’s all that’s needed to implement the typo sweep_cache /some/path installer command.

Update: The gem is out. gem install rails-app-installer.

Posted by Scott Laird Fri, 28 Jul 2006 17:53:08 GMT


ActiveRecord to YAML serializer?

For the next version of my installer, I’d love a generic, DB-agnostic way to perform database backups under Rails. Ideally, I’d be able to serialize (and unserialize) an entire DB to YAML. That way the installer could easily perform backups for people before they upgrade, and the backups would be portable between databases. So someone could start out with SQLite and move to PostgreSQL or MySQL.

The problem is that I don’t actually want to have to write this myself, but I’m not having a lot of luck searching for one. It seems like a pretty obvious tool, though, so perhaps I’m just searching wrong. It shouldn’t be very hard to write the serializer, but the hard part will be getting the deserializer right–with Postgres, you’re going to have to play games to get the DB’s internal sequence numbers set right. Other databases will probably have similar (but different) issues.
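
To make the idea concrete, here’s a rough sketch of the round trip. It assumes the tables have already been read into plain hashes (table name mapped to an array of row hashes), so it skips ActiveRecord entirely; dump_tables, restore_tables, and sequence_reset_sql are made-up names. The generated SQL uses PostgreSQL’s setval and pg_get_serial_sequence functions and assumes an integer id column backed by a sequence.

```ruby
require 'yaml'

# Hypothetical helpers; `tables` maps table names to arrays of row hashes.
def dump_tables(tables)
  YAML.dump(tables)
end

def restore_tables(yaml)
  YAML.load(yaml)
end

# After restoring rows with explicit ids, PostgreSQL's sequence still has to
# be moved past the highest id, or the next insert will collide.
def sequence_reset_sql(table, rows)
  max_id = rows.map { |row| row['id'] }.compact.max
  return nil unless max_id
  "SELECT setval(pg_get_serial_sequence('#{table}', 'id'), #{max_id});"
end
```

The serializer half really is the easy part; the sequence fix-up is the sort of per-database special case that the deserializer would need for each engine.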

Any pointers?

Posted by Scott Laird Tue, 25 Jul 2006 14:50:30 GMT


Typo installer

As I mentioned in the Typo 4.0.0 announcement, Typo now includes a .gem-based installer that makes it easy to install Typo. Just install the Typo gem (gem install typo) and run the Typo installer (typo install /some/path) to create a new Typo blog in /some/path. The installer will install all of Typo’s files, create a working set of config files, create a SQLite database for you, and start the Mongrel web server on a random TCP port. It’ll also create a set of sample Apache and Lighttpd configuration files to show you how to tie Typo into your existing web server. One warning: this will only work right if you already have SQLite 3 and SWIG installed on your system. If they’re missing, then you’ll get weird warnings and errors. SWIG is particularly strange–if it’s missing, then you’ll get sporadic test failures when trying to use SQLite.

The same installer can also be used for upgrades–if you’ve installed one of the Typo 3.99.x pre-releases, then you can upgrade the same way you installed Typo in the first place–run gem install typo to grab a newer Typo gem, and then typo install /some/path to upgrade. Typo will recognize the existing install, back up the database, shut down the existing Mongrel server, install new files, upgrade the database, and restart Mongrel.

Once Typo is installed you can test it by connecting directly to Mongrel with your web browser; the installer will display the URL for you. Normally, for production use, you’d configure some sort of proxy or load balancer (like Apache’s mod_proxy) in front of Mongrel, so users talk to Apache and Apache talks to Mongrel. The installer creates a number of example configs in the installer/ directory. One thing to be careful about–you’ll need to make sure that Mongrel and Typo are restarted when your web server reboots. You can start them by running typo start /some/path. You’ll need to talk to your system administrator or hosting provider to learn the best way to start Typo on boot.

Compared to the half-dozen mutually contradictory install guides that existed before, this is a big step forward. However, not everyone wants to (or can) run Typo under Mongrel with SQLite. Some hosting environments make HTTP proxying difficult, while others would rather use a “real” database. So, in the Rails spirit of convention over configuration, I built the installer to use Mongrel and SQLite by default, but you can configure it for your favorite database with a bit of extra work. There are a number of configuration settings that control the installer’s behavior. The typo config /some/path command will show existing variables. You can change them via typo config /some/path var=value.

As of Typo 4.0.0, the installer knows about 6 different configuration variables:

  • web-server: which web server technology Typo will use. It defaults to mongrel. Other options are mongrel_cluster and external. If you want to use FastCGI, then set web-server to external.
  • threads: if web-server is set to mongrel_cluster, then threads controls how many Mongrel back ends are used.
  • port-number: which TCP port Mongrel listens on. This defaults to a random number between 4000 and 5000. The mongrel_cluster server uses one TCP port per thread, starting with port-number and counting up.
  • url-prefix: if Mongrel 0.3.13.4 or higher is installed, then url-prefix can be used to move Typo into a subdirectory. If you want to run Typo on http://www.example.com/blog, then you’ll need to set url-prefix to /blog.
  • bind-address: which IP address Mongrel binds to.
  • database: which database server Typo will use. The default is sqlite. If you change this, then the installer won’t create a SQLite database for Typo or try to back the SQLite database up during upgrades.

So, if you want to use the Typo installer with FastCGI and MySQL, then you’ll want to do this:

  $ typo config /some/path web-server=external database=mysql

You’ll also need to edit database.yml and create your own database. There are schema files in db/schema.*.sql for several different databases. Pick the one that matches your database.

The typo command supports 7 sub-commands:

  • install [version] [config=value ...]. Installs or upgrades Typo. You can optionally specify which version to install, if you have multiple Typo .gems installed. You can also use the installer to install directly out of a Subversion checkout by specifying version cwd and running typo from inside of the Subversion directory.
  • start. Starts the Mongrel or mongrel_cluster webserver. If Mongrel has been disabled via web-server=external, then this command does nothing.
  • run. Just like start, but runs Mongrel in the foreground when possible.
  • stop. Stops Mongrel. Like start, it is ignored if Mongrel has been disabled.
  • restart. Stops and restarts Mongrel.
  • config [name=[value] ...]. Without parameters, it shows Typo’s current configuration. With parameters, it sets the configuration parameters. If you specify name= without a value, then it clears the variable.
  • sweep_cache. Sweeps Typo’s cache. This can be useful for troubleshooting.
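
The name=value handling described for the config sub-command can be sketched like this. It is an illustration of the documented behavior, not the installer’s actual code; apply_config_args is a made-up name.

```ruby
# Hypothetical sketch of the config sub-command's argument handling.
# "name=value" sets a variable; "name=" with no value clears it.
def apply_config_args(config, args)
  updated = config.dup
  args.each do |arg|
    name, value = arg.split('=', 2)
    raise ArgumentError, "expected name=value, got #{arg.inspect}" if value.nil?

    if value.empty?
      updated.delete(name)
    else
      updated[name] = value
    end
  end
  updated
end
```

Splitting with a limit of 2 keeps any later = signs inside the value and leaves an empty string for a bare name=, which is what lets the two cases be told apart.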

That should be all that you need to know to install Typo and keep it running. Any questions?

Posted by Scott Laird Sun, 23 Jul 2006 23:09:56 GMT


Typo 4.0.0

I’d like to announce the release of Typo 4.0.0, the latest version of the most widely-used Ruby-based blogging software. This is the first official release of Typo 4.0, and the product of almost a year’s work by the Typo team. This is a huge upgrade over the previous Typo release, version 2.6.0. You can download it from RubyForge, or you can use the new Typo .gem and installer.

At least a dozen people have contributed new code to this version of Typo, and dozens more have helped report bugs. The core Typo team has had two new additions since the last major release–Kevin Ballard and Piers Cawley. They’ve both made a huge contribution to this release in many ways. They’ve added all of the useful features, while I’ve mostly specialized in adding bugs.

Here’s a partial list of changes since Typo 2.6:

  • A new installer and a Typo .gem file. Run gem install typo and then typo install /some/path to install Typo.

  • Text filter plugins, including easy inline Flickr image support and syntax highlighting for code.

  • Enhanced feed support. Atom 1.0 and RSS 2.0 are both supported. Atom 0.3 has been removed. Both feed types have better UUIDs. There are also per-tag, -category, and -author feeds. Most pages have their own content-specific feeds available via feed autodiscovery.

  • Tags. The ‘keywords’ field in the Typo admin UI (as well as many blog editors) has been commandeered to provide tagging for Typo. Tags are separated by spaces (just like Flickr). If you want to include a space in a tag, then use quotes.

  • Improved spam management. There’s a “Feedback” tab in the admin interface that lists all comments and trackbacks so they can be bulk-deleted. In addition, Typo can now use Akismet for spam filtering.

  • File uploads. You can now upload images and other content directly from the admin UI.

  • Podcast support (experimental).

  • Email and/or Jabber notification of new content, including comments and trackbacks.

  • Support for posting articles with a future posting date. Pre-posted articles don’t appear on the blog or feeds until their posting date passes.

  • A new cache system that automatically times out stale entries. Several types of content, including the Flickr sidebar, will automatically cause the page to be rebuilt every few hours to ensure freshness.

  • Better theme support. Some of this was back-ported to Typo 2.6.0.

  • A redirect table to help users migrating to Typo. You can enter new URLs into the Redirect table and Typo will look there whenever it doesn’t recognize a URL. So you can move from Movable Type-style permalinks to Typo-style permalinks without losing the perma- in your links.

  • Cleaner migrations.

  • Rails 1.1 support. Rails 1.1.4 is strongly recommended. Rails 1.0 won’t work at all.

  • Improved sidebar support, with a cleaner API and more built-in sidebars.

  • Google sitemap support.

  • Gravatar support for comments.

  • Comment previews.

  • Markup help for comments, articles, and pages.

The single most exciting change for me is the new installer. Typo is almost certainly the world’s most widely distributed Rails app, and we’ve found that it’s really hard for people to get all of Typo’s dependencies installed and working the first time. Even worse, our old documentation wasn’t very helpful. I’ve heard from a lot of people who have spent hours getting Typo working, sometimes without success. My personal favorite comment came from a co-worker:

I tried installing typo last night, and the experience was so comically horrible that I was seriously tempted to blog about it, and make the whole world point and laugh at Typo, haw haw haw.

Properly shamed, we built an installer for Typo. If you’re new to Typo, you can install it like this:

  $ gem install typo
  $ typo install /some/directory

This will install Typo in /some/directory, using SQLite and Mongrel by default. As they say, there is no step three. There are a few prerequisites that you’ll have to have on your system before this will work (Ruby, Gems, SQLite 3, and SWIG), but most people should find the installer a lot easier to work with than the traditional installation mechanism. Of course, if you’re happy with your current Typo install, then there’s no need to use the installer–it’s optional. Checking things out from Subversion or downloading .tar or .zip files still works fine. It’s just more work.

I’ll post more details on the installer here soon. I’m planning on extracting it from Typo and bundling it into its own Rubyforge project so other Rails apps can use it. Let me know if you’re interested.

For now, please report Typo bugs on typosphere.org or the Typo mailing list.

Posted by Scott Laird Sun, 23 Jul 2006 04:26:00 GMT


Typo 3.99.4

Typo 3.99.4 is out. Hopefully this will be the final 3.99.x release, and I’ll be able to release 4.0.0 this weekend.

There are a ton of bugfixes in this release, plus a couple new spam-handling features.

First, Typo now includes Akismet support. Akismet is a blog spam filter implemented as a web service. You’ll need to register with them and get an API key before it’ll work. Once it’s enabled it’ll work alongside the existing spam blacklist system. If a new comment or trackback fails the spam test then it will be saved in a queue for moderation.

There’s also a new ‘Feedback’ tab in the admin interface that shows all recent comments and trackbacks and allows you to publish, unpublish, and delete comments.

I’m really happy with 3.99.4. Give it a spin. You should be able to run gem install typo once RubyForge finishes replicating it, or you can download the .gem from my server, or you can download a tarball or zip file from RubyForge.

Posted by Scott Laird Sat, 22 Jul 2006 06:43:12 GMT