The Great Typo Memory Leak

A number of users complained this weekend that Typo was using way too much memory, with reports of 100+ MB per FastCGI dispatcher. Typo usually uses around 20 MB, and even that’s too much; 100 MB is enough to cause big problems with hosting providers like TextDrive.

The first step that I took was to verify that the problem actually exists outside of TextDrive. I set up a test Apache/FastCGI/Typo server, disabled caching, and then pounded on it using curl:

# while true; do curl http://typo1/ > /dev/null; done

I let that run for a few seconds and watched while my dispatch.fcgi processes grew from 22 MB to 80 MB. I then did a bit of experimenting:

  • The main index page leaked
  • RSS feeds didn’t leak
  • Individual article pages leaked
  • Static pages, like /pages/about, leaked
  • Even error pages leaked

Disabling the layout for a leaking page and then re-testing it showed that the leak followed the layout. Turning layouts back on and removing the sidebar block fixed the leak.

Entertainingly enough, disabling the sidebar from inside of the sidebar infrastructure didn’t fix the leak. The mere act of calling render_component to generate the sidebars seemed to be causing the memory leak. Since Typo is one of the very few users of Rails components, this suggests that render_component may have a leak that no one else has noticed, so I created a new test Rails app with only two files. First, app/controllers/foo_controller.rb:

class FooController < ApplicationController
  def bar
    render_component :layout => false, 
      :controller => 'sidebars/sidebar', :action => 'index'
  end
end

Then components/sidebars/sidebar_controller.rb:

module Sidebars
  class SidebarController < ApplicationController
    def index
      render :text => 'test', :layout => false
    end
  end
end

This is about as minimal as a Rails app can get. Then I set up a FastCGI server running this project, and ran curl against /foo/bar, and watched the process size climb. So the leak is part of Rails, not really part of Typo.
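For anyone who wants to repeat the measurement rather than eyeball it, here is a rough sketch of a sampler that records a process's resident memory while the curl loop runs. It is not part of Typo or Rails; the helper names (rss_kb, sample_rss, growth_kb) are made up for illustration, and it assumes a Unix-like ps that accepts -o rss= and reports kilobytes.

```ruby
# Sketch only: sample the resident set size (RSS) of a process while a
# load generator (such as the curl loop above) is hitting it.
# Assumption: a Unix-like `ps` that supports `-o rss=` (kilobytes).

# Returns the RSS of +pid+ in kilobytes, or nil if the process is gone.
def rss_kb(pid)
  out = `ps -o rss= -p #{Integer(pid)}`.strip
  out.empty? ? nil : out.to_i
end

# Takes up to +count+ samples, +interval+ seconds apart.
def sample_rss(pid, count, interval)
  samples = []
  count.times do
    kb = rss_kb(pid)
    break if kb.nil?
    samples << kb
    sleep interval
  end
  samples
end

# Difference between the last and first sample (hypothetical helper).
# A steadily positive number under constant load hints at a leak.
def growth_kb(samples)
  return 0 if samples.size < 2
  samples.last - samples.first
end

# Example use, with the dispatcher's PID filled in by hand:
#   samples = sample_rss(12345, 10, 1)
#   puts "grew by #{growth_kb(samples)} KB"
```

Growth on its own doesn't prove a leak, since a process may simply be warming up, but a size that keeps climbing under an unchanging request is a good reason to start bisecting the way this post does.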

Unfortunately, I’m not sure where the leak is coming from. I read component.rb and made a few small changes, but the leak hasn’t stopped. So I’m going to file this as a Rails bug and see if we can get it fixed before 1.0.

Update: Rails bug 2589.

Update: Thanks to Scott Barron, the bug has been fixed. Users with memory problems should probably install the patch, although a bit of testing would obviously be recommended first. The next release of Rails (either 1.0rc3 or 1.0; I’m not sure what they’re planning) should include this fix.

Posted by Scott Laird Mon, 24 Oct 2005 19:30:55 GMT


Rails Schema Generator 0.2.0

I just uploaded version 0.2.0 of my Rails Schema Generator to Rubyforge. This is a minor update, but it was needed to make the schema generator work with Rails 1.0rc2 (AKA 0.14.1).

The schema generator is sort of the flip side of the new schema code in Rails 1.0. It takes a set of Rails migrations, aggregates them all together, and spits out a SQL file that describes the DB that you’d get if you ran all of the migrations. Or, viewed in a more useful light, it gives you a SQL file that you can use to create a new DB from scratch. The current version actually produces three different schema files, one for PostgreSQL, one for MySQL, and one for SQLite, each with DB-appropriate syntax and types.
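To make the idea concrete, here is a toy sketch of the approach: record what the migrations ask for, then print it once per database with a different type table. This is not the generator's actual code and it does not use the Rails migration API; SchemaRecorder and TYPE_MAP are hypothetical names, and the MySQL and SQLite type strings are my assumptions (only the PostgreSQL ones are taken from the generator's published output).

```ruby
# Sketch only: a toy stand-in for "replay the migrations, then emit SQL".
# Abstract column types are mapped to concrete types per database.
TYPE_MAP = {
  :postgresql => { :primary_key => 'serial primary key',
                   :string => 'character varying(255)',
                   :integer => 'integer', :text => 'text' },
  # The MySQL and SQLite entries below are assumptions for illustration.
  :mysql      => { :primary_key => 'int(11) NOT NULL auto_increment PRIMARY KEY',
                   :string => 'varchar(255)',
                   :integer => 'int(11)', :text => 'text' },
  :sqlite     => { :primary_key => 'INTEGER PRIMARY KEY',
                   :string => 'varchar(255)',
                   :integer => 'integer', :text => 'text' }
}

class SchemaRecorder
  def initialize
    @tables = {} # table name => array of [column name, abstract type]
  end

  # Records a new table; the block appends [name, type] pairs.
  def create_table(name)
    columns = [['id', :primary_key]]
    yield columns if block_given?
    @tables[name.to_s] = columns
  end

  # Records a column added by a later migration.
  def add_column(table, column, type)
    @tables.fetch(table.to_s) << [column.to_s, type]
  end

  # Emits CREATE TABLE statements using the type table for +db+.
  def to_sql(db)
    types = TYPE_MAP.fetch(db)
    @tables.map do |name, columns|
      body = columns.map { |col, type| "  #{col} #{types.fetch(type)}" }.join(",\n")
      "CREATE TABLE #{name} (\n#{body}\n);"
    end.join("\n\n")
  end
end

recorder = SchemaRecorder.new
recorder.create_table(:tags) { |cols| cols << ['name', :string] }
recorder.add_column(:tags, :position, :integer) # a later "migration"
puts recorder.to_sql(:postgresql)
```

The point of recording abstract types first is that the migrations only need to be replayed once; each database's file is just a different rendering of the same recorded state.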

This is an outgrowth of Typo; we’re up to 25 migrations now, and we actively support 3 different DBs. It was getting really painful to maintain 3 distinct schema files in addition to the collection of migrations, so I wrote this schema generator. Now we’re back to DRY-land–we create new migrations and let the schema generator do all of the hard work.

Posted by Scott Laird Mon, 24 Oct 2005 17:25:30 GMT


Typo 2.5.6 and Rails 1.0

As far as I can see, Typo 2.5.6 (the most recently released stable version) should work fine with Rails 1.0. I just did a brief round of testing with 1.0rc2, and all of the tests pass. Er, except for one test that had a stupid typo that somehow still worked with Rails 0.13.1; the bug is in the test itself, though, so it’s not worth releasing Typo 2.5.7 just for that. If we ever release Typo 2.5.7, then I’ll make sure that the fixed test is included.

Also, Typo 2.5.6 should work just fine with Ruby 1.8.3, too, as long as you’re using Rails 1.0. I haven’t actually tested this yet, but I’d be surprised if it doesn’t work perfectly.

Surprisingly enough, the current Typo trunk (r683 or so) doesn’t work with Rails 1.0. All of the filtering code is broken; I’ll fix it shortly and check in the fix. Fortunately, the current Typo trunk is pinned to Rails 0.13.1 for now, so it should be safe to upgrade the version of Rails on the box; Typo will just ignore the new Rails for now.

Update: the Typo trunk r685 or later should be compatible with Rails 1.0rc2. I’ll probably break Rails 0.13.1 compatibility soon, so it’s time to upgrade.

Posted by Scott Laird Thu, 20 Oct 2005 01:37:53 GMT


Typo and Ruby 1.8.3

Just for the record, current versions of Typo (either 2.5.6 or the current Subversion trunk) don’t work with Ruby 1.8.3. There are two problems–the Logger bug that keeps Rails 0.13.1 from working with Ruby 1.8.3 (this is easy to fix), and a second bug that I haven’t read about anywhere else–apparently YAML serialization is broken with Ruby 1.8.3 and Rails 0.13.1. This keeps Typo’s sidebar from working properly.

I’m going to see what it’ll take to get the Typo trunk working with Rails 1.0 (rc1 or rc2, if it’s out today), and then see if it works properly with Ruby 1.8.3. Once that’s done, the trunk will probably shift from 0.13.1-only to 1.0-only.

Update: That was quick. ChrisNolan on IRC pointed out that Rails bug #2304 contains a patch to fix this. The patch is already a part of the current Rails trunk, but you’ll need to patch 0.13.1 manually if you want to use it with Ruby 1.8.3.

Posted by Scott Laird Wed, 19 Oct 2005 14:51:48 GMT


Rails caches_action_with_params

One of the big problems with caching in Rails is the way that Rails’s caching system handles query parameters. Page caching completely screws this up–the page cache will turn /articles/read?id=100 into /articles/read.html, and Apache will then hand all future hits on /articles/read off to that static HTML file, even if the user was looking for /articles/read?id=99. You can mostly get around this by making sure that you always use named parameters via Rails’s routes, but even then a malicious user can do weird things to your cache by feeding query parameters via ?.
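The collision is easy to see in a few lines of Ruby. This is only an approximation of what a page cache does when it picks a file name, not the Rails implementation; page_cache_path is a made-up helper and the host name is a placeholder.

```ruby
require 'uri'

# Sketch only: approximates how a page cache derives a static file name.
# The query string plays no part, which is the whole problem.
def page_cache_path(url)
  path = URI.parse(url).path
  path = '/index' if path.empty? || path == '/'
  "#{path}.html"
end

first  = page_cache_path('http://example.com/articles/read?id=100')
second = page_cache_path('http://example.com/articles/read?id=99')

puts first            # the file written for id=100
puts first == second  # both requests map to the same file
```

Whichever request arrives first decides what every later visitor sees, regardless of the id they asked for.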

The action cache is slightly better, but it’ll still misbehave with the examples above. What we really need is a caching system that pays attention to all parameters, not one that ignores all of them that aren’t part of a route.

Towards that end, I’ve created caches_action_with_params. It’s a minor derivative of caches_action with a different fragment cache key; instead of using the URL (as generated by url_for), it ignores URLs completely and uses ACTION_PARAM/<host>/<controller>/<action>/<params>. This way caching isn’t dependent on routing, which will help with some of the stranger problems that Typo has seen. On the downside, if your actions explicitly check the URL that the user used, then caches_action_with_params won’t work for you.
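A minimal sketch of how such a key might be assembled follows. The post doesn't say how the <params> segment is serialized, so the sorted, URL-escaped key=value encoding here is an assumption, as is dropping the controller and action entries from the params hash; action_param_key is a hypothetical name, not the method in Typo.

```ruby
require 'cgi'

# Sketch only: builds ACTION_PARAM/<host>/<controller>/<action>/<params>.
# Assumption: params are serialized as sorted, escaped key=value pairs so
# that the same parameters always produce the same key.
def action_param_key(host, controller, action, params)
  pairs = params.reject { |k, _| %w(controller action).include?(k.to_s) }
  pairs = pairs.map { |k, v| [k.to_s, v.to_s] }.sort
  encoded = pairs.map { |k, v| "#{CGI.escape(k)}=#{CGI.escape(v)}" }.join('&')
  ['ACTION_PARAM', host, controller, action, encoded].join('/')
end

puts action_param_key('example.com', 'articles', 'read', :id => 100)
puts action_param_key('example.com', 'articles', 'read', :id => 99)
```

Sorting matters: without it, two requests carrying the same parameters in a different order would end up as separate cache entries.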

Once I’m off the train and sitting somewhere with usable IP, I’ll post some sample code to the Typo bug tracker and generate a couple of benchmarks. I expect this to be about 10% as fast as the page cache, but it should still be faster than 100 hits/second, which is my personal definition of “fast enough” this week. Then, if no one has any big complaints, I’ll commit this and switch off the page cache.

Once that is done, it’ll be fairly easy to add a lifespan to cached pages, so we can say “this page is only good for 2 hours” and have it regenerate automatically after that.
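The lifespan idea can be sketched as a small in-memory cache that stamps each entry with the time it was stored. This is an illustration under my own assumptions, not the planned Typo code: ExpiringCache is a hypothetical class, and a real version would sit on top of the Rails fragment store rather than a Hash.

```ruby
# Sketch only: a cache whose entries regenerate after +lifespan+ seconds.
# The clock is injectable so that expiry can be exercised without sleeping.
class ExpiringCache
  def initialize(lifespan, clock = lambda { Time.now })
    @lifespan = lifespan
    @clock = clock
    @entries = {}
  end

  # Returns the cached value for +key+, calling the block to (re)generate
  # it when the entry is missing or older than the lifespan.
  def fetch(key)
    now = @clock.call
    entry = @entries[key]
    if entry.nil? || now - entry[:stored_at] >= @lifespan
      entry = { :value => yield, :stored_at => now }
      @entries[key] = entry
    end
    entry[:value]
  end
end

cache = ExpiringCache.new(2 * 60 * 60) # "only good for 2 hours"
page = cache.fetch('/articles/read/id=100') { '<html>rendered page</html>' }
puts page
```

Expiry is checked lazily at read time, so nothing needs to sweep the cache in the background; a stale page simply gets rebuilt by the first request that finds it too old.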

Posted by Scott Laird Tue, 04 Oct 2005 15:54:24 GMT


Rails caching presentation

I gave a short talk on caching with Rails last night at the Seattle.rb meeting. The short version is “the page cache is going to hurt a lot worse than you’d expect,” but anyone who read my previous article on caching should already know that. I did the slides for the talk with S5, which was new to me–I had planned on using Keynote, but it seems to have died in the year and a half since I last had a use for it. S5 worked well enough, although there were some formatting issues that kept popping up as the browser window size changed. By and large it was easy to use, and it’s nice to have an HTML version of the talk that doesn’t look like a nasty afterthought.

About halfway through preparing for the talk, I realized that I really need to add a new action cache option, something like caches_action_with_params, so we can explicitly say how query strings and other parameters affect the cache. Here’s a bit of sample code:

class ArticlesController < ApplicationController
  caches_action :index
  caches_action_with_params :read, :id
  caches_action_with_params :permalink, :year, :month, :day, :title

  def index
    @pages, @articles = paginate(
      :article, 
      :per_page => config[:limit_article_display], 
      :conditions => 'published != 0', 
      :order_by => "created_at DESC")
  end

  def read  
    @article = Article.find(
      params[:id], 
      :conditions => "published != 0", 
      :include => [:categories])    
  end

  def permalink
    @article = Article.find_by_permalink(
      params[:year], 
      params[:month], 
      params[:day], 
      params[:title])
  end

  ...
end

At least as of Rails 0.13.1, calling /articles/read?id=10 will create a cache entry for /articles/read, which is wrong, and then asking for /articles/read?id=20 will return the cached entry for id=10. Yes, the user is supposed to use routes for this, but explicit query params still work, and there are times when you really need to use them. Fortunately, this is really only 20 lines of code, so it shouldn’t be too hard to write.
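One way to read the declarations in the sample above is that only the listed parameters take part in the cache key. The sketch below illustrates that reading; it is not an implementation of caches_action_with_params, and listed_params_key is a hypothetical helper.

```ruby
# Sketch only: build a cache key from a path plus the explicitly listed
# parameters. Anything not listed is ignored, so stray query parameters
# can't create extra cache entries.
def listed_params_key(path, params, listed)
  kept = listed.map { |name| [name.to_s, params[name].to_s] }
  query = kept.map { |k, v| "#{k}=#{v}" }.join('&')
  query.empty? ? path : "#{path}?#{query}"
end

puts listed_params_key('/articles/read', { :id => 10 }, [:id])
puts listed_params_key('/articles/read', { :id => 20, :junk => 'x' }, [:id])
puts listed_params_key('/articles', { :junk => 'x' }, [])
```

With this scheme id=10 and id=20 get separate entries, while an unlisted parameter such as junk changes nothing.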

Posted by Scott Laird Wed, 28 Sep 2005 15:53:57 GMT


RubyConf '05 registration is closed

DHH just announced that RubyConf ’05 is now full and they’re not taking any more registrations. When I went in 2002, there were around 50 people; apparently they drew the line at 195 this year.

I was still debating going, too. Oh well. Next year.

Posted by Scott Laird Fri, 16 Sep 2005 01:47:36 GMT


IMAP on Ruby

I just noticed that there is now a Ruby-based IMAP server available. Ximapd is only at version 0.0.4, which suggests that it isn't exactly production-ready, but I'll be following it as it develops over the next few months. Like the Ruby WebDAV server that I talked about a month ago, the big value in these sorts of servers is the fantastic things that you can do when you integrate them into non-traditional contexts. For example, a workflow system that can give a web view of the documents that it manages, while also acting like a file server and an email server. Changes to files or email messages are interpreted as actions in the workflow system and then the system state changes appropriately. That lets the users manipulate the data in the system in natural ways using familiar tools.

Another idea: an email-support ticket system that only shows internal users the mail associated with open tickets that they own. When the ticket is closed, the mail goes away on its own.

Posted by Scott Laird Tue, 06 Sep 2005 21:25:16 GMT


Rails Schema Generator 0.1.0

I just uploaded the first version of my schema generator to rubyforge. This is a Rails generator that knows how to take a collection of migration scripts and use them to build up a valid SQL schema file.

You should be able to install it via gem install schema_generator, and run it on any Rails project by running ./script/generate schema from the root of the Rails project. The current release (0.1.0) supports MySQL, PostgreSQL, and SQLite. It will auto-generate a schema for each DB in db/schema.DBTYPE.sql every time it runs, prompting you before overwriting existing files.

For this to work, your Rails migrations must describe your complete database schema. Many projects, like Typo, are older than Rails’s migration support, so their migrations don’t start with a clean slate; instead they describe how to migrate from a specific old version of the DB schema to the current version. In this case, either create a 0_initial_schema migration or modify the existing migration #1 to create all of the original tables. I just committed an example to Typo’s Subversion tree; feel free to use it as a starting point.

Here’s an example of a schema generated by the generator. This is for Typo on PostgreSQL, as of migration #14. I had to create a db/migrate/0_initial_schema.rb file, but all of the other migrations were completely untouched.

The schemas for MySQL and SQLite are similar, but use the correct types (like int(11)) and syntax for each DB.

-- This file is autogenerated by the Rail schema generator, using
-- the schema defined in db/migration/*.rb
--
-- Do not edit this file.  Instead, add a new migration using
-- ./script/generate migration <name>, and then run
-- ./script/generate schema

-- tables 

CREATE TABLE articles (
  id serial primary key,
  title character varying(255),
  author character varying(255),
  body text,
  body_html text,
  extended text,
  excerpt text,
  keywords character varying(255),
  allow_comments integer,
  allow_pings integer,
  published integer DEFAULT '1',
  created_at timestamp,
  updated_at timestamp,
  extended_html text,
  guid character varying(255),
  permalink character varying(255),
  user_id integer,
  text_filter_id integer
);

CREATE TABLE articles_categories (
  article_id integer,
  category_id integer,
  is_primary integer
);

CREATE TABLE articles_tags (
  article_id integer,
  tag_id integer
);

CREATE TABLE blacklist_patterns (
  id serial primary key,
  type character varying(255),
  pattern character varying(255)
);

CREATE TABLE categories (
  id serial primary key,
  name character varying(255),
  position integer,
  permalink character varying(255)
);

CREATE TABLE comments (
  id serial primary key,
  article_id integer,
  title character varying(255),
  author character varying(255),
  email character varying(255),
  url character varying(255),
  ip character varying(255),
  body text,
  body_html text,
  created_at timestamp,
  updated_at timestamp
);

CREATE TABLE page_caches (
  id serial primary key,
  name character varying(255)
);

CREATE TABLE pages (
  id serial primary key,
  name character varying(255),
  user_id integer,
  body text,
  body_html text,
  created_at timestamp,
  updated_at timestamp,
  title character varying(255),
  text_filter_id integer
);

CREATE TABLE pings (
  id serial primary key,
  article_id integer,
  url character varying(255),
  created_at timestamp
);

CREATE TABLE resources (
  id serial primary key,
  size integer,
  filename character varying(255),
  mime character varying(255),
  created_at timestamp,
  updated_at timestamp,
  article_id integer
);

CREATE TABLE sessions (
  id serial primary key,
  sessid character varying(255),
  data text,
  created_at timestamp,
  updated_at timestamp
);

CREATE TABLE settings (
  id serial primary key,
  name character varying(255),
  value character varying(255),
  position integer
);

CREATE TABLE sidebars (
  id serial primary key,
  controller character varying(255),
  active_position integer,
  active_config text,
  staged_position integer,
  staged_config text
);

CREATE TABLE tags (
  id serial primary key,
  name character varying(255),
  created_at timestamp,
  updated_at timestamp
);

CREATE TABLE text_filters (
  id serial primary key,
  name character varying(255),
  description character varying(255),
  markup character varying(255),
  filters text,
  params text
);

CREATE TABLE trackbacks (
  id serial primary key,
  article_id integer,
  blog_name character varying(255),
  title character varying(255),
  excerpt character varying(255),
  url character varying(255),
  ip character varying(255),
  created_at timestamp,
  updated_at timestamp
);

CREATE TABLE users (
  id serial primary key,
  login character varying(255),
  password character varying(255),
  email text,
  name text
);


-- indexes 

CREATE  INDEX articles_permalink_index ON articles (permalink);
CREATE  INDEX blacklist_patterns_pattern_index ON blacklist_patterns (pattern);
CREATE  INDEX categories_permalink_index ON categories (permalink);
CREATE  INDEX comments_article_id_index ON comments (article_id);
CREATE  INDEX page_caches_name_index ON page_caches (name);
CREATE  INDEX pings_article_id_index ON pings (article_id);
CREATE  INDEX trackbacks_article_id_index ON trackbacks (article_id);

-- data 

INSERT INTO sidebars ("staged_position", "active_config", "active_position", "controller", "staged_config") VALUES(NULL, NULL, 0, 'category', NULL);
INSERT INTO sidebars ("staged_position", "active_config", "active_position", "controller", "staged_config") VALUES(NULL, NULL, 1, 'static', NULL);
INSERT INTO sidebars ("staged_position", "active_config", "active_position", "controller", "staged_config") VALUES(NULL, NULL, 2, 'xml', NULL);
INSERT INTO text_filters ("name", "filters", "description", "params", "markup") VALUES('none', '--- []', 'None', '--- {}', 'none');
INSERT INTO text_filters ("name", "filters", "description", "params", "markup") VALUES('markdown', '--- []', 'Markdown', '--- {}', 'markdown');
INSERT INTO text_filters ("name", "filters", "description", "params", "markup") VALUES('smartypants', '--- 
- :smartypants', 'SmartyPants', '--- {}', 'none');
INSERT INTO text_filters ("name", "filters", "description", "params", "markup") VALUES('markdown smartypants', '--- 
- :smartypants', 'Markdown with SmartyPants', '--- {}', 'markdown');
INSERT INTO text_filters ("name", "filters", "description", "params", "markup") VALUES('textile', '--- []', 'Textile', '--- {}', 'textile');

-- schema version meta-info 

CREATE TABLE schema_info (
  version integer
);

insert into schema_info (version) values (14);

Posted by Scott Laird Sat, 03 Sep 2005 08:09:00 GMT


Rails schema generation is nearly complete

My Rails Schema Generator is nearly complete. Here’s a sample run:

$ ./script/generate schema
Found 6 migration classes
Starting migration for AddSidebars
Starting migration for AddCacheTable
Starting migration for AddPages
Starting migration for AddPageTitle
Starting migration for AddTags
Starting migration for AddTextfilters
Adding TextFilters table
Migrations complete.
 Tables found:   6
 Indexes found: 1
 Records found:   8
      exists  db
overwrite db/schema.postgresql.sql? [Ynaq] y
       force  db/schema.postgresql.sql
overwrite db/schema.mysql.sql? [Ynaq] y
       force  db/schema.mysql.sql
overwrite db/schema.sqlite.sql? [Ynaq] y
       force  db/schema.sqlite.sql

The migration classes that I’m using are copied straight from Typo without modification. I’ve left out all of the migrations that add features to “legacy” tables–tables like articles–since there isn’t a table definition that I can use. That’s my next project–adding a 0_initial_schema migration for Typo. Once that’s complete, I have a bit of code cleanup and then I’ll release my schema generator code to the world. Hopefully that’ll be later today.

Posted by Scott Laird Fri, 02 Sep 2005 22:56:00 GMT