Time-limited caching for Rails

Posted by Scott Laird Fri, 20 Jan 2006 17:25:02 GMT

I’m finally getting back into Typo hacking after too long away. I tried to apply a few patches last weekend, but I was traveling and my network access was too spotty. So, I spent my time adding a bit of new functionality. Since then, I’ve been debating whether to commit it to Typo or not. I decided I’d write about it here and see which way the comments go.

The code in question is a time-limited cache for Rails. I’d like to be able to say “cache this page, but only for three hours. After three hours, re-render the page.” This sort of thing comes up in Typo all the time. The most obvious example is the sidebar–some of the sidebar components display information with a short lifetime, and it’d be dumb to keep pages in the cache for weeks when they include sidebar data that’s only good for hours. This isn’t really a problem on busy sites, because the current cache sweeper usually resorts to sweeping the entire cache every time a new article is posted, but it’s a pain on slower sites.

There are certainly other ways to fix the sidebar problem (AJAX sidebars are the obvious example), but the same basic pattern comes up all over Typo. A few examples:

  1. Users keep requesting the ability to create articles with a publication date in the future. The article won’t appear on the site until after the publication date. This is a common CMS feature, and apparently other blog engines have it as well, but it really doesn’t mesh with Typo’s current cache, because there’s no way to say “sweep the cache at 7:30 today” short of adding a cron job for every article that’s posted this way.

  2. We have a bunch of aggregation classes that suck data off of other sites, like Flickr, Upcoming.org, and so on. These usually end up as sidebars, but we need to cache the back-end data somewhere. An expiring fragment cache would work perfectly for this.

  3. On really busy sites, we could use something like this to avoid rebuilding comment pages on every comment–we could drop the sweep-on-new-comment code and swap for expire-after-5-minutes. If you’re getting more than 1 comment every 5 minutes, this would be a win. If you’re getting a comment every few seconds (think Slashdot or Curt Hibbs’s “hammer my comments” post), this would be a major win.
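
The trade-off in the third example comes down to rough arithmetic. The sketch below is only an illustration, not Typo code: the method names are made up, and it assumes the worst case where the page is requested at least once after every invalidation.

```ruby
# Worst-case page rebuilds per hour under each strategy, assuming the page
# is requested at least once after every invalidation.
# Both method names are hypothetical, for illustration only.

# Sweep-on-new-comment: every new comment throws the cached page away.
def rebuilds_with_sweep(comments_per_hour)
  comments_per_hour
end

# Expire-after-N-seconds: at most one rebuild per lifetime window.
def rebuilds_with_expiry(lifetime_seconds)
  (3600.0 / lifetime_seconds).ceil
end

[6, 12, 60, 1200].each do |comments|
  sweep  = rebuilds_with_sweep(comments)
  expiry = rebuilds_with_expiry(300)
  puts "#{comments} comments/hour: sweep=#{sweep} rebuilds, 5-minute expiry=#{expiry} rebuilds"
end
```

With a 5-minute lifetime the break-even point is 12 comments per hour, which matches the "more than 1 comment every 5 minutes" rule of thumb; below that rate, sweeping on each comment rebuilds less often.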

To accomplish this, I added two new features. First, I added a set of “meta-fragment cache” methods, building on Rails’ existing fragment cache. The fragment cache stores (key, value) pairs, while the meta-fragment code stores (key, value, metadata_hash) triples. This is simply implemented as two fragment cache entries, one for the data and one for the serialized metadata hash.
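
The two-entries-per-key idea can be sketched in plain Ruby. This is not the actual Typo implementation: `FragmentStore` is a hypothetical in-memory stand-in for Rails’ fragment cache, the `/meta` key suffix is an assumed naming convention, and JSON is just one possible serialization for the metadata hash.

```ruby
require "json"

# Hypothetical in-memory stand-in for the fragment cache: key => string.
class FragmentStore
  def initialize
    @data = {}
  end

  def write(key, value)
    @data[key] = value
  end

  def read(key)
    @data[key]
  end

  def delete(key)
    @data.delete(key)
  end
end

# Stores (key, value, metadata_hash) triples as two ordinary entries:
# one for the data and one for the serialized metadata hash.
class MetaFragmentStore
  META_SUFFIX = "/meta" # assumed naming convention for the metadata entry

  def initialize(store)
    @store = store
  end

  def write(key, value, meta = {})
    @store.write(key, value)
    @store.write(key + META_SUFFIX, JSON.generate(meta))
  end

  # Returns [value, metadata_hash], or nil if either entry is missing.
  def read(key)
    value = @store.read(key)
    raw_meta = @store.read(key + META_SUFFIX)
    return nil if value.nil? || raw_meta.nil?

    [value, JSON.parse(raw_meta)]
  end

  # Removes both the data entry and its metadata entry.
  def expire(key)
    @store.delete(key)
    @store.delete(key + META_SUFFIX)
  end
end

cache = MetaFragmentStore.new(FragmentStore.new)
cache.write("articles/read/1", "<html>...</html>", "lifetime" => 3600)
value, meta = cache.read("articles/read/1")
```

Keeping the metadata in its own entry means the underlying store only ever sees plain (key, value) pairs, so nothing about the store itself needs to change.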

Then, on top of that, I re-implemented my caches_action_with_params code. This is a variant of Rails’ native action cache with a number of cleanups and bugfixes.

When all is said and done, you’re left with a controller that looks something like this:

class ArticleController < ApplicationController
  caches_action_with_params :read

  def read
    response.lifetime = 3600 # 1 hour
    ...
  end
end

That’s it–the read action will now be cached with a 1 hour lifespan. After an hour, the cached version will expire. If response.lifetime isn’t set, then the page won’t expire on its own, and it’ll need to be swept as usual.
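
The expiry rule described above can be sketched without Rails. Everything here is an assumption for illustration: `TimedActionCache` and its methods are hypothetical names, the clock is injected so the example can move time forward by hand, and the real code stores its data in the fragment cache rather than a Ruby hash.

```ruby
# Cache entries carry an optional expiry time. Entries cached without a
# lifetime never expire on their own and must be swept explicitly.
class TimedActionCache
  Entry = Struct.new(:body, :expires_at)

  # The clock is injectable so the example below can control time.
  def initialize(clock = -> { Time.now })
    @clock = clock
    @entries = {}
  end

  # Returns the cached body if it is still fresh. Otherwise calls the block,
  # which must return [body, lifetime_in_seconds_or_nil], and caches the result.
  def fetch(key)
    entry = @entries[key]
    return entry.body if entry && fresh?(entry)

    body, lifetime = yield
    expires_at = lifetime ? @clock.call + lifetime : nil
    @entries[key] = Entry.new(body, expires_at)
    body
  end

  # Manual sweep, still needed for entries that have no lifetime.
  def sweep(key)
    @entries.delete(key)
  end

  private

  def fresh?(entry)
    entry.expires_at.nil? || @clock.call < entry.expires_at
  end
end

now = Time.at(0)
cache = TimedActionCache.new(-> { now })
renders = 0
render = -> { renders += 1; ["page v#{renders}", 3600] }

first = cache.fetch("article/read/1", &render)  # nothing cached yet: rendered
now += 1800
second = cache.fetch("article/read/1", &render) # 30 minutes in: served from cache
now += 1800
third = cache.fetch("article/read/1", &render)  # 1 hour in: expired, rendered again
```

The freshness check happens at read time, so nothing has to run in the background to expire pages; a stale entry is simply replaced on the next request.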

So here’s the big question–should this go into Typo? I can see good arguments on each side.

Pro:

  • It solves a lot of cache-with-parameter problems that we’ve had.
  • Switching to some variant of the action cache means that switching between production and development mode doesn’t leave cache problems behind. This is a major cause of bug reports from new users.
  • It’ll let us implement future posting easily.
  • It’ll make it easy for sidebars to stay current.
  • It’ll let us move the aggregation backends behind the sidebars to a more reasonable architecture. For example, we’ll be able to use the Flickr class for the Flickr sidebar instead of (mis-)parsing their RSS feed.
  • It’ll make us less dependent on web server configuration and weird rewrite rules.
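
As one possible illustration of the future-posting point, a page’s lifetime could be capped at the time remaining until the next scheduled article. The helper below is hypothetical (its name and signature are not from Typo); it only shows the arithmetic.

```ruby
# Hypothetical helper: choose a cache lifetime (in seconds) so that a cached
# page expires no later than the next scheduled publication time.
# Returns default_lifetime (possibly nil) when nothing is scheduled.
def lifetime_until_next_publication(publish_times, now, default_lifetime = nil)
  upcoming = publish_times.select { |t| t > now }.min
  return default_lifetime if upcoming.nil?

  seconds = (upcoming - now).ceil
  default_lifetime ? [seconds, default_lifetime].min : seconds
end

now = Time.utc(2006, 1, 20, 7, 0, 0)
scheduled = [Time.utc(2006, 1, 20, 7, 30, 0), Time.utc(2006, 1, 21, 9, 0, 0)]

lifetime_until_next_publication(scheduled, now)      # 1800: expire at 7:30
lifetime_until_next_publication(scheduled, now, 600) # 600: the default is shorter
lifetime_until_next_publication([], now)             # nil: no timed expiry
```

In a controller, the result would presumably be assigned to response.lifetime, so a page cached at 7:00 would be re-rendered at 7:30 when the scheduled article becomes visible, with no cron job involved.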

Con:

  • It’s slower than the page cache. I haven’t benchmarked my new code yet, but the last time I checked, on my box I could handle almost 2400 page cache requests per second, while the action cache was good for *10* hits per second. That exposed a couple of major Typo performance bugs; I suspect that retesting with the new code would give us 100-200 hits/second, which is pretty busy for a blog. Still, this may be an issue for shared hosting providers.
  • The action cache serves cached pages via Rails, while the page cache serves the same pages directly from the webserver without invoking Rails at all. Because of this, I suspect that a lot of sites will want to increase the number of FastCGI Typo processes that they run. With the page cache, running with one FCGI process was usually okay; with the action cache, it might be better to use a second process.

Those are the only two major problems that I see. Basically, if we switch to using the action cache (in any form), we’re going to be harder on big hosting companies like TextDrive and Planet Argon, and they’ve been very supportive of Typo in the past.

Does anyone feel strongly about this one way or another?

11 comments

Comments

  1. Tom Fakes said about 1 hour later:

    I blogged about the Page cache being broken - http://blog.craz8.com/articles/2005/12/30/rails-page-caching-is-just-broken.

    I also have an entry for an Action cache upgrade: http://blog.craz8.com/articles/2005/12/26/rails-action-cache-upgrade

    My Action cache upgrade causes ‘304’ statuses to be sent, and this makes the process a lot faster for things that haven’t changed (I have one page that is 5-6/sec with fragment caching of DB queries and 120+/sec with the action cache serving 304s from the same cache store)

  2. Scott Laird said about 2 hours later:

    I completely forgot about 304 handling. I’ll take a look at your code and see what I can do with it.

  3. vWing said about 4 hours later:

    Forgive me, just a ridiculous comment: in Rails you can write “1.hour” instead of “3600” …

  4. Tom Fakes said about 6 hours later:

    I just packaged this up as a plugin for ease of use: http://wiki.rubyonrails.org/rails/pages/Action+Cache+Update+Plugin

    It wouldn’t be rocket science to combine mine with your ‘lifetime’ member and change the fragmentkey method to something configurable to implement the _withparams functionality.

  5. Tom Fakes said about 8 hours later:

    The plugin now has the ability to change the fragmentkey method to provide the actioncachewithparams functionality

  6. Kevin Ballard said about 10 hours later:

    Go ahead and do it! It would be nice to provide a mechanism (or instructions) for busy sites that want it to drop back to the page cache (perhaps even just a single boolean toggle in environment.rb?), but on probably the huge majority of Typo installations the performance is something of a non-issue.

  7. Louis said 1 day later:

    I could use this sort of simplicity in a Rails app I’m thinking of starting work on soon … I’ll definitely keep an eye on this. :)

  8. Chris Anderson said 2 days later:

    Lighttpd allows you to send it a header, which tells it which file to serve. This might significantly speed up action-cache, both for Typo and Rails generally.

    I’m thinking about using this method on a Rails-controlled asset server I’m building. (Not for caching, just serving up static content.) But hooked into action cache’s brains, I bet this would drop request time compared to Ruby reading the file off the disk, and socketing it through fcgi, etc.

    The only problem is that you might not be able to set lighttpd to listen to X-LIGHTTPD-send-file headers unless you have control of your server.

    fcgi process controls lighttpd: http://trac.lighttpd.net/trac/wiki/HowToFightDeepLinking

  9. Scott Laird said 2 days later:

    Out of curiosity–why would this be helpful when sending small files (like most web pages)? I can see it being useful with large files, but the action cache isn’t really touching anything big right now.

  10. Jared said 3 days later:

    Thank you for implementing this feature! Now I can implement a bunch of dynamic sidebar plugins that I was putting off programming.

  11. Chris Marstall said about 1 month later:

    This is something that needs to be worked into Rails. A lot of the Java caching systems (ehcache, JCS, etc.) support timeouts for cache elements.

    This is a required quality for caches that aren’t distributed, e.g. caches that are per-blade. For me it’s one of the glaring deficiencies in the Rails world.
