One of the biggest improvements in Typo 2.5 is page caching. By using Rails’s built-in page cache, we can get 100x the performance on many benchmarks without doing more than a few lines of work. This lets us serve high-volume weblogs (like weblog.rubyonrails.com) without requiring heroic measures like clustering.

Unfortunately, there are a number of hidden problems with Rails 0.13.1’s page cache implementation. We’ve had to work around a number of them in order to get Typo 2.5 out the door.

Basic page cache usage

Enabling Rails’s page cache is amazingly simple–just add caches_page :actionname to the top of your controller class and the :actionname action will spit out page cache files automatically. A couple of small tweaks to Apache’s .htaccess file, and Apache will now serve cached files all on its own without involving Rails. If a client asks for http://blog.example.com/articles/2005/08/08/foo, Apache will first check for an articles/2005/08/08/foo.html file in Typo’s public directory. If that file exists, then it’s sent off to the client without touching Rails at all.
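To make the lookup concrete, here is a minimal plain-Ruby sketch of how a request path maps to a file under public. The cache_file_for helper is hypothetical and is not how Rails or Apache implement this; it only mirrors the mapping described above.

```ruby
# Hypothetical helper, not Rails's actual implementation: maps a request
# path to the file the web server would look for under public/.
def cache_file_for(request_path, extension = ".html")
  # Drop the leading slash so the result is relative to public/.
  relative = request_path.sub(%r{\A/+}, "")
  # An empty path is the site root, which is cached as index.html.
  relative = "index" if relative.empty?
  relative + extension
end

puts cache_file_for("/articles/2005/08/08/foo")
puts cache_file_for("/")
```

If the resulting file exists, the web server can send it directly; otherwise the request falls through to Rails, which renders the page and writes the file for next time.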

Sweeping

That part of caching is easy. It’s the other end that’s hard: sweeping the cache to remove stale cache entries. Rails provides a simple cache sweeper that can remove specified pages, but that’s not really good enough for us. With Typo, there are a number of events that end up touching a huge number of cached files. Adding a comment, for example, touches the cached article page, but it also changes the comment counter on the main index (if the article is still on the front page), the day, month, and year indexes, some number of category indexes, tag indexes, and potentially paginated versions of all of the above. The code to track these all down was trouble-prone and frequently missed one of the pages that needed to be changed; this led to stale caches. Even worse, some actions, like changing themes, need to invalidate all pages. Rails’s page cache doesn’t keep a list of cached pages, so there’s no clean way to sweep them all.

What we ended up doing was adding a page_caches table to the database and adding hooks to insert a new PageCache entry every time a page was cached. We also added a hook to remove entries from the page cache table whenever a page was manually swept, and then added a PageCache.sweep_all method to flush the entire page cache. For now, we’ve simply ripped out all of our old “smart” sweeping code and force a full sweep of the entire cache whenever anything substantial changes. Sooner or later we’ll start adding smart cache sweeping back in, but for now this works surprisingly well.
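The bookkeeping can be sketched roughly as follows. This is not Typo’s actual code: an in-memory list stands in for the page_caches table, and the record and sweep method names are assumptions made for illustration. Only PageCache and sweep_all come from the description above.

```ruby
require "fileutils"
require "tmpdir"

# Stand-in for the page_caches table: an in-memory list of cached files.
# In Typo this is database-backed; this class is only an illustration.
class PageCache
  @entries = []

  class << self
    attr_reader :entries

    # Hook called whenever a page is written to the cache (assumed name).
    def record(path)
      @entries << path unless @entries.include?(path)
    end

    # Hook called when a single page is swept by hand (assumed name).
    def sweep(path)
      FileUtils.rm_f(path)
      @entries.delete(path)
    end

    # Remove every cached file we know about, then forget them all.
    def sweep_all
      @entries.each { |path| FileUtils.rm_f(path) }
      @entries.clear
    end
  end
end

# Demo: cache two pages in a temporary "public" directory, then flush.
Dir.mktmpdir do |public_dir|
  %w[index.html articles/2005/08/08/foo.html].each do |name|
    path = File.join(public_dir, name)
    FileUtils.mkdir_p(File.dirname(path))
    File.write(path, "<html>cached</html>")
    PageCache.record(path)
  end

  PageCache.sweep_all
  # True when no cached .html files are left on disk.
  puts Dir.glob(File.join(public_dir, "**", "*.html")).empty?
end
```

The point of keeping the list is that a full sweep never has to guess which files under public are cache output and which are ordinary static files.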

Query Parameters and Aliasing

Another shortcoming of Rails’s page cache implementation shows up when you start using query strings. Asking for http://blog.example.com/articles?page=2 ends up handing the ?page=2 parameter to the static .html cache page if it exists instead of calling Rails to ask for page 2. Even worse–if this cached page doesn’t exist, then Rails will generate it and store it for future access, even though it’s the second page of the index, not the first.

Finally, and worst of all, in Typo http://blog.example.com/articles is actually equivalent to http://blog.example.com/, because the article index view is the default index page. This means that the cached page for http://blog.example.com/articles?page=2 is actually /index.html, so anyone visiting page 2 of the article index screws up the front page of the blog. There’s no easy way around this with Rails 0.13.1; for now we’ve had to do work to keep ?page= from paginating anything. There’s one point where we could interrupt the page cache process from inside of Typo, but it doesn’t have any way to see the @request object or any of the query strings.
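The collision is easy to reproduce in a small model. The naive_cache_file method below is hypothetical and only imitates the behaviour described above: the file name comes from the path alone, and /articles is treated as an alias for the front page.

```ruby
require "uri"

# Hypothetical model of the problem; not part of Rails or Typo.
def naive_cache_file(url)
  # Only the path is used, so the query string is silently dropped.
  path = URI.parse(url).path
  # The article index is the default page, so it aliases the site root.
  path = "/" if path.empty? || path == "/articles"
  path == "/" ? "index.html" : path.sub(%r{\A/}, "") + ".html"
end

front_page = naive_cache_file("http://blog.example.com/")
second_page = naive_cache_file("http://blog.example.com/articles?page=2")

puts front_page
puts second_page
puts front_page == second_page
```

Both URLs resolve to the same file, so whichever page is rendered last is what every front-page visitor gets.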

Long-term, we’re going to need to patch Rails to add a cachable property to @request that gets set to false when there’s a query string present, and also tweak Apache’s rewrite rules to skip static files if a query string is present. That assumes that Apache is even able to do that–every time I read the mod_rewrite documentation I end up with a headache. Since Typo officially supports lighttpd as well as Apache, we’ll need to get both of them to do the right thing, which is far from trivial.
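On the Rails side, the check itself would be small. The sketch below is a standalone approximation, assuming the decision can be made from the URL alone; the cachable? method is hypothetical and simply mirrors the property name proposed above, which does not exist in Rails 0.13.1.

```ruby
require "uri"

# Hypothetical check: a request is cachable only when it has no query string.
def cachable?(url)
  query = URI.parse(url).query
  query.nil? || query.empty?
end

puts cachable?("http://blog.example.com/articles")
puts cachable?("http://blog.example.com/articles?page=2")
```

This only covers the write side. The web server still needs a matching rule so that it skips the static file when a query string is present.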

Non 7-bit ASCII URLs and Caching

Finally, Rails screws up cached filenames when the URL has non-ASCII characters. So any URL with accented characters or any non-ASCII script is totally uncachable. At least with Apache and WEBrick, Rails sees non-ASCII characters in the URL encoded using the usual %XX URL-encoding scheme. Unfortunately, both servers actually look for unencoded filenames. So Rails writes out the cache file for /foö as public/fo%C3%B6.html (assuming UTF-8 encoding), but Apache actually looks for public/fo<C3><B6>.html (where <C3> is a byte with the value of C3 in hex). This is actually not all that hard to fix–just add a URI::Util.decode to the right place inside of Rails–but it’s not clear what the security implications of this are.
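The mismatch, and one reason to be cautious about the fix, can be shown with Ruby’s standard library. This is a sketch only: the two helper names are made up, and URI.decode_www_form_component is used here as a stand-in decoder rather than as the call Rails would actually need.

```ruby
require "uri"

# Name that gets written, per the description above: escapes kept as-is.
def written_cache_file(encoded_path)
  "public" + encoded_path + ".html"
end

# Name the web server looks for: escapes decoded to the real characters.
# Caveat: decode_www_form_component also turns "+" into a space (form
# semantics), so a real fix would want a path-aware decoder.
def decoded_cache_file(encoded_path)
  "public" + URI.decode_www_form_component(encoded_path) + ".html"
end

puts written_cache_file("/fo%C3%B6")
puts decoded_cache_file("/fo%C3%B6")

# Decoding also turns escaped dots into real ones, so the decoded path
# would need checking before it is used as a file name.
puts decoded_cache_file("/%2E%2E/%2E%2E/tmp/evil")
```

The last line is one possible concern rather than a demonstrated hole: an escaped ../ becomes a real directory traversal once decoded, so any decode step would probably need to be paired with path normalization.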

Given all of these problems, I’ve been tempted to try using Rails’s action cache instead of the page cache–the action cache doesn’t let Apache serve the cached files directly, so Typo would have a brief chance to block the cache from handling specific files, and we could approach sweeping from the opposite direction. It’s not clear how big of a speedup the action cache would actually give us, though, compared to the massive win that we get from the page cache. We’d really like to keep using the page cache and fix all of its bugs so that it’s usable by other Rails users.