Push me, pull me

Someone pointed out today that none of the “convert from your old blog system to Typo” converters in the current Typo development tree were working. They all produce articles without any HTML in them. This was caused by my big filter update from a week or so ago; apparently no one has tried to convert directly to a development version of Typo since then. The problem is that none of the text filters were running. Unfortunately, there’s no easy way to make them run because they need access to a working Rails controller, and there isn’t one available inside the converters.

At the same time, Piers Cawley asked for an easy way to rebuild all of the HTML generated by filters on his site–he was doing filter development and he needed to rebuild everything. Unfortunately, the filter design doesn’t make this easy, either.

These two are basically the same problem–the way that we run text filters is kind of painful in the current Typo tree. In Typo 2.5 and earlier, filters were applied at the model level, and nothing outside of the model really needed to worry about them–the filters were automatically applied every time that the article (or comment, or page) body changed. Now that filters in the dev tree need a controller to run, this just isn’t possible any more, but I’d tried to hack it together by changing the dozen or so actions that changed Articles. It worked, but it was ugly, and it breaks as soon as something like a converter needs to create a new article, because the converter has no way to run the filters.

So I’ve been making a few changes to Typo.

The basic problem is that we’ve been using a “push” model for updating the HTML version of articles, when we should really be using a “pull” model. That is, instead of updating the HTML when the article changes, we should be generating the HTML when the article is viewed and then caching it so we don’t have to do it more than once per article.

Fortunately, this change was pretty easy to make–I just had to search for every reference to body_html, extended_html, or full_html and change it to a reference to article_html(article). Then I moved the filter calls into article_html(article), saving the generated HTML back into article.body_html.
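Here is a rough sketch of the pull idea in plain Ruby. It is not the actual Typo code: Article is a bare Struct standing in for the real model, and CountingFilter and FILTER are hypothetical stand-ins for the text filter chain, added only so the number of filter runs can be observed. The one name taken from the change described above is article_html(article).

```ruby
# Hypothetical stand-in for Typo's Article model: just the raw body
# and the cached HTML column.
Article = Struct.new(:body, :body_html)

# Hypothetical stand-in for the text filter chain. It wraps each
# paragraph in <p> tags and counts how many times it has run.
class CountingFilter
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def call(text)
    @calls += 1
    text.split(/\n{2,}/).map { |para| "<p>#{para.strip}</p>" }.join("\n")
  end
end

FILTER = CountingFilter.new

# Pull model: HTML is generated the first time the article is viewed
# and stored back into body_html, so later views skip the filters.
def article_html(article)
  article.body_html ||= FILTER.call(article.body)
end

# A converter can create an article without running any filter at all.
article = Article.new("Hello\n\nWorld", nil)

article_html(article)      # first view runs the filter
article_html(article)      # second view reuses body_html

# Editing the article just clears the cached HTML.
article.body = "Changed"
article.body_html = nil
article_html(article)      # regenerated on the next view
```

The point of the sketch is that nothing that creates or edits an article needs to know how to run filters; it only has to leave body_html empty.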

Once that was done, I could rip out all of the complicated filtering code that I’d had to put in to make the new filters work right, and everything Just Worked. I had to tweak a few tests that expected the HTML to be available in the database immediately after posting new content, but I already had tests that verified that the content viewed right, so it was just a matter of removing code, not really adding new code.

There’s one more change that I’m debating making. From an architectural standpoint, we shouldn’t really be stuffing things back into body_html–we should be using Rails’ fragment cache. Switching to the fragment cache would be trivial: it would only take a couple of extra lines in article_html, and then I could rip a bunch of lines out of the editor actions, because I could use a sweeper instead of explicit calls to article.body_html = nil.

Unfortunately, if we do that then we’ll end up killing Typo’s performance when it’s running in development mode, because caching is disabled in dev mode. So it’d be cleaner, but probably too slow to be useful. I’ll probably revisit this again before the next Typo release–there are a bunch of performance tweaks that we need to make before the next release; once those are done, we might be able to stand the performance hit.
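To make the trade-off concrete, here is a minimal sketch of the fragment cache variant, again in plain Ruby rather than real Rails code. FragmentStore is a hypothetical hash-backed stand-in for the Rails fragment cache, ArticleSweeper is a hypothetical stand-in for a Rails sweeper, and render_html, RENDERS, and the extra store argument to article_html are assumptions made for illustration. The enabled flag imitates caching being switched off in development mode.

```ruby
# Hypothetical stand-in for Typo's Article model.
Article = Struct.new(:id, :body)

# Hypothetical hash-backed stand-in for the Rails fragment cache.
# With enabled: false it never stores anything, like dev mode.
class FragmentStore
  def initialize(enabled: true)
    @enabled = enabled
    @fragments = {}
  end

  # Return the cached fragment, or build and store it via the block.
  def fetch(key)
    return yield unless @enabled
    @fragments.fetch(key) { @fragments[key] = yield }
  end

  def expire(key)
    @fragments.delete(key)
  end
end

# Hypothetical stand-in for a sweeper: it expires the fragment when an
# article is saved, so editor actions need no explicit reset.
class ArticleSweeper
  def initialize(store)
    @store = store
  end

  def after_save(article)
    @store.expire("article_html/#{article.id}")
  end
end

# Counts filter runs per article id so the cost is visible.
RENDERS = Hash.new(0)

def render_html(article)
  RENDERS[article.id] += 1
  "<p>#{article.body}</p>"
end

def article_html(article, store)
  store.fetch("article_html/#{article.id}") { render_html(article) }
end

production = FragmentStore.new
development = FragmentStore.new(enabled: false)

cached = Article.new(1, "cached")
uncached = Article.new(2, "uncached")

3.times { article_html(cached, production) }     # filters run once
3.times { article_html(uncached, development) }  # filters run every time
```

With the cache disabled, every view pays the full filtering cost, which is the dev mode slowdown described above.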