Push me, pull me
Posted by Scott Laird Mon, 05 Sep 2005 01:25:00 GMT
Someone pointed out today that none of the “convert from your old blog system to Typo” converters in the current Typo development tree were working. They all produce articles without any HTML in them. This was caused by my big filter update from a week or so ago; apparently no one has tried to convert directly to a development version of Typo in the last week or two. The problem is that none of the text filters were running. Unfortunately, there’s no easy way to make them run because they need access to a working Rails controller, and there isn’t one available from inside of the converters.
At the same time, Piers Cawley asked for an easy way to rebuild all of the HTML generated by filters on his site–he was doing filter development and he needed to rebuild everything. Unfortunately, the filter design doesn’t make this easy, either.
These two are basically the same problem–the way that we run text filters is kind of painful in the current Typo tree. In Typo 2.5 and earlier, filters were applied at the model level, and nothing outside of the model really needed to worry about them–the filters were automatically applied every time that the article (or comment, or page) body changed. Due to the changes in the dev tree, this just isn’t possible any more, but I’d tried to hack it together by changing the dozen or so actions that changed Articles. It worked, but it was ugly, and it breaks when something like a converter needs to create a new article, because the converter has no way to run the filter.
So I’ve been making a few changes to Typo.
The basic problem is that we’ve been using a “push” model for updating the HTML version of articles, when we should really be using a “pull” model. That is, instead of updating the HTML when the article changes, we should be generating the HTML when the article is viewed and then caching it so we don’t have to do it more than once per article.
Fortunately, this change was pretty easy to make–I just had to search for every reference to body_html, extended_html, or full_html and change it to a reference to article_html(article). Then I moved the filter calls into article_html(article), saving the generated HTML back into article.body_html.
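Here is a minimal sketch of that pull model in plain Ruby. It isn't the actual Typo code: the Article struct, run_text_filters, and the paragraph-wrapping "filter" are stand-ins invented for illustration. Only the article_html(article) name and the idea of saving the result back into body_html come from the change described above.

```ruby
# Hypothetical stand-in for the real model: just a body and its cached HTML.
Article = Struct.new(:body, :body_html)

# Counts filter runs so the caching behaviour is easy to see.
$filter_runs = 0

# Placeholder for the real filter chain (an assumption, not Typo's API).
def run_text_filters(text)
  $filter_runs += 1
  "<p>#{text}</p>"
end

# Pull model: generate the HTML the first time it is asked for,
# then reuse the copy saved in body_html.
def article_html(article)
  article.body_html ||= run_text_filters(article.body)
end

article = Article.new("Hello", nil)
article_html(article)      # runs the filters
article_html(article)      # served from body_html, no filter run

# An editor action changes the text and throws away the stale HTML.
article.body = "Changed"
article.body_html = nil
article_html(article)      # filters run again on the next view
```

A converter can create articles with body_html left empty and never touch a filter; the HTML simply gets built the first time someone views the article.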
Once that was done, I could rip out all of the complicated filtering code that I’d had to put in to make the new filters work right, and everything Just Worked. I had to tweak a few tests that expected the HTML to be available in the database immediately after posting new content, but I already had tests that verified that the content viewed right, so it was just a matter of removing code, not really adding new code.
There’s one more change that I’m debating making. From an architectural standpoint, we shouldn’t really be stuffing things back into body_html–we should be using Rails’ fragment cache. Switching to the fragment cache would be trivial: it would only take a couple of extra lines in article_html, and then I could rip out a bunch of lines in the editor actions, because I could use a sweeper instead of explicit calls to article.body_html = nil.
Unfortunately, if we do that then we’ll end up killing Typo’s performance when it’s running in development mode, because caching is disabled in dev mode. So it’d be cleaner, but probably too slow to be useful. I’ll probably revisit this again before the next Typo release–there are a bunch of performance tweaks that we need to make before the next release; once those are done, we might be able to stand the performance hit.

As we seem to be missing each other on IRC:
The problem we have here is that body_html and friends are not the right answer. What we really want is for a view to simply be able to ask a content object for its body and get back well formed html.
So, before our controller passes any content objects to the view (and possibly as early as fetching it from the database) it decorates the object with a text filter and passes the decorator to the view. Then when the view asks for the body, the text filter will do something along the lines of:
If the adaptor (or its class) is also an observer of the content object, the content object can inform it when it’s updated and the adaptor can then nuke the appropriate whiteboard.
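A rough sketch of that decorator-plus-observer arrangement in plain Ruby follows. Every name in it (Content, FilteredContent, content_updated, the :body_html whiteboard key) is hypothetical, and the whiteboard is modelled as a plain hash, which is an assumption about how it would be stored.

```ruby
# Hypothetical content object that tells its observers when it changes.
class Content
  attr_reader :body

  def initialize(body)
    @body = body
    @observers = []
  end

  def add_observer(observer)
    @observers << observer
  end

  # Changing the body notifies every registered observer.
  def body=(text)
    @body = text
    @observers.each { |observer| observer.content_updated(self) }
  end
end

# Decorator handed to the view: #body returns filtered HTML,
# cached in a whiteboard hash until the content changes.
class FilteredContent
  attr_reader :filter_runs

  def initialize(content, &filter)
    @content = content
    @filter = filter
    @whiteboard = {}
    @filter_runs = 0
    content.add_observer(self)
  end

  def body
    @whiteboard[:body_html] ||= begin
      @filter_runs += 1
      @filter.call(@content.body)
    end
  end

  # Observer callback: nuke the cached HTML.
  def content_updated(_content)
    @whiteboard.clear
  end
end

content = Content.new("hello")
for_view = FilteredContent.new(content) { |text| "<p>#{text}</p>" }
for_view.body              # filter runs once
for_view.body              # cached
content.body = "goodbye"   # observer clears the whiteboard
for_view.body              # filter runs again on the new text
```

The view only ever calls body, so it never needs to know whether the HTML came from the cache or from a fresh filter run.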
Any metadata extractors that play whiteboard games for sidebar plugins also decorate/observe content objects with the wrinkle that they also add to the interface, so our putative ASIN extractor would add an asins method, which the sidebars would use to get at things, rather than accessing whiteboards directly. Which also means that more advanced metadata adaptors could store their data in full blown model objects or whatever.
Note too that any notification system that gets added can be used to manage the cache more efficiently.
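For illustration, a decorator that adds an asins method could be sketched with Ruby's standard SimpleDelegator. The Post struct, the AsinExtractor class, the link pattern, and the ASINs in the sample text are all made up for this example; real extraction rules would likely differ.

```ruby
require "delegate"

# Hypothetical content object; stands in for an article, comment, or page.
Post = Struct.new(:body)

# Decorator: forwards everything to the wrapped object and adds #asins.
class AsinExtractor < SimpleDelegator
  # Assumed link shape: ".../ASIN/<10 characters>".
  ASIN_PATTERN = %r{/ASIN/([A-Z0-9]{10})}

  # Each distinct ASIN found in the body, in order of first appearance.
  def asins
    body.scan(ASIN_PATTERN).flatten.uniq
  end
end

post = AsinExtractor.new(
  Post.new("Two links: /ASIN/0123456789 and /ASIN/B000000001")
)
post.body    # forwarded to the wrapped Post unchanged
post.asins   # what a sidebar would call instead of reading a whiteboard
```

Because the sidebar only depends on asins, the extractor is free to compute the list on the fly, cache it, or keep it in a model object of its own.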
None of this is implemented just yet, but I do think it’s rather cleaner than what we have.
Damn. You’re not putting comments through a textile filter are you?
It’s Markdown, not Textile–I need to update this site to trunk again, clean up the comment code (which differs a bit from trunk) and turn on the comment markup help–we generate it, it just isn’t visible on the comment page.