More Drupal
Posted by Scott Laird Fri, 21 May 2004 10:43:14 GMT
I spent another hour or so looking at Drupal, and I think I’ve solved most of my problems. First, the mod_rewrite issue. I’m not sure if it is simply an apache2-ism or what, but everything started working once I added the DocumentRoot as a prefix in the RewriteCond lines and added an extra / in the RewriteRule line. These lines are from the VirtualHost block of my Apache 2.0.49 config:
RewriteEngine on
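# Verbose rewrite logging for debugging; lower the level once things work.
RewriteLog /var/log/apache2/rewrite.log
RewriteLogLevel 9
# In VirtualHost context REQUEST_FILENAME still holds the URL path,
# so the DocumentRoot is prefixed to test the real file or directory.
RewriteCond /var/www/testing.scottstuff.net/%{REQUEST_FILENAME} !-f
RewriteCond /var/www/testing.scottstuff.net/%{REQUEST_FILENAME} !-d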
RewriteRule ^(.*)$ /index.php?q=$1 [L,QSA]
I also had to prune the DirectoryIndex settings or Apache would try to feed index.html through the rewrite code and come out really confused:
DirectoryIndex index.php
Once that was done, I could turn on clean URLs and everything worked correctly.
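For anyone trying to follow what those rewrite lines decide, here is a rough Python sketch of the same logic. It is only an illustration: the `resolve` function and its arguments are made up for this example, and it ignores everything else Apache does with a request. Requests that match a real file or directory are left alone; anything else is handed to index.php with the original path in `q`, and any existing query string is appended (the QSA flag).

```python
import os
import tempfile


def resolve(doc_root, request_path, query=""):
    """Return the URL that would be served after the rewrite step.

    Hypothetical helper: mimics the two RewriteCond tests and the
    RewriteRule from the config above for a single request.
    """
    candidate = os.path.join(doc_root, request_path.lstrip("/"))
    # RewriteCond ... !-f and !-d: real files and directories are not rewritten.
    if os.path.isfile(candidate) or os.path.isdir(candidate):
        return request_path + ("?" + query if query else "")
    # RewriteRule ^(.*)$ /index.php?q=$1 [L,QSA]
    rewritten = "/index.php?q=" + request_path
    if query:
        # QSA: keep the original query string after the new one.
        rewritten += "&" + query
    return rewritten


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    open(os.path.join(root, "robots.txt"), "w").close()
    print(resolve(root, "/robots.txt"))
    print(resolve(root, "/node/view/1", "page=2"))
```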
Next, I looked into conversion from Movable Type. There are a pile of scripts floating around out there, but they’re all slightly broken. Several assume that you’re using MySQL, while I’m using Postgres. Others leave out comments. None of them seem to set up Drupal’s path module to do URL rewriting, so old MT URLs won’t work.
So I spent a bit of time and fixed most of the problems. My conversion script is based on one from Tim Allman and Morbus Iff that’s been floating around. It handles comment conversion and adds URI re-writing entries for all blog entries and for each category index. It re-uses MT’s URLs as much as possible, so it’ll probably even do the right thing if you’ve changed URL formats, but don’t quote me on that.
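The URL re-use idea can be sketched in a few lines of Python. This is not the conversion script itself, just an illustration of the mapping: the `alias_rows` helper, the example URL, and the `node/view/N` internal path format are all assumptions made for the example.

```python
from urllib.parse import urlparse


def alias_rows(entries):
    """Build (alias, internal_path) pairs from old MT permalinks.

    entries: iterable of (old_url, node_id) pairs.
    Both the helper and the "node/view/N" format are assumptions
    for illustration, not taken from the real script.
    """
    rows = []
    for old_url, nid in entries:
        # Keep only the path part of the old permalink, without the leading slash.
        alias = urlparse(old_url).path.lstrip("/")
        rows.append((alias, "node/view/%d" % nid))
    return rows


if __name__ == "__main__":
    old = [("http://example.com/archives/2004/05/more_drupal.html", 12)]
    for alias, internal in alias_rows(old):
        print(alias, "->", internal)
```

Each pair would then become one entry for the path module, so a request for the old MT URL ends up at the imported node.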
The way the script works is kind of cool–you set it up as a Movable Type index template, and have it export to a file called import.php. Then copy the file to the root of your Drupal tree and run it via your web browser. The script will execute and import everything that MT gave it into Drupal.
So, at this point, here’s what’s still broken:
- No Markdown filter.
- Trackbacks aren’t converted.
- It needs a better theme.
- I need to write a script or two to integrate the project pages from http://svn.scottstuff.net into Drupal.
- Similarly, it’d be nice to integrate my book event page into Drupal, but I’m not sure how practical that is.

I’ve been uploading new versions of the importer script as I go along. My biggest problem was PHP segfaulting as I tried to import 8000 comments (that, plus invalid emails due to people being “cute” about privacy, plus timestamps that weren’t imported properly), and blah blah blah. The path addition was planned, but I’m glad to see you beat me to it. I’ll take a look at your script shortly.
One improvement to my script that was posted last night was a default (random) password assigned to all new users. The problem was, a blank password was being assigned before. Blank passwords are MD5’d just like normal passwords, and the fear was that people could find newly converted Drupal sites, login as blank’d users, and go nuts.
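To make the blank-password problem concrete, here is a small Python sketch (not the PHP from the importer; the helper names are made up for illustration). The MD5 of an empty string is a fixed, well-known value, so every account imported with a blank password would share the same guessable hash. A random password per user avoids that.

```python
import hashlib
import secrets


def md5_hex(password):
    """Hex MD5 digest of a password string."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


# Every blank password hashes to this same value.
BLANK_HASH = md5_hex("")


def random_password(nbytes=12):
    """Hypothetical helper: an unguessable throwaway password."""
    return secrets.token_urlsafe(nbytes)


def imported_user_hash():
    """Hash to store for a newly imported user instead of the blank one."""
    return md5_hex(random_password())


if __name__ == "__main__":
    print("blank:   ", BLANK_HASH)
    print("imported:", imported_user_hash())
```

Users with a random password would still need to go through “Request new password” to log in, since nobody knows the generated value.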
The other improvement was taking all the operational code and throwing it into functions. This just made the generated template size smaller (my own export was about 13 megs in file size, which was part of the reason PHP seemed to segfault a lot). The version posted last night also included (more than likely confusing) support for running the import script multiple times (by using lastn and offset on your Entries). I suspect that won’t be used or needed by a lot of people, but it was necessary for me.
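The multiple-run idea boils down to splitting the entries into windows. A minimal Python sketch, assuming you know the total entry count and pick a batch size yourself (the `batches` helper is hypothetical; each pair it yields would be plugged into the lastn and offset attributes for one run):

```python
def batches(total_entries, batch_size):
    """Yield (lastn, offset) pairs that cover all entries in chunks."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    offset = 0
    while offset < total_entries:
        # The final batch may be smaller than batch_size.
        yield (min(batch_size, total_entries - offset), offset)
        offset += batch_size


if __name__ == "__main__":
    for lastn, offset in batches(25, 10):
        print("lastn=%d offset=%d" % (lastn, offset))
```

Smaller batches mean smaller generated import files, which is the point when a single 13 meg file makes PHP fall over.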
Some of the coding comments in your script are incorrect: it appears you’re running it against 4.4.1 and not CVS HEAD, like I was. As such, my embedded comments are now erroneous for your import iteration.
On your comment importing stuff, the big problem there is timestamps. All your imported comments will be imported with a timestamp of when the import occurs, not the timestamp of the comment itself. Likewise, using the comment API will reject invalid emails (bad, as I don’t believe the email validation function in Drupal works correctly), and duplicate comments (good). I posted a patch for the timestamp issue. Ultimately, though, I decided to go with a direct SQL import (which, as you mention, would fail on your PgSQL installation) - it seemed faster than the API, and I didn’t want the email validation checks to occur. It was also the only way to get the correct timestamps without the patch. (Be forewarned, though: my SQL in the newly posted script assumes CVS HEAD, so there are three more columns than 4.4.1 has.)
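The timestamp point is easier to see with a toy example. The Python sketch below uses an in-memory SQLite table that is a deliberately simplified, hypothetical stand-in for a comments table (the real one has more columns, and they differ between 4.4.1 and CVS HEAD). It assumes the exported timestamps look like `YYYY-MM-DD HH:MM:SS` and treats them as UTC; both are assumptions. The idea is that a direct insert lets you store the comment’s original time instead of the time of the import.

```python
import calendar
import sqlite3
import time


def to_epoch(timestamp):
    """Convert 'YYYY-MM-DD HH:MM:SS' (treated as UTC) to Unix seconds."""
    return calendar.timegm(time.strptime(timestamp, "%Y-%m-%d %H:%M:%S"))


def create_table(conn):
    # Hypothetical, simplified schema for illustration only.
    conn.execute(
        "CREATE TABLE comments ("
        "nid INTEGER, name TEXT, mail TEXT, comment TEXT, timestamp INTEGER)"
    )


def import_comments(conn, rows):
    """Insert (nid, name, mail, body, timestamp_string) rows directly.

    No email validation happens here, and the original comment time is
    kept rather than the time of the import.
    """
    conn.executemany(
        "INSERT INTO comments (nid, name, mail, comment, timestamp) "
        "VALUES (?, ?, ?, ?, ?)",
        [(nid, name, mail, body, to_epoch(ts)) for nid, name, mail, body, ts in rows],
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_table(conn)
    import_comments(conn, [(1, "anon", "not-an-email", "First!", "2004-05-21 10:43:14")])
    print(conn.execute("SELECT name, mail, timestamp FROM comments").fetchall())
```

The trade-off is the one described above: skipping the API also skips its duplicate check, so a direct import has to handle duplicates some other way.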
The other big downside of comment importing the way you’ve done it is that all commenters now become automatic users of Drupal. This means that regular users will have to “Request new password” to actually login to their account, as opposed to creating a new one (which would fail since their email is already in the database). This can be confusing and annoying for some (and possibly detrimental, as duplicate accounts could be created in their confusion). For a site like Gamegrene, which has 8000 comments, some of which have invalid email addresses in an attempt at privacy, I’d be creating an awful lot of bunk accounts.
Love your path stuff. Just great. I’ll be stealing those tonight. As someone else mentioned, I plan on working on trackbacks too, but I don’t have any decent test data for that, so I’m just gonna have to assume the code works.
Well, I only have a few dozen comments, so I’m not too concerned about a lot of the things that you have to deal with. The locked password thing is probably an improvement. I have one comment that won’t import due to a bad email address, but that’s pretty trivial.
I have a few trackbacks, but I can live without them if I have to. Frankly, I think the conversion script works well enough for me right now that I could drop MT and move today, except my Drupal install isn’t really 100% ready for the content. That’s not exactly how I expected things to go–I expected the conversion to be a much bigger pain.
So very nice to know I’m not the only one who had to dig into the black arts of mod_rewrite to make this drupal thing happen.
I’d used the jseng script and, overall, it was great. The biggest plus was having the translation done on the database side, saving the bother of exporting and importing and transferring the massive dump of 4000+ items. I have to confess that I didn’t try the Morbus Iff script; I excluded most options based on that requirement to stay within the database. I haven’t meticulously tested, but it seems most trackbacks and comments survived the jseng transform; I did have to do some funky SQL to fix up [textile] markers and wipe the thousands of path.module entries.
Passwords threw me for a bit of a loop: I guess MT uses short SHA1 passwords, Drupal uses MD5 and thus it’s forgivable the script couldn’t translate; instead it just moves the old passwords across, which means they are effectively locked.
As for stylesheets, that’s a realm of arcana I’ll never understand. I’ve started on a phptemplate theme I call “UnMovable” which uses the same ID and class marks as MT; I thought it would be easy to then just move styles-site.css in as styles.css and be done with it, but as you can see from the rough edges and dangling divs in my fleet of MT-converts, theory and practice are only similar in theory ;)
And I hope you don’t mind my asking if it’s just a tiny bit cowardly to be having this conversation on an MT-based blog :) – for my migration, I took a deep breath, bit my lower lip, damned the torpedoes, pulled the trigger, and then burned the bridge behind me.