I am swimming in spam

I am swimming in spam. Everywhere I go, every direction I look, every medium I deal with, I am being spammed. Spam in my email box I can handle–my spam filter manages that well enough that I can ignore the problem. It’s all of the other spam that is driving me insane.

Let’s start with blog spam. I run Movable Type, and I have a PageRank of 6 or so, so like everyone else with a good PageRank, I’m being bombarded with blog comment spam. It’s not uncommon to wake up in the morning and find that 100 ads for Viagra or online poker or something less savory have managed to make it through my filters and pollute my blog. From looking at my logs, I’ve had over 7,000 comments posted on this blog, with only 150 or so being legitimate, and around 6,000 blocked by MT-Blacklist.

Then there’s phone spam–the Do Not Call list has actually worked pretty well for my home phone number, but I’ve been besieged by calls from (905)-482-1663 for the past couple weeks. I assume that they’re a telemarketer, but I’ve never been able to figure out what they want–even when I’ve picked the phone up on the first ring, they just hang up on me. Google suggests that that number has done work for Bank of America and the Kerry campaign and pissed off a number of other people; it’s not just me. After a week of this, I had Asterisk blacklist them, so I don’t have to listen to them hang up on me 2 or 3 times per day. Yesterday, they escalated–they called my cell phone 3 times last night. I sent a Do Not Call list complaint today, but I doubt it’ll take. I’d probably be better off using one of the other laws on the books regarding telemarketing calls to cell phones or percentages of hangups, but it’s probably not worth the hassle.

My work phone isn’t immune, either–I’ve been getting 2 or 3 calls per week from random business magazines, wanting to give me free subscriptions or renewals. Frankly, I receive so many magazines that I can’t keep track of which ones I’m already getting–95% of them go straight into the recycling bin without ever being opened. I really don’t want more–my mailbox is too full as it is. Last week, I got two calls from Information Week and had to hang up on them–they wouldn’t take “no” for an answer. The week before, it was a call, a fax, and two emails from Network World. This morning, it was eWeek.

Thinking about all of these–the blog spam, the telemarketer spam, and the magazine renewal spam–the common thread is that none of them are actually trying to sell me anything. The blog spam is trying to increase their own PageRank. The magazine spam is trying to increase their circulation size and advertising rates. The telemarketer might be trying to sell me something, but since they refuse to actually talk to me, I can’t really tell. Largely, they’re all bothering me because they can sell something that I have (eyeballs, highly ranked blog) to others, and they don’t care that they’re wasting my time and money in the process.

Posted by Scott Laird Fri, 17 Dec 2004 18:52:52 GMT


Comment spam explodes

I don’t know if it’s just me or if everyone is seeing this, but the amount of blog comment spam that I receive has exploded lately.

[Image: Spam Trend]

So far this month, MT-Blacklist has blocked over 2,400 comment spam attempts. That doesn’t count the number that it missed–that has to be at least 200 more, including *12* so far while I’ve been writing this message. The latest couple batches don’t even seem to be obvious spam–they include a fake email address and some text, but no web pages, either in the URL field of the comment or in the body, and the text is generic. If I wasn’t receiving a few dozen per hour from different IP addresses with the same basic text, I’d assume it was just a deranged poster or two. As it is, I can only assume that it’s an attempt to pollute a Bayes table with bogus text, except MT-Blacklist doesn’t use Bayes–it’s just keyword matching.
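To make the keyword-matching point concrete, here is a minimal sketch of how a blacklist of that kind behaves. It is not MT-Blacklist's actual code; the patterns and the `is_blacklisted` name are made up for illustration. It shows why a comment with generic text and no URL gets through: there is nothing in it for a pattern to match.

```python
import re

# Hypothetical blacklist entries (assumption): a real list would hold
# thousands of patterns, mostly spammer domain names.
BLACKLIST = [r"viagra", r"online[-\s]?poker", r"cheap-pills\.example\.com"]
_PATTERNS = [re.compile(p, re.IGNORECASE) for p in BLACKLIST]


def is_blacklisted(comment_text, url=""):
    """Return True if any blacklist pattern appears in the comment or its URL."""
    haystack = f"{comment_text}\n{url}"
    return any(p.search(haystack) for p in _PATTERNS)
```

A comment such as "Nice post, thanks for sharing." with an empty URL field matches nothing here, so a pure keyword filter has no basis for rejecting it.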

At the present rate, I think I’m actually seeing more comment spam attempts than legitimate page views some days. I think I’m getting more blog spam than email spam, too, but it’s a close race.

I swear, I need to move off of MovableType 2 one of these days, but the last time I tried, I just couldn’t find anything that I was willing to spend the effort on. Drupal is nifty, but it’s not really what I’m looking for. MT 3.1 would probably work, but it’s not exactly what I want, either. I keep waiting for one of the Rails-based blog systems to become usable, but I don’t think we’re quite there yet.

Posted by Scott Laird Tue, 30 Nov 2004 23:50:16 GMT


The Blog Upgrade question strikes again

I installed Movable Type when I first started this blog, but I’ve been itching to change for months. A small part of that itch is Movable Type’s new pricing model, but it’s really more than that. I have a number of needs that MT isn’t really filling, and I’d like to move to something that works better for me.

The big problem is that I can’t find anything that’s quite right. I looked at Drupal for a while, but there are a few things with it that I just couldn’t cope with:

  • It’s a much bigger system than I really need, with a lot of complexity.
  • It’s written in PHP. If I could treat it as a black box, I wouldn’t really care, but I couldn’t because…
  • It’s essentially hard-coded to need MySQL. In theory, it’ll work with PostgreSQL, but I fought with it for days without actually getting it to work. There were a number of deeply-embedded MySQLisms in the code that I just couldn’t fix, even after digging into the code for a while.
  • PHP’s SQL code is too scary to look at. While the core of Drupal goes to great lengths to prevent SQL injection attacks, a number of add-in modules looked pretty clueless. In addition, all of the SQL code is built up using command = 'insert into foo (a,b) values ("'+value1+'","'+value2+'");'-style commands, which are inherently ugly and prone to problems. I really prefer the Perl (and Ruby) DBI version: insert into foo (a,b) values (?,?), where you provide value1 and value2 as parameters to the DBI execute function.
  • Template modifications are a royal pain compared to MT. Out of the box, all of the templates used HTML tables, unlike MT’s clean CSS-only templates.
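The placeholder style from the SQL bullet above looks roughly like this. This is a sketch using Python's standard sqlite3 module as a stand-in for Perl or Ruby DBI (an assumption made purely so the example is self-contained); the table foo and the names value1 and value2 come from the example in the list, and the awkward contents of value2 are invented.

```python
import sqlite3

# In-memory database so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("create table foo (a text, b text)")

value1 = "plain text"
value2 = 'text with "quotes"); drop table foo; --'  # hostile-looking input

# The values travel separately from the SQL text, so the quotes inside
# value2 are stored as data instead of being parsed as SQL.
conn.execute("insert into foo (a, b) values (?, ?)", (value1, value2))

rows = conn.execute("select a, b from foo").fetchall()
```

With the string-concatenation style, the same value2 would have to be escaped by hand on every call, which is exactly where add-in modules tend to slip up.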

Now, if I was setting up a big community site, none of these would really matter to me. I could spend a couple weeks on templates. Heck, I’d expect to spend a while tweaking things until they worked right for me. If I was doing this from a corporate perspective, I could just hire someone with experience in Drupal, like Bryght. But I’m not building a big community site, and I’m not willing to pay someone to do it for me. I’m largely doing this for the fun of it, and Drupal doesn’t seem to be a lot of fun.

So I’m back looking again. MT has cleaned up their prices, so I could just install MT 3.1 and be done with it. It wouldn’t be a lot of work to upgrade, and there’d be a handful of benefits, but it still wouldn’t give me an HTML photo gallery or a decent interface for static non-blog pages.

I’m fighting off the urge to use Rails to write a blogging system for myself. Hopefully, if I fight off the urge long enough, then someone else will do it for me, and I can just take their framework and adapt it to my needs. One can always hope :-).

Posted by Scott Laird Wed, 01 Sep 2004 16:27:00 GMT


One Year of Blog

My first post here was one year ago today. I’m sort of amazed that this blog is still going strong 257 entries and 113 comments later. When I was younger, I doubt it would have lasted a month. Apparently I’ve grown. I’ve enjoyed having an outlet for product reviews, interesting tech finds, Asterisk configuration examples, and all of the interesting bits that float through my life. Judging by my web server access logs, I’m not alone–I’m certainly not getting zillions of hits per day, but I have a few steady readers, and Google brings in boatloads of people searching for answers. Hopefully I’ve been able to help a few of them.

To commemorate the occasion, I’m going to post some statistics.

Most widely read posts

Rank | Article | Hits
1 | Faxing with Asterisk | 1,865
2 | Treo Ace | 1,706
3 | Motorola MPx200 | 948
4 | Sony-Ericsson CAR-100 | 799
5 | Tungsten T4 rumors | 789
6 | Saw a FlipStart today | 763
7 | ECS EZ30 Mini-tablet PC | 708
8 | vCard to Asterisk conversion tool | 673
9 | Asterisk config example | 618
10 | More Time with the MPx200 | 516

Most frequent referrers

Rank | Referrer | Hits
1 | http://www.google.com | 7521
2 | http://scottstuff.net | 7135
3 | http://www.voip-info.org | 695
4 | http://search.yahoo.com | 446
5 | http://www.google.co.uk | 360
6 | http://www.google.de | 332
7 | http://www.google.ca | 325
8 | http://www.feedster.com | 221
9 | http://digit.que.ne.jp | 193
10 | http://www.handtops.com | 177

Most common browsers

Rank | Browser | Hits
1 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) | 8518
2 | Mozilla/5.0 [en] (Windows NT 5.0, U) | 7614
3 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322) | 6919
4 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0) | 3830
5 | Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1; aggregator:NewsMonster; http://www.newsmonster.org/) Gecko/20021130 | 2479
6 | NewsGator/2.0 (http://www.newsgator.com; Microsoft Windows NT 5.1.2600.0; .NET CLR 1.1.4322.573) | 2456
7 | NetNewsWire/1.0.9b1 (Mac OS X; http://ranchero.com/netnewswire/) | 2381
8 | NetNewsWire/1.0.6 (Mac OS X; http://ranchero.com/netnewswire/) | 2009
9 | Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/124 (KHTML, like Gecko) | 2004
10 | Mozilla/4.0 (compatible; MSIE 6.0; Windows 98) | 1926

Posted by Scott Laird Wed, 25 Aug 2004 16:11:10 GMT


Uhm, who stole my formatting?

That was weird. For at least a few hours, none of the Markdown-formatted entries on this site were formatted properly. MT refused to admit that Markdown was even installed, even though Markdown.pl was correctly installed in the plugins directory. I grabbed a newer version of Markdown and re-installed it, and suddenly everything works.

I hate software some days.

Posted by Scott Laird Tue, 15 Jun 2004 13:44:33 GMT


More Drupal

I spent another hour or so looking at Drupal, and I think I’ve solved most of my problems. First, the mod_rewrite issue. I’m not sure if it is simply an apache2-ism or what, but everything started working once I added the DocumentRoot as a prefix in the RewriteCond lines and added an extra / in the RewriteRule line. These lines are from the VirtualHost block of my Apache 2.0.49 config:

RewriteEngine on
RewriteLog /var/log/apache2/rewrite.log
RewriteLogLevel 9
RewriteCond /var/www/testing.scottstuff.net/%{REQUEST_FILENAME} !-f
RewriteCond /var/www/testing.scottstuff.net/%{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?q=$1 [L,QSA]

I also had to prune the DirectoryIndex settings or Apache would try to feed index.html through the rewrite code and come out really confused:

DirectoryIndex index.php

Once that was done, I could turn on clean URLs and everything worked correctly.
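For anyone trying to follow what those rewrite lines are doing, here is a rough model of the decision in Python. It is a simplification, not a faithful emulation of mod_rewrite: the `rewrite` function is hypothetical, and it assumes that in a VirtualHost block the pattern sees the URL path with its leading slash. Requests for real files or directories pass through untouched; everything else is handed to index.php in the q parameter, with any existing query string appended (the QSA flag).

```python
import os


def rewrite(doc_root, path, query=""):
    """Roughly mimic the two RewriteCond tests and the RewriteRule."""
    target = os.path.join(doc_root, path.lstrip("/"))
    # RewriteCond ... !-f and !-d: leave existing files and directories alone.
    if os.path.isfile(target) or os.path.isdir(target):
        return path + (f"?{query}" if query else "")
    # RewriteRule ^(.*)$ /index.php?q=$1 [L,QSA]
    rewritten = f"/index.php?q={path}"
    if query:
        rewritten += f"&{query}"  # QSA: keep the original query string
    return rewritten
```

Under this model, a request for /node/1 against a document root that only contains index.php comes out as /index.php?q=/node/1, while /index.php itself is served directly.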

Next, I looked into conversion from Movable Type. There are a pile of scripts floating around out there, but they’re all slightly broken. Several assume that you’re using MySQL, while I’m using Postgres. Others leave out comments. None of them seem to set up Drupal’s path module to do URL rewriting, so old MT URLs won’t work.

So I spent a bit of time and fixed most of the problems. My conversion script is based on one from Tim Allman and Morbus Iff that’s been floating around. It handles comment conversion and adds URI re-writing entries for all blog entries and for each category index. It re-uses MT’s URLs as much as possible, so it’ll probably even do the right thing if you’ve changed URL formats, but don’t quote me on that.

The way the script works is kind of cool–you set it up as a Movable Type index template, and have it export to a file called import.php. Then copy the file to the root of your Drupal tree and run it via your web browser. The script will execute and import everything that MT gave it into Drupal.

So, at this point, here’s what’s still broken:

  • No Markdown filter.
  • Trackbacks aren’t converted.
  • It needs a better theme.
  • I need to write a script or two to integrate the project pages from http://svn.scottstuff.net into Drupal.
  • Similarly, it’d be nice to integrate my book event page into Drupal, but I’m not sure how practical that is.

Posted by Scott Laird Fri, 21 May 2004 10:43:14 GMT


Checking out Drupal

As I mentioned before, I’m sort of looking for something to replace Movable Type for this website.

This is only partially related to MT’s recent pricing announcements; I can get by with the free single-blog version of MT, at least for now. Rather, I’d like to expand my basic blog to include some of the pieces of http://svn.scottstuff.net and my photo gallery, with a single comment engine tying everything together. And MT just isn’t up to it right now.

I spent some time this weekend looking into Drupal, which is a heavy-weight system that can scale down to individual blogs, but scales up to large community or political sites. The Dean campaign was using Drupal, for example. It has a few things going for it:

  • Supports non-blog static pages, with or without comments.
  • Has an active community.
  • Has Atom and wiki plugins (untested).
  • Has a photo gallery plugin.

On the other hand, I’ve hit two serious snags:

  • I can’t find a Markdown plugin for it. Google suggests that it exists, but I haven’t seen it anywhere. I could probably just take the Textile plugin and graft the PHP Markdown library into it.
  • It’s written in PHP.
  • I can’t get its “clean URL” support to work right. By default, Drupal uses “index.php?q=…” URLs, but you can get it to generate nice, clean URLs with very little work. Just click one radio button in the web configuration UI, and there you go. Er, once you get mod_rewrite working with Apache. I spent two hours on it, fighting with mod_rewrite and Apache 2, and the best I could do was get it to give me a 400 Bad Request error whenever I fed it a URL that should have been rewritten. The rewrite log looks okay, but it doesn’t actually work.

If I can fix the two big problems, then I’ll probably build a test site using the regular scottstuff.net contents, and see where things go from there.

Posted by Scott Laird Wed, 19 May 2004 22:22:12 GMT


Blog Editing with Google

No, not some new feature in one of Google’s many blogging attempts. Something simple.

I want a blog editor that will let me select a block of text, and then search Google for me, providing me with a list of URLs from a search for the text provided. Then, when I pick one, the text and URL will be turned into a link. Nothing fancy, just a quick way to generate obvious links. For instance, in the last post, I ended up searching Google a half-dozen times to find good URLs for the links in the post. I wish Ecto could automate that for me.

Bonus points for generating Markdown links instead of raw HTML.
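The last step of that wish, turning the selection plus a chosen URL into a Markdown link, is the easy part. A minimal sketch, assuming the editor hands over the document text, the selection offsets, and the URL the user picked (the function names are hypothetical, and the Google search itself is left out):

```python
def markdown_link(text, url):
    """Build an inline Markdown link, escaping square brackets in the text."""
    safe_text = text.replace("[", r"\[").replace("]", r"\]")
    return f"[{safe_text}]({url})"


def link_selection(document, start, end, url):
    """Replace document[start:end] with a Markdown link to url."""
    selected = document[start:end]
    return document[:start] + markdown_link(selected, url) + document[end:]
```

Selecting the word "Ecto" in a sentence and picking a URL would leave the surrounding text alone and swap in `[Ecto](...)` at the selection.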

Posted by Scott Laird Fri, 14 May 2004 16:38:52 GMT


Wow, it looks like a bad day to be Six Apart

The new pricing for Movable Type (the software that runs this blog, along with a zillion others) is out. The previous release was free for non-commercial use. The new release is free–if you only have one author and fewer than 3 blogs. Any more than that, and you need to pay. The cheapest license is $99, marked down to $69 for the moment, and that only covers 3 authors. For commercial users, pricing starts at $299 (on sale now–only $199) for 5 authors, and goes up to $699 ($599) for 20 authors/15 blogs.

Needless to say, this is causing a bit of an uproar, and a lot of people are looking at switching from MT to other systems.

I guess I’m probably one of them. I’ve been half-heartedly looking for a different system for a while, but my needs are kind of unusual (as usual :-). Here’s a short list of what I’m looking for:

  • Simple, customizable blog engine, supports RSS and Atom, as well as at least one API supported by Ecto and NetNewsWire’s editor. Atom API support would be nice, but not all that critical, since I don’t have an Atom-aware editor yet.
  • Trackback and comment support. Preferably threaded. I actually like the concept behind Six Apart’s TypeKey, but that’s too much to ask, probably.
  • Support for non-blog pages. Take a look at http://svn.scottstuff.net for an example. Most of the pages are auto-generated, but I’d like to be able to share the template with my blog, and it’d be nice to be able to use the same comment engine.
  • Support for the Markdown markup language. I’ve found it to be vastly easier to work with than writing raw XHTML. That’s not to say that HTML is hard, but Markdown really lowers the amount of effort required.
  • Decent comment-spam tools. Admittedly, most comment spam is keyed to MT’s comment system, but that’ll change.
  • Tools for converting from MT. I don’t mind spending a bit of time on this, and I only have 190-ish posts here, but I’m not throwing them away, and it’d be nice to save the comments and trackbacks, too.
  • A photo gallery system that doesn’t suck. Since I haven’t found one that doesn’t suck yet, this is a difficult requirement. My goal is to be able to maintain one big master index in iView MediaPro on my Mac, and then sync the pictures and metadata onto my server from time to time, mostly using rsync and XML. Then, I want an automated script to pre-render thumbnails (on-demand thumbnails of 6 MP images are too slow for my poor server) and lay everything out. I’m currently using Album, but I’m not particularly fond of it. It just works better than anything else I’ve used. Systems that require manual, non-scriptable uploading of individual images need not apply.
  • A semi-integrated Wiki’d be nice, but I doubt I’d use it any time soon.
  • It needs to be scriptable and easy to enhance. Ideally, it’d be written in a language that I’m comfortable with; Ruby’d be best, and Perl’s okay. I can cope with Python and PHP, but I don’t really like either. A decent XML RPC/SOAP/REST interface would be nice, too.
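As an illustration of the thumbnail pre-rendering item in the list above, here is a sketch of the bookkeeping half of such a script: after an rsync, work out which images need a new thumbnail. The directory layout, the `stale_thumbnails` name, and the list of suffixes are all assumptions, and the actual resizing is left to whatever image tool is at hand.

```python
from pathlib import Path

# Assumed set of image types; adjust to whatever the gallery holds.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def stale_thumbnails(photo_dir, thumb_dir):
    """Return source images whose thumbnail is missing or older than the source."""
    photo_dir, thumb_dir = Path(photo_dir), Path(thumb_dir)
    stale = []
    for src in sorted(photo_dir.rglob("*")):
        if not src.is_file() or src.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # Thumbnails mirror the source tree under thumb_dir (assumption).
        thumb = thumb_dir / src.relative_to(photo_dir)
        if not thumb.exists() or thumb.stat().st_mtime < src.stat().st_mtime:
            stale.append(src)
    return stale
```

Run from cron after each sync, this keeps the expensive resizing of 6 MP images limited to pictures that are new or have changed.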

If anyone has any suggestions, please leave comments. I suspect I’ll hear at least one recommendation for Drupal, but it’d be nice to hear other suggestions too. Can Drupal handle semi-static non-blog pages easily?

Posted by Scott Laird Fri, 14 May 2004 16:14:25 GMT


Trying out new blogging tools

I’m trying out new blog tools this week. I started yesterday with the ecto blog editor, just to see if it works better for me than the editing component of NetNewsWire. For what it’s worth, I’m still undecided. It has a few really cool features (including automatically using your blog’s CSS in its internal preview), and it seems to be a bit more robust than NNW’s editor (it can actually download all of my categories, a feat which has continually stumped NNW), but I don’t know that it’s worth the extra bit of cash.

I’m also trying out the Markdown and SmartyPants plugins for MovableType. Markdown is a smart text-to-HTML converter—with it, you can type:

_this should be in italics_

instead of:

<em>this should be in italics</em>

Personally, the incessant < and > have been irritating me a lot recently. I can do HTML, but I’d rather not have to prove it every time I write something here. Markdown can do more than just italics:

With [Markdown](http://daringfireball.net/projects/markdown), 
it's easy to do things like lists
  * of
  * stuff
  * like
  * this

With Markdown, it’s easy to do things like lists

  • of
  • stuff
  • like
  • this

The SmartyPants plugin is less complex. All it does is convert simple ASCIIisms like quotes and dashes into correctly-typeset quotes and en- and em-dashes.

“Bob,” said Tom, “‘you can nest quotes’ was all she said to me–I didn’t believe her, though.”
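To give a feel for what that conversion involves, here is a toy version in Python. It is not SmartyPants' actual algorithm, just a few regular expressions that happen to handle the sentence above; the `educate` name is made up, and the dash mapping (two hyphens to an en-dash, three to an em-dash) is an assumption chosen to match how the example renders here.

```python
import re


def educate(text):
    """Convert straight quotes and hyphen runs to typographic characters."""
    # Dashes: handle the longer run first so "---" is not split up.
    text = text.replace("---", "\u2014").replace("--", "\u2013")
    # A double quote at the start, or after whitespace or an opening
    # bracket, opens a quotation; any other double quote closes one.
    text = re.sub(r'(^|[\s(\[])"', "\\1\u201c", text)
    text = text.replace('"', "\u201d")
    # Same idea for single quotes, which may also follow an opening
    # double quote; what remains is a closing quote or an apostrophe.
    text = re.sub(r"(^|[\s(\[\u201c])'", "\\1\u2018", text)
    text = text.replace("'", "\u2019")
    return text
```

Feeding it the plain-ASCII version of the sentence, with "--" where the dash goes, gives back the curly-quoted line shown above. Real text has plenty of cases this sketch gets wrong, such as leading apostrophes in 'em or '04.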

I’ll probably also try out the Shrook RSS reader, thanks to Cory Doctorow’s recommendation on Boing Boing this morning. I like NetNewsWire, but it’s really slow on my PowerBook G4/550, so I’m open to looking at new tools.

Update (Mar 25, 2004): Shrook just didn’t cut it for me. It was interesting, but I like NetNewsWire’s Combined View way better. The “smart playlist” bits in Shrook look promising, but it felt like I could skim through stuff in NNW 2–3 times faster than in Shrook. So I deleted it. It’s an interesting program, but it doesn’t seem compatible with the way I want to read RSS feeds. So I’ll stick with NNW for now, but I’m planning on looking into Gush when it’s available for OS X.

Posted by Scott Laird Wed, 24 Mar 2004 01:28:25 GMT