Tiger: iCal is more strict about .ics files

Posted by Scott Laird Sun, 01 May 2005 14:19:21 GMT

Apparently Tiger’s iCal is stricter than Panther’s, because the .ics files created by my author readings page won’t import into iCal anymore. There are a handful of other issues with the author readings code that need to be fixed, too, so I’ll see what I can do with it later this afternoon.

4 comments

Do-it-yourself ISBN updates

Posted by Scott Laird Mon, 05 Jan 2004 09:32:33 GMT

I just added another round of bug fixes and upgrades to my book events page. First, I fixed another round of University Books format weirdness, where some phrases (“STANDING ROOM ONLY”) were interpreted as the author’s name. Second, I fixed some similar problems with Third Place Books. Third, and most importantly, I added a way for users to enter ISBNs for books that don’t currently have them. I have a script that tries to match up author name/title pairs with ISBNs via Amazon’s web services, but sometimes it doesn’t work. Like when Third Place Books misspells the author’s name, just to pick on one bookstore. So, I have been entering these manually via a command-line tool. Now everyone can play along: whenever an event without an ISBN is displayed, an ISBN entry box will appear. Just enter the ISBN there, and the web service code will kick off, filling out the correct (according to Amazon) title and author, along with a few other fields. It’s still wrong sometimes, but it’s better than anything else I can come up with, short of the bookstores getting their data right in the first place :-).
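Since the entry box accepts whatever a visitor types, it may be worth checking the ISBN's check digit before kicking off the web service lookup. Here is a minimal sketch, assuming 10-digit ISBNs; `valid_isbn10?` is a hypothetical helper and not part of the site's actual code:

```ruby
# Hypothetical helper: checks the ISBN-10 check digit before any lookup.
def valid_isbn10?(input)
  # Ignore the spaces and hyphens people tend to type.
  isbn = input.to_s.delete(" -").upcase
  return false unless isbn =~ /\A\d{9}[\dX]\z/

  # Weighted sum: first digit * 10, second * 9, ... check digit * 1.
  # An "X" check digit stands for the value 10.
  sum = isbn.chars.each_with_index.sum do |ch, i|
    value = (ch == "X" ? 10 : ch.to_i)
    value * (10 - i)
  end
  (sum % 11).zero?
end
```

A form handler could call `valid_isbn10?(params_isbn)` and only query Amazon when it returns true, which catches most single-digit typos before a request is made.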

3 comments

New book event page

Posted by Scott Laird Tue, 25 Nov 2003 01:26:12 GMT

I have a new CGI book event page up. I’ve had it sitting around for a while, but I never actually told anyone. I just finally added the redirect, so my CGI book event page is now the default at http://scottstuff.net/books, instead of PHP iCalendar. The iCalendar version was easy to put up, but it’s painfully hard to read; the new one should be quite a bit better.

Also, I noticed that Jimmy Carter, Tom Douglas, and Terry Brooks are signing books next month.

If you’re still using the PHP iCalendar-generated RSS feed, you should really flip to the new one. It has better detail, and it’s not such a huge hit on my poor server. I’m probably going to kill off the PHP one eventually.

no comments

More book event progress: proper RSS

Posted by Scott Laird Tue, 18 Nov 2003 03:12:45 GMT

I just took a couple minutes and fixed most of the known author reading/signing database problems. I’m now building native RSS feeds for each store as well as a city-wide feed. Each feed comes in two sizes: full sized and 15-item-long. Each item is linked to my event CGI that lists the author name, event, time, ISBN, and provides a downloadable iCalendar link that will add itself to your calendar of choice (only tested with iCal for now, but it should work with Outlook).
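For readers curious what building a native RSS 1.0 feed in two sizes might involve, here is a rough sketch. It is not the generator used on the site: the `Event` struct, the `rss10_feed` method, and the URLs passed to it are all assumptions made for illustration, and only the bare RSS 1.0 skeleton is produced.

```ruby
# Hypothetical event record; the real database schema isn't shown in the post.
Event = Struct.new(:author, :title, :url)

# Minimal XML escaping for text and attribute values.
def xml_escape(text)
  text.to_s.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;").gsub('"', "&quot;")
end

# Builds a bare RSS 1.0 document. Pass limit: 15 for the short feed,
# or leave it nil for the full-sized one.
def rss10_feed(feed_url, feed_title, events, limit: nil)
  events = events.first(limit) if limit
  lines = []
  lines << %(<?xml version="1.0" encoding="UTF-8"?>)
  lines << %(<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/">)
  lines << %(  <channel rdf:about="#{xml_escape(feed_url)}">)
  lines << %(    <title>#{xml_escape(feed_title)}</title>)
  lines << %(    <link>#{xml_escape(feed_url)}</link>)
  lines << %(    <description>#{xml_escape(feed_title)}</description>)
  # RSS 1.0 lists the item URIs inside the channel, then the items themselves.
  lines << %(    <items><rdf:Seq>)
  events.each { |e| lines << %(      <rdf:li rdf:resource="#{xml_escape(e.url)}"/>) }
  lines << %(    </rdf:Seq></items>)
  lines << %(  </channel>)
  events.each do |e|
    lines << %(  <item rdf:about="#{xml_escape(e.url)}">)
    lines << %(    <title>#{xml_escape("#{e.author}: #{e.title}")}</title>)
    lines << %(    <link>#{xml_escape(e.url)}</link>)
    lines << %(  </item>)
  end
  lines << %(</rdf:RDF>)
  lines.join("\n") + "\n"
end
```

Calling it once per store and once for the whole city, each time with and without `limit: 15`, would give the per-store and city-wide feeds in both sizes.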

Basically, it all works.

Here are the RSS 1.0 feeds:

Next up:

  • Add an events CGI and use it to replace/supplement the PHP iCalendar page
  • Work on recommendation-driven event feeds
  • Add more stores (Barnes and Noble?)

1 comment

More advances on the book-signing front.

Posted by Scott Laird Sat, 08 Nov 2003 18:29:08 GMT

I made a bit more progress on the book-signing front. My RSS generator produces valid RSS/1.0 now. I still need to add a bunch of things to it (it’s just stock RSS 1.0, no DC, no mod_event, nothing), but it works. There’s a sample online now, but don’t depend on it quite yet; I’m going to break it at least once before it’s complete. It’s kind of big; it currently includes 75 events, not sorted into any particular order.

I also added a database-driven CGI for each event; the RSS links to it. I did a bit of extra cleanup, too, so I now have better URLs for Third Place Books events.

Next up, I’d like to generate .ics files for individual events via CGI (mostly trivial), produce better RSS, produce static RSS for each bookstore, and then start in on more dynamic stuff, like recommendation-driven filtering.

no comments

Book reading database improvements

Posted by Scott Laird Fri, 07 Nov 2003 06:27:07 GMT

Well, I’m making a bit of progress on the author reading/signing front. Instead of generating .ics files directly, the HTML reader is now feeding a database, and the .ics (and soon RSS) files are generated from the database. I’m now extracting book titles from Elliott Bay Books; all I had before was author names (the two bits of information are hiding on different pages). I’m also using one of Amazon’s web service interfaces to search for books matching the author and title listed, and turning that into an ISBN number.
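To give a feel for generating .ics output from a database row rather than straight from the scraped HTML, here is a small sketch. The `EventRow` struct, the `PRODID` value, and the UID scheme are assumptions for illustration; the real schema and generator are not shown in this post. Long lines are not folded, which a stricter calendar client might want.

```ruby
# Hypothetical row shape; the post doesn't show the real table layout.
EventRow = Struct.new(:id, :author, :title, :store, :starts_at)

# Escapes TEXT values: backslash, semicolon, and comma get a leading
# backslash, and newlines become a literal "\n".
def ics_text(value)
  value.to_s.gsub(/[\\;,]/) { |c| "\\#{c}" }.gsub("\n") { "\\n" }
end

# Formats a Time as a UTC iCalendar date-time, e.g. 20031205T190000Z.
def ics_time(time)
  time.getutc.strftime("%Y%m%dT%H%M%SZ")
end

# Renders one event as a complete calendar with CRLF line endings.
def event_to_ics(row, now: Time.now)
  [
    "BEGIN:VCALENDAR",
    "VERSION:2.0",
    "PRODID:-//example//book-events//EN",   # placeholder product id
    "BEGIN:VEVENT",
    "UID:event-#{row.id}@example.invalid",  # placeholder UID scheme
    "DTSTAMP:#{ics_time(now)}",
    "DTSTART:#{ics_time(row.starts_at)}",
    "SUMMARY:#{ics_text("#{row.author}: #{row.title}")}",
    "LOCATION:#{ics_text(row.store)}",
    "END:VEVENT",
    "END:VCALENDAR"
  ].map { |line| line + "\r\n" }.join
end
```

Because the row already carries clean author, title, and store fields, the same record can feed both this method and an RSS writer without re-parsing any HTML.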

This buys us a few things. First, we have better data, because we can cross-reference author and title information with Amazon. Amazingly enough, I found at least one misspelled author name. Second, because all of the data is sitting in an easy-to-query form in a database, it’s a lot easier to build an RSS feed from it. In fact, I already have a working RSS 1.0 writer, but it’s still a bit rough, so I’m going to hold off announcing RSS feeds until I get it working correctly. Finally, by having the ISBN, we’re set to tie into other book-related services, like All Consuming, so we can do things like genre- or recommendation-driven lists of upcoming events. That’s still a ways away, though.

no comments

Book event RSS working really well

Posted by Scott Laird Wed, 05 Nov 2003 00:28:54 GMT

The RSS feed from PHP iCalendar is working quite nicely. It’s actually quite a bit more useful than the calendar itself, because (assuming a decent RSS reader) you should see changes as they happen, rather than a monolithic block of 100+ events.

In fact, this is starting to look really useful. If you’re interested in knowing when authors are visiting the Seattle area for book signings and talks, then subscribe to the feed(s).

When I have time, I’m probably going to start generating the RSS myself, rather than using PHP iCalendar, partly to get a better feel for RSS, and partly so I can start including better filtering options in the future.

2 comments

Book RSS updated

Posted by Scott Laird Tue, 04 Nov 2003 01:04:06 GMT

The RSS feed of bookstore author events from PHP iCalendar was broken for University Books because of an entity replacement problem. I added é to the list of translated characters and everything works a bit better now.
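The general idea behind a translation list like this can be sketched in Ruby. This is not PHP iCalendar's actual code; `ENTITY_MAP` and `translate_entities` are hypothetical names, and the sketch assumes the problem was an HTML named entity that XML parsers don't recognize:

```ruby
# Hypothetical table: HTML named entities that XML parsers do not know,
# mapped to numeric character references that any XML document accepts.
ENTITY_MAP = {
  "&eacute;" => "&#233;",
  "&egrave;" => "&#232;",
  "&ntilde;" => "&#241;",
  "&nbsp;"   => "&#160;"
}.freeze

# Replaces known named entities and leaves everything else (including
# XML's own &amp;, &lt;, and so on) untouched.
def translate_entities(text)
  text.gsub(/&[A-Za-z]+;/) { |entity| ENTITY_MAP.fetch(entity, entity) }
end
```

The weakness of a fixed list is the one described above: any entity missing from the table passes through unchanged and can still break the feed.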

no comments

HTML screen-scraping in Ruby

Posted by Scott Laird Sat, 01 Nov 2003 20:54:13 GMT

My little author reading project is written in Ruby, my current scripting-language-of-choice.

Here’s an example of what it takes to grab web pages and extract content from them:

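# Require paths assumed from the http-access2 and htmltools packages.
# BookEvent and BookTime are my own classes, defined elsewhere.
require 'http-access2'
require 'html/xmltree'

books=[]

client=HTTPAccess2::Client.new
url="http://www.elliottbaybook.com/..."
parser = HTMLTree::XMLParser.new(false,false)
parser.feed(client.getContent(url))
xml=parser.document

# Each author event lives in its own <p class="small"> block.
xml.elements.each('//p[@class="small"]') do |node|
  event=BookEvent.new
  event.store="Elliott Bay Book Company"
  event.location="Elliott Bay Book Company"
  # Strip the tags, leaving the plain text that holds the date and time.
  event.time=node.to_s.gsub(/<[^>]+>/,'')
  event.author=node.elements['./a[1]/b[1]'].text rescue nil
  event.title=nil
  event.note=node.elements['.'].to_s rescue ''

  next unless event.time and event.author

  # Off-site events read "... at <time> ... at <place>".
  if event.note =~ / at [0-9].* at ([^<>]*)/
    event.location=$1
  end

  # Skip anything whose text doesn't parse as a date and time.
  event.time=BookTime.new_from_string(event.time)
  next unless event.time

  books.push(event)
end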

The interesting bit is probably xml=parser.document; that’s where Ruby’s HTML parser hands its parse tree off to Ruby’s XML engine, REXML. This lets me use REXML’s XPath engine for searching through the HTML mess that most bookstores use on their web sites. In this case, all author reading events are inside of <p class="small"> tags, so I iterate through all of the matching tags and try to create a BookEvent object from each. The author name comes from an <a><b> block inside of the <p> block, and the time and location are extracted via regular expressions.
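The XPath and regular expression steps can be tried on their own with nothing but REXML. The markup below is made up to stand in for a bookstore page (it is not the real Elliott Bay HTML), and it is already well-formed, so no HTML parser is needed for this sketch:

```ruby
require 'rexml/document'

# Made-up, well-formed markup standing in for a bookstore events page.
html = <<~HTML
  <html><body>
    <p class="small">Friday, November 7 at 7:30 p.m. <a href="/e/1"><b>Jane Example</b></a> at Town Hall</p>
    <p class="small">Store closed Thursday</p>
    <p>Unrelated paragraph</p>
  </body></html>
HTML

doc = REXML::Document.new(html)
authors = []
locations = []

# Same XPath expressions as the scraper above.
doc.elements.each('//p[@class="small"]') do |node|
  bold = node.elements['./a[1]/b[1]']
  next unless bold   # skip paragraphs that carry no linked author name

  authors << bold.text
  # Same pattern as the scraper: "... at <time> ... at <place>".
  locations << node.to_s[/ at [0-9].* at ([^<>]*)/, 1]
end
```

Only the first paragraph yields an event here; the second matches the XPath but has no `<a><b>` block, which is the kind of case the `next unless` guards in the scraper are there for.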

If book stores had decent web pages, this’d be really easy, but as it is, I had to apply a few heuristics and flat out guess at times, and I’ll have to revisit the code every time they reformat their web sites. But, Ruby worked out really well this time.

6 comments

More bookstore calendar updates

Posted by Scott Laird Sat, 01 Nov 2003 20:29:44 GMT

I’m still playing with the book store event calendar that I was talking about a day or two ago. I’ve cleaned up the code a little bit, split each bookstore into its own iCalendar file, reorganized the /books directory on scottstuff.net, and installed PHP iCalendar. So, here’s where we stand:

  • You can go to http://scottstuff.net/books/ and see all of the events on a convenient calendar.
  • From the calendar, you can subscribe to each of the individual iCalendar files for each bookstore.
  • You can manually subscribe to the aggregate of all 4 bookstores via webcal://scottstuff.net/books/calendars/seattle.ics.
  • PHP iCalendar can provide an RSS feed for each individual calendar, but it doesn’t work very well for me right now; I’ll probably write an RSS feed directly when I have time.
  • The calendar page looks totally different from the rest of the site. I started to adjust the CSS for it, but it’s a bigger job than I feel like tackling right now.
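Splitting events into one calendar per store plus a combined one is mostly a grouping problem. A minimal sketch, assuming events carry a store name; `StoreEvent`, `calendar_filename`, and the per-store naming scheme are hypothetical, and only the `seattle.ics` name comes from the list above:

```ruby
# Hypothetical record; only the store name matters for this sketch.
StoreEvent = Struct.new(:store, :summary)

# Turns "Third Place Books" into "third-place-books.ics" (naming scheme assumed).
def calendar_filename(store)
  store.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-|-\z/, "") + ".ics"
end

# Returns { filename => events }: one entry per store plus the combined calendar.
def calendars_for(events, aggregate: "seattle.ics")
  per_store = events.group_by(&:store).transform_keys { |s| calendar_filename(s) }
  per_store.merge(aggregate => events)
end
```

Each value in the returned hash can then be handed to whatever writes the iCalendar output, so adding a store means adding a scraper and nothing else.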

I’ll probably add Barnes and Noble sometime this weekend.

no comments

Seattle Book Tours

Posted by Scott Laird Fri, 31 Oct 2003 09:50:10 GMT

I’ve missed readings by several of my favorite authors over the past year, largely because I haven’t had a good way to track who’s coming to town when. So, being partially insane, I threw together a little Ruby script to extract author visit information from 4 local bookstores and turn it into an iCalendar file, suitable for iCal or Mozilla’s Calendar.

The stores are:

  • University Books
  • Third Place Books
  • Elliott Bay Book Company
  • Seattle Mystery Bookshop

There are a bunch of little things that I need to do to make this usable for people, but it’s almost 2:00 AM, and it works well enough for me. I’ll add a web-based iCalendar reader, add per-store iCalendar files, and maybe an RSS feed later. Oh yeah, and add it to cron. Can’t forget to add it to cron.

Update: I’ve made quite a few changes since I wrote this. See the category index for details.

no comments

Berke Breathed is in town

Posted by Scott Laird Fri, 31 Oct 2003 00:59:49 GMT

Argh! Why doesn’t anyone tell me these things! If I knew in advance, I could work it into my schedule.

Berkeley Breathed will be signing his latest book today at two locations: 3:30-5:30 p.m. at University Book Store, 4326 University Way N.E., Seattle, 206-545-4361; and 7 p.m. at Third Place Books, 17171 Bothell Way N.E., Lake Forest Park, 206-366-3333 [Seattle Times]

Seriously, I’d love to have either an RSS feed of author reading/signing events or an iCalendar that I could point iCal at. If either exists, Google can’t find it.

no comments