Tiger: Safari saves extra spotlight metadata

Posted by Scott Laird Sat, 07 May 2005 03:12:22 GMT

According to a post on 0xDECAFBAD, Safari records a bit of extra metadata onto every file that you download:

$ mdls QS.2AD4.dmg 
QS.2AD4.dmg -------------
kMDItemAttributeChangeDate     = 2005-05-06 12:40:37 -0700
...
kMDItemWhereFroms              = (
    "http://imajes.info/quicksilver/application/QS.2AD4.dmg", 
    "http://quicksilver.blacktree.com/"
)

Notice the kMDItemWhereFroms entry–that’s Safari tagging the file to remind you where it came from. I didn’t think that was possible in Tiger–Safari just downloaded the file; it doesn’t own it, and it doesn’t provide a Spotlight indexing driver for the file. Somehow, it still managed to add extra metadata of its own onto the file; I thought we were going to have to wait for 10.5 before we could do this.
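To pull that field out of mdls output programmatically, here is a minimal Python sketch. The function name parse_mdls and its parsing rules are assumptions based only on the sample output shown above, not on any documented format, so treat it as illustrative rather than robust.

```python
# Sample text copied from the mdls output above (the "..." line is the
# elision from the original listing and is simply skipped by the parser).
SAMPLE = '''QS.2AD4.dmg -------------
kMDItemAttributeChangeDate     = 2005-05-06 12:40:37 -0700
...
kMDItemWhereFroms              = (
    "http://imajes.info/quicksilver/application/QS.2AD4.dmg", 
    "http://quicksilver.blacktree.com/"
)
'''


def parse_mdls(text):
    """Parse mdls-style 'key = value' text into a dict (hypothetical helper).

    Scalar values become strings; multi-line parenthesised lists become
    Python lists of strings. Lines that are not attributes are ignored.
    """
    attrs = {}
    lines = iter(text.splitlines())
    for line in lines:
        key, sep, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if not sep or not key.startswith("kMDItem"):
            continue  # header line, elision, or anything unexpected
        if value == "(":
            # Multi-line list: consume lines until the closing parenthesis.
            items = []
            for item in lines:
                item = item.strip().rstrip(",").strip()
                if item == ")":
                    break
                items.append(item.strip('"'))
            attrs[key] = items
        else:
            attrs[key] = value.strip('"')
    return attrs


if __name__ == "__main__":
    for url in parse_mdls(SAMPLE)["kMDItemWhereFroms"]:
        print(url)
```

Lists that mdls prints on a single line are kept as plain strings here; splitting those would need extra handling.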

Unfortunately, the extra tags are fragile. I made two copies (one using the Finder, one using cp), and neither copy has a kMDItemWhereFroms field. That’s what you’d expect if Safari cheated and stuck an extra field into Spotlight’s DB, but it’s not how a “real” metadata system would behave. Hopefully 10.5 will allow metadata to belong to the file itself, not just the Spotlight index, so that copying it will preserve the extra metadata.


Tiger: what's next?

Posted by Scott Laird Mon, 02 May 2005 20:22:43 GMT

I’ve been thinking a bit about what Apple’s going to do as a follow-up to Tiger. Assuming that Apple sticks with an 18-month cycle for 10.5 (“Lion”?), we’ll be getting our hands on it slightly before Longhorn’s release date. Given that, it’s pretty clear that Apple is already thinking about what they want in 10.5 so they can trump Microsoft’s Longhorn offerings in the media. More than any previous release of OS X, I expect this one to be flashy; if Tiger was really an attempt to revamp the developer core of OS X, then I expect 10.5 to focus on the user experience.

A few things that I expect to see:

  • Improved metadata support. Just like John Siracusa has been saying, metadata is important. With Spotlight, we now have a framework for searching and storing file metadata. With 10.5, I expect to see the ability to tag files with specific metadata tags, like project name, priority, status, and so on. This is a feature that was pulled from Longhorn because of time constraints; Apple should be able to jump way ahead of Windows without a lot of work in this area.

  • Networkable Spotlight. Right now, Spotlight only works on local filesystems. Adding network support for Spotlight searching (and all of the other metadata that goes with it) will finally give people a decent reason to buy OS X servers.

  • A Finder replacement. Once we have the new metadata engine, we’re going to need a UI for managing it. I’m not sure if Apple will simply provide us with “Finder 2.0,” or if they’ll produce a radical new file manager and then give users the ability to revert to the old Finder if they don’t want to upgrade. Either way, if Apple finishes their metadata back-end, then the Finder really needs to be updated to match. Since Apple hasn’t really put much work into the Finder since 10.0, and there are boatloads of customer complaints about the current Finder, I think it’s time for it to be rewritten from the ground up, with the new version centered around metadata and searching.

  • More syncing and disconnected support. Windows has supported disconnected operation on network shares for years. That way, you can have a network share for some project, but have Windows maintain a local cache so you can still edit documents while you’re on a plane or otherwise unable to connect to the network. It’s a great feature, and Apple doesn’t have anything even vaguely like this right now.

  • Full 64-bit support. With Tiger, Apple has some support for building 64-bit applications, but you can’t use any of the UI frameworks from 64-bit apps. I fully expect 10.5 to provide equal support for all current APIs from 32- or 64-bit code, and I expect that Xcode will have simple support for building “fat” 32/64 applications. If Apple leaves 64-bit apps out in the cold, then they’re giving Microsoft a big opening, and we’ll see claims that OS X isn’t “a real 64-bit operating system.” By 2007, at least 80% of new PCs and Macs will come with 64-bit CPUs; even if most apps won’t need to support 64-bit address spaces, it’d be a major mistake not to give developers the tools that they need to build 64-bit apps.

  • Resolution-independent displays. Right now, most UI elements are sized in pixels; that works great when all monitors have roughly the same DPI, but it falls down horribly when displays range from 72 to 300 DPI. Switching from a 100 DPI display to a 200 DPI display currently means that all of your UI elements (menu bar, window decorations, icons) shrink to half of their current size. Tiger has the ability to change this hidden inside the developer tools, but most apps have a hard time dealing with it right now. By the time 10.5 rolls around, I expect the OS to automatically adjust the UI size based on the DPI of the output device; this will let us have HD-resolution displays on 15-inch PowerBooks without needing to ship a magnifying glass with every new laptop.
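To put numbers on the resolution-independence point above, here is a small Python sketch of the arithmetic. The 22-pixel menu bar height and the function names are illustrative assumptions; nothing here reflects how Apple actually implements UI scaling.

```python
def physical_inches(pixels, dpi):
    """Physical size of a UI element that is `pixels` tall on a `dpi` display."""
    return pixels / dpi


def scaled_pixels(pixels, target_dpi, design_dpi):
    """Pixels needed on the target display to keep the same physical size."""
    return round(pixels * target_dpi / design_dpi)


MENU_BAR_PX = 22  # illustrative height, assumed for this example

if __name__ == "__main__":
    for dpi in (100, 200):
        size = physical_inches(MENU_BAR_PX, dpi)
        print(f"{MENU_BAR_PX} px at {dpi} DPI is {size:.2f} inches tall")
    # Keeping the 100 DPI physical size on a 200 DPI panel needs twice the pixels.
    print(scaled_pixels(MENU_BAR_PX, target_dpi=200, design_dpi=100), "px")
```

With pixel-sized elements, doubling the DPI halves the physical size; a resolution-independent UI would apply the scale factor instead.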


Tiger: Spotlight and EXIF

Posted by Scott Laird Sat, 30 Apr 2005 08:49:53 GMT

According to Apple, Spotlight is supposed to be able to index EXIF data. Unfortunately, none of my JPEGs’ EXIF data has been indexed. Even worse, some of the Spotlight metadata fields are clearly bogus, which points to bugs in Spotlight’s implementation.

To test Spotlight’s EXIF abilities, I used a JPEG from my D60 that I had annotated with EXIF and IPTC data using iView Media Pro. Using the mdls command-line tool, I asked for details on one of my images, and saw something like this:

halloween-2004-173.JPG -------------
kMDItemAttributeChangeDate = 1970-01-02 17:40:06 -0800
kMDItemFSContentChangeDate = 2004-10-30 22:43:23 -0700
kMDItemFSCreationDate  = 2004-10-30 22:43:23 -0700
kMDItemFSCreatorCode   = 0
kMDItemFSFinderFlags   = 0
kMDItemFSInvisible     = 0
kMDItemFSLabel         = 0
kMDItemFSName          = "halloween-2004-173.JPG"
kMDItemFSNodeCount     = 0
kMDItemFSOwnerGroupID  = 20
kMDItemFSOwnerUserID   = 501
kMDItemFSSize          = 1712816
kMDItemFSTypeCode      = 0
kMDItemID              = 3341125
kMDItemLastUsedDate    = 2004-10-30 21:43:23 -0700
kMDItemUsedDates       = (2004-10-30 21:43:23 -0700)

Notice that all of the metadata provided is simply generic file data–none of this is image-specific. There is no EXIF data, and no indication that Spotlight knows that this is an image. Also, the kMDItemAttributeChangeDate is weird–it’s not impossible that my clock had lost its setting and I hadn’t noticed for two days, but it’s unlikely.

Worse, after making a minor change to the JPEG and then waiting for the re-indexer to run, the Attribute Change Date became completely bogus, changing nearly every time I ran mdls. Here are a few sample values:

kMDItemAttributeChangeDate = 1969-12-31 16:00:06 -0800
kMDItemAttributeChangeDate = 125748-10-16 05:23:09 -0800
kMDItemAttributeChangeDate = 1976-10-14 07:29:24 -0700

I particularly like the way that it jumps from 6 seconds after the Unix epoch to roughly 123,000 years into the future and then back into the ’70s again.
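For anyone who wants to check the epoch arithmetic, here is a minimal Python sketch that converts the mdls date strings above into seconds since the Unix epoch. The helper name epoch_seconds is hypothetical, and Python’s datetime cannot represent years past 9999, so the year-125748 value is reported as unparseable rather than converted.

```python
from datetime import datetime

# Date layout as it appears in the mdls listings above.
MDLS_DATE_FORMAT = "%Y-%m-%d %H:%M:%S %z"


def epoch_seconds(stamp):
    """Return whole seconds since the Unix epoch, or None if unparseable."""
    try:
        parsed = datetime.strptime(stamp, MDLS_DATE_FORMAT)
    except ValueError:
        # For example a six-digit year, which datetime does not support.
        return None
    return int(parsed.timestamp())


if __name__ == "__main__":
    samples = (
        "1969-12-31 16:00:06 -0800",    # 6 seconds after the epoch
        "125748-10-16 05:23:09 -0800",  # out of range for datetime
        "1976-10-14 07:29:24 -0700",
    )
    for stamp in samples:
        print(stamp, "->", epoch_seconds(stamp))
```

The same function shows that the 1970-01-02 17:40:06 -0800 value from the first listing is only 178806 seconds (a little over two days) past the epoch.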

Even after telling iView Media Pro to re-write all of the EXIF and IPTC data in the file, mdls still doesn’t show any EXIF details. This is disappointing–I’d hoped to be able to use Spotlight to locate images in my photo collection, possibly searching by date, time, focal length, lens, location, subject, and so on.

As I see it, one of two things can be happening here. Either this is a bug in 10.4.0, or Spotlight’s indexing system works in two passes–first indexing files’ existence, and then looping back around to index the data inside the file. It looks like mdimport is still running on something on my box, even though the Spotlight dropdown claims to be finished.

Update: I think it’s running in two passes. If I manually run mdimport on my JPEG, then a bunch of EXIF data shows up in mdls:

halloween-2004-173.JPG -------------
kMDItemAcquisitionMake     = "Canon"
kMDItemAcquisitionModel    = "Canon EOS D60"
kMDItemAperture            = 6
kMDItemAttributeChangeDate = 2005-04-29 18:49:35 -0700
kMDItemBitsPerSample       = 32
kMDItemCity                = "Woodinville"
kMDItemColorSpace          = "RGB"
kMDItemContentCreationDate = 1903-12-31 16:00:00 -0800
kMDItemContentModificationDate = 2004-10-30 22:43:23 -0700
kMDItemContentType         = "public.jpeg"
kMDItemContentTypeTree     = ("public.jpeg", "public.image", "public.data", "public.item", "public.content")
kMDItemCopyright           = "2004 Scott Laird"
kMDItemCountry             = "USA"
kMDItemDisplayName         = "halloween-2004-173.JPG"
kMDItemEXIFVersion         = "2.2"
kMDItemExposureMode        = 1
kMDItemExposureTimeSeconds = 0.005
kMDItemFlashOnOff          = 1
kMDItemFocalLength         = 23
kMDItemFSContentChangeDate = 2004-10-30 22:43:23 -0700
kMDItemFSCreationDate      = 2004-10-30 22:43:23 -0700
kMDItemFSCreatorCode       = 0
kMDItemFSFinderFlags       = 0
kMDItemFSInvisible         = 0
kMDItemFSLabel             = 0
kMDItemFSName              = "halloween-2004-173.JPG"
kMDItemFSNodeCount         = 0
kMDItemFSOwnerGroupID      = 20
kMDItemFSOwnerUserID       = 501
kMDItemFSSize              = 1712816
kMDItemFSTypeCode          = 0
kMDItemHasAlphaChannel     = 0
kMDItemID                  = 3341125
kMDItemISOSpeed            = 7.64386
kMDItemKind                = "JPEG Image"
kMDItemLastUsedDate        = 2004-10-30 22:43:23 -0700
kMDItemPixelHeight         = 3072
kMDItemPixelWidth          = 2048
kMDItemRedEyeOnOff         = 0
kMDItemResolutionHeightDPI = 180
kMDItemResolutionWidthDPI  = 180
kMDItemStateOrProvince     = "WA"
kMDItemUsedDates           = (2004-10-30 22:43:23 -0700)
kMDItemWhiteBalance        = 1

This still isn’t perfect, though–I’m not quite sure where kMDItemISOSpeed = 7.64386 came from–it’s really ISO 100. Similarly, mdls reports the Aperture as 6 when it should be f/8. On the other hand, the focal length and shutter speed are correct, although it’s sort of weird to see shutter speed expressed as decimal seconds.
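One possible reading of the aperture number, which nothing in this post confirms: EXIF has an ApertureValue field stored in APEX units, where Av = 2·log2(N) for f-number N, and f/8 works out to exactly Av = 6. If Spotlight is reporting that field rather than the f-number, the 6 is expected. The Python sketch below shows the conversion, plus turning the decimal shutter time back into the familiar fraction. The function names are hypothetical. The ISO figure is not explained by this; 7.64386 is numerically log2(200), which does not obviously correspond to ISO 100.

```python
from fractions import Fraction


def apex_to_f_number(av):
    """Convert an APEX aperture value (Av) to an f-number: N = 2 ** (Av / 2)."""
    return 2 ** (av / 2)


def shutter_fraction(seconds, max_denominator=8000):
    """Express a decimal exposure time as a fraction of a second."""
    return Fraction(str(seconds)).limit_denominator(max_denominator)


if __name__ == "__main__":
    print(apex_to_f_number(6))      # 8.0, i.e. f/8
    print(shutter_fraction(0.005))  # 1/200
```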

Update 2: It’s definitely a two-pass thing. After leaving my laptop running overnight, all of my JPEGs now have a full set of metadata associated with them. The ISO numbers and apertures are still wrong, but the rest of the data’s all there. So, it looks like Spotlight tries really hard to get basic data into the DB, and then makes a second pass filling in more details as time permits. Good to know–I haven’t seen this documented anywhere.
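A rough way to tell which pass a file has been through is to check whether mdls reports a content type yet. The Python sketch below is a heuristic drawn only from the two listings in this post (the first has no kMDItemContentType, the second does); the function names are hypothetical and this is not documented Spotlight behavior.

```python
def metadata_keys(mdls_text):
    """Collect the kMDItem* attribute names found in mdls-style text."""
    keys = set()
    for line in mdls_text.splitlines():
        key, sep, _ = line.partition("=")
        key = key.strip()
        if sep and key.startswith("kMDItem"):
            keys.add(key)
    return keys


def content_pass_done(mdls_text):
    """Heuristic: treat the file as fully indexed once a content type appears."""
    return "kMDItemContentType" in metadata_keys(mdls_text)


# Abbreviated excerpts from the two listings above.
FIRST_PASS = '''halloween-2004-173.JPG -------------
kMDItemFSName          = "halloween-2004-173.JPG"
kMDItemFSSize          = 1712816
'''

SECOND_PASS = '''halloween-2004-173.JPG -------------
kMDItemAcquisitionMake     = "Canon"
kMDItemContentType         = "public.jpeg"
kMDItemFSName              = "halloween-2004-173.JPG"
'''

if __name__ == "__main__":
    print(content_pass_done(FIRST_PASS))
    print(content_pass_done(SECOND_PASS))
```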


Tiger stuff

Posted by Scott Laird Mon, 28 Jun 2004 20:48:15 GMT

A few notes on Tiger:

  • Ooooh. Search. Search has been one of my big things lately, and I like what I’ve seen of Tiger so far, but it’s too early to tell how well it’ll work. Fundamentally, searching seems to scale better than strict hierarchical organization. For instance, with a good search tool, it’s faster to search through the 100,000 or so old email messages that I have lying around than it’d be to change folders and skim through a couple dozen messages by hand. The big problem is that search tends to be resource-intensive–I’ve been playing with Zoe, Quicksilver, and HistoryHound, and they each end up wanting over 100 MB of RAM. Fundamentally, there’s no real need for this, and we’ll see how Apple does with Tiger. I’m hopeful, but I’m used to disappointment. Specifically, I want to see what Mail.app lets me do with smart folders: can I tag messages with tags like ‘Important’ or ‘To-Do List’ and get smart folders that show me all of the ‘To-Do List’ items? There’s no real indication that Apple is going to let us add generic metadata, and that’s a pity; it’ll have to wait for ‘Lion’ or ‘Tabby’, or whatever comes after Tiger.

  • 64-bit application support. This isn’t a huge thing for most people today, but for some types of applications, it’s utterly critical. Anything that wants to use more than 4 GB of RAM needs it, and it starts getting useful around 1 GB, generally. It’s the way of the future, and it’s nice to see that it’s showing up now; in another two years, it’s going to be important to all of us.

  • cp understands Mac OS resource forks. Finally. Files are files; the fact that copying Mac-specific files with Unix tools tended to destroy bits of them was kind of irritating.

  • Safari has an RSS reader. After watching Apple’s RSS movie, I’m not really sure about this one–it’s a neat feature, but it pales in comparison to NetNewsWire. Frankly, RSS belongs in Mail, not Safari.

  • Real-time video effects using the GPU. Cool, but not terrifically useful to me, particularly with my underpowered PowerBook 550.

  • iSync SDK. ABOUT FSCKING TIME. The Zaurus people have been trying to write an iSync plugin for years, but haven’t had any documentation. Personally, I’d love to see what happens once you graft bits of MultiSync into iSync–you should end up with free calendar and address book synchronization between Macs, Linux systems, PocketPCs, and whatever else MultiSync supports now.

  • iChat supports conferencing. Yeah, but does it support non-AIM SIP servers? It’s totally unusable for me right now, between generic NAT problems and Asterisk wanting port 5060 on my firewall. It’d be really nice if I could use iChat as a softphone with Asterisk.

Apparently, it’s all shipping in 1H2005, or up to a year away. It’ll probably end up being February-ish, if they follow their Jaguar/Panther shipping trend. That’s a long time to wait for the handful of features that I’d really like to see (mostly the Spotlight search tools), and as always, there’s the $129 question–is the upgrade really worth it?
