Archive for September, 2003

Two Synthesist essays

David Stutz has two new essays up on his site, one about software business models and the other about patents. Why didn’t anyone tell me?

Oh, well. I’ll just have to convince him to add an RSS feed or email announcements list eventually, I suppose.

Comments off

When “slower” doesn’t necessarily mean slower

Last week, when I had a bit of free time (well, as free as time gets when you’re looking for something interesting to do at 1 a.m.), I randomly glanced at a bug report somebody had filed saying that strstr was slow on Mac OS X.

This post isn’t about strstr performance, as interesting as that would be. Instead, it’s about how people (OK, programs) decide that things are slow.

The strstr bug report was filed because someone was trying to build a copy of procmail on Mac OS X. procmail has been around for a long time, and originally it ran on systems far less powerful than the slowest system supported by Mac OS X (a 233MHz iMac, for those keeping score at home). A long time ago (1994, actually, according to the procmail history file), the procmail developers apparently noticed that procmail’s performance was very dependent on the performance of the system’s strstr API. They sat down, wrote their own that they believed to be much faster, and all was good in the world.

They also added a test to their configure script to see if their strstr is faster than the one on the local system. If the local one is faster, they smartly use that one instead. It’s a nice bit of forward thinking — perhaps someday someone would write a better strstr, and they’d be ready.

The configure script prints out something like “Your system’s strstr is 1.20 times SLOWER than ours” or “Your system’s strstr is 1.60 times FASTER than ours”. The bug was filed because the person building procmail noticed that it printed out the “slower” line and figured that we probably needed a better strstr.

So far, so good. A quick glance shows that at least one other system found the procmail work to be useful — glibc’s strstr implementation is copied from procmail.

Wondering what about procmail’s implementation made it faster than Mac OS X’s strstr, I sat down to look at procmail’s configure test to see what it was doing. I figured it was probably doing a series of tests of strings with different lengths, including some search strings that don’t occur in the target at all, some that are at the beginning, some that are at the end, and so on.

Not so. procmail’s test builds a large block consisting of the letter ‘a’ over and over again with a few carriage returns thrown in, and then searches it for the string “From:\n”. It then tries that test a number of times with the system strstr and a single time with the procmail strstr.
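
For the curious, here is roughly what that single test case looks like in C. This is a minimal sketch, not procmail's actual configure code: make_haystack, needle_found, and the sizes passed to them are names and values I made up for illustration, and I've assumed '\n' as the line break so that it matches the search string.

```c
#include <stdlib.h>
#include <string.h>

/* Build a haystack of `len` bytes: mostly 'a', with a '\n' as the last
   byte of every `line_len`-byte line. Both parameters are illustrative,
   not procmail's real values. Returns NULL on allocation failure; the
   caller frees the result. */
char *make_haystack(size_t len, size_t line_len)
{
    char *buf = malloc(len + 1);
    if (buf == NULL)
        return NULL;
    memset(buf, 'a', len);
    if (line_len > 0)
        for (size_t i = line_len - 1; i < len; i += line_len)
            buf[i] = '\n';
    buf[len] = '\0';
    return buf;
}

/* The one case the test exercises: the needle's first character ('F')
   never appears in the haystack, so no position can even begin to match. */
int needle_found(const char *haystack)
{
    return strstr(haystack, "From:\n") != NULL;
}
```

Something like needle_found(make_haystack(1 << 20, 80)) walks the whole block and comes back empty-handed, which is the only path through strstr that this kind of test ever measures.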

Two interesting results jumped out here. First, the test is a single variant: a large target string and a short search string whose first letter doesn’t show up anywhere in the target string. That tells you almost nothing about most strstr usage, including most of procmail’s typical usage. In fact, it means that about 90% of the code in the procmail strstr implementation isn’t even invoked as part of the test.

Second, I was having trouble figuring out how the tests were run, so I tried a little trick. I made a second (unmodified) copy of the procmail strstr implementation and modified the script to use that copy instead of the system strstr. The first copy won almost every time. Then I reversed the two, so my copy ran second and the first copy was used instead of the system strstr. Now my copy won almost every time.

In other words, the test doesn’t test anything useful, and even if the results were useful, they’re very dependent on the order in which the strstr implementations are run (at least on Mac OS X on my PowerBook). In other other words, the configure script doesn’t actually tell us which implementation is faster. All I could conclude from all of this was that the test was broken. From this test alone, it’s simply impossible to reach any conclusion about the performance of Mac OS X’s strstr…or of procmail’s strstr.
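
The swap trick is easy to turn into a check. The sketch below is my own illustration, not anything from procmail's configure script: search_fn, time_search, order_stable_verdict, and compare_both_orders are hypothetical names. It times two implementations in both orders and only declares a winner if the same one wins both times.

```c
#include <string.h>
#include <time.h>

/* Same shape as strstr, so the system version can be passed directly. */
typedef char *(*search_fn)(const char *, const char *);

/* Time `reps` calls of `fn`; returns elapsed CPU time in seconds. */
double time_search(search_fn fn, const char *hay, const char *needle, int reps)
{
    char *volatile sink = NULL;  /* discourages the compiler from dropping the calls */
    clock_t start = clock();
    for (int i = 0; i < reps; i++)
        sink = fn(hay, needle);
    (void)sink;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Decide a winner from two runs, one in each order.
   Returns 1 if a was faster both times, -1 if b was (ties count for b),
   and 0 if the winner changed with the order, in which case the
   measurement says nothing about either implementation. */
int order_stable_verdict(double a_first, double b_second,
                         double b_first, double a_second)
{
    int a_wins_ab = a_first < b_second;   /* order: a, then b */
    int a_wins_ba = a_second < b_first;   /* order: b, then a */
    if (a_wins_ab && a_wins_ba)
        return 1;
    if (!a_wins_ab && !a_wins_ba)
        return -1;
    return 0;
}

/* Run a then b, then b then a, and report the verdict. */
int compare_both_orders(search_fn a, search_fn b,
                        const char *hay, const char *needle, int reps)
{
    double a_first  = time_search(a, hay, needle, reps);
    double b_second = time_search(b, hay, needle, reps);
    double b_first  = time_search(b, hay, needle, reps);
    double a_second = time_search(a, hay, needle, reps);
    return order_stable_verdict(a_first, b_second, b_first, a_second);
}
```

With what I saw on my PowerBook, where whichever copy ran first won, a check like this would return 0 rather than print a confident "FASTER" or "SLOWER".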

I guess the takeaway points are these: If you’re doing a test that compares the performance of two algorithms, be sure to use a data set that is at least vaguely representative of what the algorithms would see when run as part of your code. And make sure that your test doesn’t return different results when you change the order in which the algorithms are used. (I’d call that a “stable test”, like a stable sort, but I’m not sure if that’s a term already in use elsewhere.) And if you have a free minute and enjoy reading really complicated C code, go read the procmail strstr implementation. Performance aside, it’s pretty nifty.
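
Here is a rough idea of what a more representative data set might look like, along the lines of what I expected to find in the configure test. The cases and the names (search_case, cases, count_failures) are made up for illustration; a real test would want inputs drawn from actual mail and much longer targets. This sketch only checks that an implementation returns the right answers, but the same table could feed a timing loop.

```c
#include <stddef.h>
#include <string.h>

/* Same shape as strstr, so the system version can be passed directly. */
typedef char *(*search_fn)(const char *, const char *);

struct search_case {
    const char *haystack;
    const char *needle;
    long expected;   /* offset of the first match, or -1 if absent */
};

static const struct search_case cases[] = {
    { "From: a@example.com\nbody", "From:",   0 },  /* match at the start */
    { "header\nFrom: a",           "From:",   7 },  /* match in the middle */
    { "aaaaaaaaFrom:\n",           "From:\n", 8 },  /* match at the very end */
    { "FroFromFrom:",              "From:",   7 },  /* false starts before the match */
    { "aaaaaaaa\naaaa",            "From:\n", -1 }, /* no match at all */
    { "abc",                       "",        0 },  /* empty needle matches at 0 */
};

/* Returns the number of cases where `fn` disagrees with the expected offset. */
size_t count_failures(search_fn fn)
{
    size_t failures = 0;
    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        const char *hit = fn(cases[i].haystack, cases[i].needle);
        long got = hit ? (long)(hit - cases[i].haystack) : -1;
        if (got != cases[i].expected)
            failures++;
    }
    return failures;
}
```

Calling count_failures(strstr) should return 0 on any conforming system, and any replacement implementation ought to pass the same table before its speed is worth measuring.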

Comments (1)

Cutest dog ever

I don’t like dogs. I don’t know why. They just always seem too big or small or hyper or uncontrolled, etc., etc., etc. I can’t actually think of a single dog I’ve met that I really liked.

Until tonight. Sha Sha and Mariel had a housewarming party tonight, so I got to meet Enzo. Oh, my goodness is he cute. Wow. If all dogs were like him, I’d love to have one.

I asked Sha Sha if I could take him home as a party favor and she said no. I’m disappointed, but I guess I understand. After all, if you had the cutest dog in the world, wouldn’t you want to keep him?

Comments (5)

On working for Apple, or how to mail lots of people without them knowing

Chuq writes about a note he received asking him politely about what it’s like to work for Apple and how to find a job there. People ask these questions all the time yet the answers don’t change very often, so it’s helpful to have his answers online.

But that’s not why I’m writing this. You see, Alexei and I received the same message last week (from the same person). Before I talked to Alexei, I figured it looked like a personal message to me, so I took some time — there are a lot of questions — and wrote a detailed response.

As soon as I found out that it was sent to multiple people, I was a bit annoyed. I’m not sure if I should be upset at that, though; after all, it’s just a college student trying to find a job at the company he loves, and it’s tough to fault him for that. Still, I feel misled, since at least three of us spent some amount of time (and I don’t know about Alexei, but both Chuq and I spent a lot of time) answering these questions, thinking that we were the only source of input.

After I sent my response, I got a polite note back from the student asking additional questions. I’m torn — do I take time away from the other things that I need to get done to answer, or is the aggravation of not being told that the note was bulk-mailed enough to keep me from helping him out further? I’m not sure.

I’m sure of one thing, though. If you’re looking for help from a group of people who know each other or work together, don’t mail everyone in the group and act like you’re mailing each of them individually. If anyone wants advice I’m happy to offer it — I’ve helped three people get jobs at Apple in the past five months, after all — but I’d rather the person asking be upfront about who they’ve asked. I don’t fault anyone for wanting multiple perspectives — that’s a great idea — but I think they should mention it.

Or maybe I’m just being childish.

(On a related note, this is one reason not to have a publicly available list of Apple bloggers. I wonder how many people bulk-mail all of the known Microsoft bloggers.)

Comments (9)

Connections again

A while back, I noted that I wouldn’t be the first person from my high school class to make The New York Times or Time.

Today, thanks to a post at How Appealing, I know that I won’t be the first in my elementary school class to be featured at The Smoking Gun. Matt Toll, who I last saw when I went to his birthday party shortly after we moved to another township at the start of sixth grade, wrote an interesting letter to the law firms where he’s hoping to be interviewed. Matt was quite a ruffian at times — I could tell stories that those firms might not want to hear, if they’re inclined to make judgments about a candidate on the basis of the candidate’s behavior when he was eight — but he was always really smart and driven to succeed. He’ll make a terrific lawyer, and from the looks of it, he’ll end up with a firm that has a sense of humor. I wish him luck….

Comments (1)

Speed

I spent much of today trying to track down an issue that required that I build Qt from Fink on Panther. It turns out that we couldn’t find anything wrong after all, but that’s not the interesting bit.

Both Fink and Qt take a while to build. I started the build on my 800MHz PowerBook G4. It has 512MB of RAM. It’s a very nice system. Many people would say that it’s fast.

About an hour into the build, with Fink done and Qt well under way, someone suggested that I try the build on a new dual 2GHz G5 that had just shown up. I left the PowerBook running and started to download the necessary bits onto the G5. I had some networking trouble there (not on our end), so it took me a few minutes to get all of the pieces in place. Then I started the Fink build. It finished astonishingly quickly. The PowerBook was still building Qt. I started the Qt build on the G5.

Just as I did that, I remembered that Qt wouldn’t install without X11, so I grabbed that and installed it onto the G5. The PowerBook was still building Qt.

About 45 minutes later, the G5 finished the build. The PowerBook was still building Qt. I think the PowerBook finished about an hour after that.

In other words, wow was that G5 fast. Sure, you’d expect it to be a lot faster based on the specs, but it’s one thing to read the specs and another to see it in person. Better yet, all of this was with only one processor in use because the Fink and Qt builds aren’t multi-threaded. Imagine what it could do with something that would actually use both processors. Wow.

Comments (3)

Comics!

Judi Sohn points to Tapestry, which is a collection of RSS feeds for comic strips. How cool is that? Better yet, it has feeds for my two favorites, Calvin & Hobbes and FoxTrot. There’s now a Comics group in my NetNewsWire subscriptions….

Comments (1)

Pictures in Mail

Dan Wood explains how to add a picture to your outgoing email so people who use Apple’s Mail to read their email will see the picture with each message they receive from you. At Apple, something’s configured on the internal network so this works automatically for anyone who wants to have a picture added to their email. It’s a nifty feature, especially because it works without adding an attachment or anything more than a few dozen bytes to the message.

Comments (4)

A more accurate history of American law

In response to Alabama Supreme Court justice Roy Moore’s crusade for the Ten Commandments and his supporters’ claim that they’re a foundation of the American legal system, Marci Hamilton explains the basic history of American law. In short, Moore et al. couldn’t be more wrong. Not that any of his supporters will pay attention to this, of course, but at least those of us who can think cogently about the issue will have a better idea of our country’s legal history after reading the article.

Comments off

Randomly finding an interesting site

A few days ago, Teresa Nielsen Hayden linked to Phluzein. Phluzein is an archeology weblog. It’s really, really good. I’ve always been interested in archeology, but news from the field only shows up in mainstream media every few months, it seems. The author of Phluzein posts a lot more often than that.

It’s really amazing how many niche web sites exist. If only there were some way to magically find all of the ones that I’d want to read….

Comments (1)
