Archive for March, 2004

Migrating my life to Mail and IMAP

I’ve been thinking of moving my mail out of Eudora for a long time now. With every passing year, Eudora gets only slightly better, while its flaws seem to become more noticeable. This weekend I finally made the move. Actually, I haven’t finished yet. Moving 340,000 messages while keeping metadata (replied-to markers and other such things) intact is a large undertaking, and I’m probably only about 30% done.

What finally spurred me to change was the server that a couple of us set up last month. Thanks to that, I can now get my mail via IMAP, which I’ve wanted for a long time. (I can also check it via webmail and SSH. Yes, I can have my cake and eat it too, as long as it’s an email cake.) Mail handles IMAP quite well and also satisfies most of the constraints I mentioned in my post a year ago. Also, Mail keeps getting better and better with every Mac OS X release.

The tricky part about migrating the past eight years of my life from POP in Eudora to IMAP in Mail is getting all of the details right. Moving the messages themselves would be easy — I’d just copy them onto the server and be done with it. But I want my metadata, nicknames, filters, and everything else. Thank goodness for Andreas Amann and his Eudora Mailbox Cleaner. I’m shocked that Eudora Mailbox Cleaner is free. I’d gladly give him $50 for the time it saved me.

That sort of took care of nicknames and filters (they still required some patching), and it managed to move my mail to Mail’s local storage, but it didn’t move the mail to my IMAP server. As I mentioned, I could’ve copied everything to the server, but I don’t know how Mail stores metadata, so I have no reason to think that scp’ing the mailboxes would preserve that information. Instead, I’ve been dragging and dropping mailboxes one at a time. With a maximum upload speed of about 30 KB/sec. from my home connection, which works out to about four or five messages a second, copying 340,000 messages will take quite a long time. I think I’ll finish all of it sometime this week. Then I can get started on migrating this weblog.

Birthday!

Last year on March 15th, I celebrated my birthday. By some strange coincidence, I’m having a birthday on the same day this year. (And still without Entenmann’s St. Patrick’s Day cupcakes, sadly.)

I turn 26 today. I’m not too sure what to think about that. A bunch of my friends are getting married and it seems like many of the rest are about to go back to school, so if I were to judge myself by what my friends are up to, I’d be very confused. I won’t do that, then.

I have to say that Amazon Wish Lists are great. Everyone who’s given me a gift so far has at least looked at my wish list, and the result has been three books: two from my list and one that I’m told is better than anything on my list. Looks like I have a lot of reading to do. (Speaking of which, I’m proud that I read some Real Literature this past week. I picked up John Steinbeck’s The Moon is Down after a trip to the National Steinbeck Center back in December. It’s a very good book.)

In case you’re curious, here are the books I’ve received so far:

I’ve also celebrated my birthday by contributing to a Roth IRA, registering a domain name, and finishing my taxes. I guess these 26th birthdays get celebrated somewhat differently than the ones I had in, say, elementary school.

Wasting time on micro-optimizations

Richard Schaut writes about an appallingly stupid interview question — asking the candidate to explain what this line of code does:

a ^= b ^= a ^= b;

As Richard explains, the question is stupid because the expression’s behavior is undefined: it modifies a more than once without an intervening sequence point. The resulting code may do anything from swap a and b to cause a butterfly’s wings to flap in China; either behavior would be correct.

That’s only sort of the point, though. You’d only write this line if you were trying to swap two variables and you figured that this would be faster than creating a temporary variable to store one in while you copied the other. Richard mentions that a modern compiler will optimize things well enough that the temporary variable solution may actually be faster, so you wouldn’t be helping things at all.
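For comparison, here is a minimal C sketch of the two approaches. The function names swap_xor and swap_temp are made up for illustration. The xor version is split into three statements so that its behavior is defined, unlike the one-liner above. Even so, it zeroes the value when both pointers refer to the same object, a problem the temporary-variable version doesn’t have.

```c
/* Xor swap written as three separate statements, so each assignment
   completes before the next one starts. Caveat: if a and b point to
   the same object, the first statement sets it to 0 and it stays 0. */
void swap_xor(unsigned *a, unsigned *b)
{
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
}

/* The plain version: easy to read, and safe even when a == b. */
void swap_temp(unsigned *a, unsigned *b)
{
    unsigned tmp = *a;
    *a = *b;
    *b = tmp;
}
```

Whether either one is faster depends on the compiler and the target, which is exactly why it isn’t worth guessing.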

Even that’s not the point. The point is that a good programmer shouldn’t even think about optimizing something like this in the first place. Lots of people have their own favorite tricks — things like ++i being faster than i++, or macros being faster than function calls. Cute as those kinds of things may be, they sacrifice readability and provide an insignificant performance win.

If you’re coding for performance, you should figure out what’s slow before worrying about tweaking your code. Find the scenario that should be faster, measure it, and see what’s slow. I bet it won’t be variable copying. In most cases, it’ll either be algorithmic or architectural. Either you’re using an inefficient algorithm to accomplish a task that could be done faster in a different way or part of your application’s design forces you to do things slowly. Correcting those problems will give you far greater performance gains than any cascading series of xors ever could.
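As a rough illustration of “measure it,” the sketch below times a single call using the standard C clock() function. The names time_call and sample_work are hypothetical, and sample_work is only a stand-in for whatever scenario feels slow. A real profiler will tell you much more, but even something this crude beats guessing.

```c
#include <time.h>

/* Hypothetical helper: run work(arg) once and return the processor
   time it used, in seconds, as reported by clock(). */
double time_call(void (*work)(void *), void *arg)
{
    clock_t start = clock();
    work(arg);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Stand-in workload: adds 0 through 999999 to the counter that arg
   points to. The volatile qualifier keeps the compiler from folding
   the loop away. */
void sample_work(void *arg)
{
    volatile unsigned long long *sum = arg;
    for (unsigned long long i = 0; i < 1000000ULL; i++)
        *sum += i;
}
```

Calling time_call(sample_work, &sum) before and after a change gives a number to compare instead of a feeling.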

All of this is at the forefront of my mind right now because I’ve been spending some of my free time over the past few weeks helping out with a new version of an internal application that has some performance issues. Everyone knew it felt slow, but nobody sat down to measure it. I did, and the first thing that jumped out was that it spent far too long getting strings out of XML data. How long? Well, the algorithm it was using looked like this and was invoked many times for every new document:

 Walk through all children of the current node to count them
 for i = 1 to numChildren
 	Walk through all children until we get to child i
 	Get the text of child i

They had a function elsewhere that would apply another function to all children of a node. With that, rather than walking through all of the children and then repeating that again for every child, they could just do it once. I quickly wrote the code, measured it again, and ended up with a performance improvement of more than 10x for my test case. A document that previously took more than three minutes to open now opens in about fifteen seconds. You’d be hard-pressed to get a similar improvement from any sort of micro-optimization.
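The difference can be sketched in C. Everything here is an assumption made for illustration: the struct node layout, the function names, and the step counting are not the real application’s code, which isn’t shown in this post. Both functions hand every child to a callback, and both return how many child nodes they touched along the way so the cost is visible.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical minimal tree node: children form a singly linked list. */
struct node {
    const char *text;
    struct node *first_child;
    struct node *next_sibling;
};

typedef void (*visit_fn)(const struct node *child, void *ctx);

/* Mirrors the slow algorithm: count the children, then restart from
   the first child for every index. Touches n + n(n+1)/2 nodes. */
size_t visit_children_slow(const struct node *parent, visit_fn visit, void *ctx)
{
    size_t steps = 0, count = 0;
    for (const struct node *c = parent->first_child; c; c = c->next_sibling) {
        count++;
        steps++;
    }
    for (size_t i = 0; i < count; i++) {
        const struct node *c = parent->first_child;
        steps++;
        for (size_t k = 0; k < i; k++) {
            c = c->next_sibling;
            steps++;
        }
        visit(c, ctx);
    }
    return steps;
}

/* Single pass: apply the callback to each child as it is reached.
   Touches exactly n nodes. */
size_t visit_children_fast(const struct node *parent, visit_fn visit, void *ctx)
{
    size_t steps = 0;
    for (const struct node *c = parent->first_child; c; c = c->next_sibling) {
        steps++;
        visit(c, ctx);
    }
    return steps;
}

/* Sample callback: adds the length of the child's text to a size_t total. */
void add_text_length(const struct node *child, void *ctx)
{
    *(size_t *)ctx += strlen(child->text);
}
```

With four children, the slow version touches 14 nodes and the single pass touches 4. The gap grows quadratically with the number of children, which is the kind of thing that turns seconds into minutes.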

That isn’t to say that micro-optimizations are never useful. If you call a three-line function millions of times, your code will be faster if you inline the function instead of calling it as a real function. But you’ll probably get a much bigger win if you call the function half as often instead…and you shouldn’t waste your time at all unless you know that calling the function is affecting your application’s performance in the first place.

Good timing, sort of

If you’re going to lose one — and everyone does, eventually — now is the best time left in the season to do it. Oh, well. Washington certainly played a better game.
