Debugging without line numbers

A few days ago, someone asked me for some help tracking down a problem. While the problem itself isn’t interesting here, the techniques might be interesting to the Mac programmers out there.

With function names sanitized, here’s the question:

I have a crash log that looks like:

Thread 4 Crashed:
 #0   0x00000000 in 0x0
 #1   0x003eecf8 in FunctionOne (FunctionOne + 268)
 #2   0x003eeddc in FunctionTwo (FunctionTwo + 88)
 #3   0x003eb714 in 0x003eb714
 #4   0x90a3a168 in forkThreadForFunction (forkThreadForFunction + 108)
 #5   0x900247e8 in _pthread_body (_pthread_body + 40)

I have the source code that was used to build the binary. How can I translate the 268 offset in FunctionOne() above into an actual line number so I can tell where the exact crash is?

Rather than translate the offset into a line number, I tracked down the bug in the code. While having line numbers can make tracking down bugs a lot easier, they’re not always necessary if you know what to do.

I started by finding the binary in question and running otool -tV on it, piping the output to less. I searched for the definition of FunctionOne in the output and saw this:

...
_FunctionOne:
00013bec        cmpwi   cr7,r3,0x0
00013bf0        mfspr   r0,lr
...

We’re looking for offset 268, so I added 268 (0x10c) to 0x00013bec to get 0x00013cf8. Frame 1 in the backtrace is somewhere around there (plus or minus an instruction, depending on how addresses in backtraces are reported). Looking at the otool output again, I scrolled forward to 0x00013cf8 and saw this:
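The address arithmetic above can be sketched in a few lines of C. The helper names here are hypothetical and not part of any tool. The second helper rests on an assumption: that the gap between the crash log's runtime address (0x003eecf8) and the otool address is simply where the code happened to be loaded, so the same gap should apply to other frames from the same binary.

```c
#include <stdint.h>

/* Address in the otool listing that a backtrace frame refers to:
   the function's first instruction plus the byte offset from the log. */
uint32_t frame_file_address(uint32_t function_start, uint32_t offset)
{
    return function_start + offset;
}

/* Difference between the address in the crash log and the address in
   the otool listing. Assumption: this gap is constant for one binary. */
uint32_t load_offset(uint32_t runtime_address, uint32_t file_address)
{
    return runtime_address - file_address;
}
```

With the numbers from this crash, frame_file_address(0x00013bec, 268) gives 0x00013cf8, and load_offset(0x003eecf8, 0x00013cf8) gives 0x003db000.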

...
00013cf0        li      r6,0x0
00013cf4        bl      0xc860  ; symbol stub for: _LibraryFunction
00013cf8        or      r3,r30,r30
00013cfc        bl      0xc580  ; symbol stub for: _CFRelease
00013d00        or      r3,r29,r29
...

Since the backtrace included one stack frame past FunctionOne, I figured we must be inside a function call from FunctionOne. The address recorded for a calling frame is normally the return address, which is the instruction right after the call, so the call itself is one instruction earlier: 0x00013cf4 instead of 0x00013cf8. That means we crashed inside the call to LibraryFunction.
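Stepping back from a frame address to the call that produced it is a one-line calculation on PowerPC, because every instruction is exactly 4 bytes and bl saves the address of the following instruction as the return address. A minimal sketch, with a hypothetical helper name:

```c
#include <stdint.h>

/* Size of one PowerPC instruction in bytes (fixed width). */
#define PPC_INSTRUCTION_SIZE 4u

/* Given the return address shown for a calling frame, return the
   address of the bl instruction that made the call. */
uint32_t ppc_call_site(uint32_t return_address)
{
    return return_address - PPC_INSTRUCTION_SIZE;
}
```

Here ppc_call_site(0x00013cf8) is 0x00013cf4, the bl to the LibraryFunction stub in the listing above. This shortcut does not carry over to variable-width instruction sets, where the length of the call instruction has to be found by disassembling.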

Fortunately, I had the code for that library handy. I took a look at the code for LibraryFunction and saw this:

void LibraryFunction(void *arg1, int arg2,
	             MyCallbackFn callback, void *arg4) {
	int status = DoSomething(arg1, arg2, arg4);
	if (status != noErr) {
		(callback)(arg1, arg4);
	}
}

Think about what happens if callback is NULL. If we’re unlucky and DoSomething fails, we call through a NULL function pointer and jump to address 0. That produces a crash very similar to the one in the backtrace, with 0x0 as the topmost frame on the stack.
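One defensive option on the library side is to check the pointer before calling through it. This is only a sketch and not the fix that was actually applied to this bug. The typedef, the status constant, the DoSomething stand-in, and the counting callback are all assumptions made so the example compiles on its own.

```c
#include <stddef.h>

/* Assumed callback shape, inferred from the call (callback)(arg1, arg4). */
typedef void (*MyCallbackFn)(void *arg1, void *arg4);

/* Hypothetical success code standing in for noErr. */
enum { kStatusOK = 0 };

/* Hypothetical stand-in for DoSomething: it just returns arg2 as the
   status so the caller can force a failure for demonstration. */
static int DoSomething(void *arg1, int arg2, void *arg4)
{
    (void)arg1;
    (void)arg4;
    return arg2;
}

/* Same flow as LibraryFunction, but a NULL callback is skipped
   instead of being called. */
void LibraryFunctionGuarded(void *arg1, int arg2,
                            MyCallbackFn callback, void *arg4)
{
    int status = DoSomething(arg1, arg2, arg4);
    if (status != kStatusOK && callback != NULL) {
        callback(arg1, arg4);
    }
}

/* Example callback: counts invocations through arg4. */
void CountingCallback(void *arg1, void *arg4)
{
    (void)arg1;
    ++*(int *)arg4;
}
```

Whether silently skipping the callback is the right behavior depends on the library. Failing loudly, for example by returning an error for a NULL callback, may be a better choice when the callback is how the caller learns about the failure.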

So far, so good…but why was callback NULL? I took a look at the source for the original binary and saw this line at about the right place in FunctionOne:

	LibraryFunction(foo, 0, NULL, NULL);

And there’s the bug. Once the third argument was changed to be a real callback function, everything worked.

That’s my quick lesson for the day on tracking down bugs from crash logs for binaries that don’t include full debugging information. Hopefully it’ll be useful for somebody.

1 Comment

  1. Buzz Andersen Said,

    January 11, 2004 @ 8:18 am

    Wow–cool! Very useful, thanks!
