I had a program that was crashing at the customer site, but not in any way I could reproduce back at the office. I wasn’t able to load up a dev environment on the affected box, but someone had the bright suggestion of running Dr Watson (DrWtsn32.exe), and checking out the stack trace. As we all know, if you know your codebase well enough, you can almost explain a crash just by where it happened.
So I created an exe with debug symbols included and optimizations turned off, and waited for the crash. The DrWtsn32.log contained, amongst other things, the dump of the thread that caused the program crash:
*----> Stack Back Trace <----*
FramePtr ReturnAd Param#1 Param#2 Param#3 Param#4 Function Name
0100F9E8 00438172 01010460 0100FBE0 0100FB00 00D85698 !<nosymbols>
0100FAF4 0048DD45 01010460 0100FC54 0100FBFC 00D85698 !<nosymbols>
0100FBEC 004C3DBE 0106D028 00000114 0100FC60 0100FC64 !<nosymbols>
0100FC54 004C3C17 00D8AFC0 00000114 0100FD60 0100FD70 !<nosymbols>
0100FCDC 004CBDE9 010101F0 00000110 0100FDE0 0100FD70 !<nosymbols>
0100FD60 004C6FC5 010101F0 00000110 0100FE44 0100FDF0 !<nosymbols>
0100FDE0 004B63BD 010101F0 00000110 0100FEEC 0100FF00 !<nosymbols>
0100FE50 004B61B6 010101F0 00000110 0100FF7C 0100FF00 !<nosymbols>
0100FEEC 004B5C9C 010101F0 00000110 00000001 000001FB !<nosymbols>
0100FF7C 1020BFD2 00D84058 000001FB 00130178 00D85698 !<nosymbols>
0100FFB4 77E8B2D8 00D85698 000001FB 00130178 00D85698 !beginthreadex
0100FFEC 00000000 1020BF20 00D85698 00000000 00000008 kernel32!lstrcmpiW
That added nothing to my understanding of what went wrong, beyond the fact that it happened at some point after the thread started. Great. Did I mention I’d done a debug build, the kind with all the symbols compiled in? Anyway, I figured switching to the binary output (which I understand is like a Unix core file) might provide further explanation.
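Even without symbols, the ReturnAd column of a trace like the one above still holds the raw return addresses. As a minimal sketch (the function name `parse_backtrace` and the regular expression are illustrative assumptions, not part of Dr Watson or any Microsoft tool), a few lines of Python can pull those addresses out of a DrWtsn32.log so they can be compared against whatever address information is to hand:

```python
import re

# One row of the "Stack Back Trace" table: six 8-digit hex columns, then the name.
# This pattern is an assumption based on the log excerpt above.
ROW = re.compile(r"^((?:[0-9A-Fa-f]{8}\s+){6})(\S.*)$")

def parse_backtrace(text):
    """Return a list of (frame_ptr, return_addr, function_name) tuples."""
    frames = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue  # header, banner, or unrelated log line
        cols = m.group(1).split()
        frames.append((int(cols[0], 16), int(cols[1], 16), m.group(2).strip()))
    return frames

# Two rows copied from the trace above, plus the header line.
SAMPLE = """\
FramePtr ReturnAd Param#1 Param#2 Param#3 Param#4 Function Name
0100F9E8 00438172 01010460 0100FBE0 0100FB00 00D85698 !<nosymbols>
0100FFB4 77E8B2D8 00D85698 000001FB 00130178 00D85698 !beginthreadex
"""

for frame_ptr, return_addr, name in parse_backtrace(SAMPLE):
    print(f"{return_addr:08X}  {name}")
```

The header row is skipped because "FramePtr" is not an eight-digit hex value, so only real frames come back.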
The format of the binary output from Dr Watson is lost in the mists of time. I had to go to an NT4 server install CD (you have to love the pack-rats who keep the German Server version of NT4 ©1999!) to locate the two necessary files for working with the user.dmp file it generates. The files are DUMPCHK.EXE and DUMPEXAM.EXE.
Running DUMPCHK.EXE gives a little output:
C:\...\RetailGateway>dumpchk user.dmp
Filename . . . . . . .user.dmp
Signature. . . . . . .USER
ValidDump. . . . . . .DUMP
MajorVersion . . . . .5
MinorVersion . . . . .0
DirectoryTableBase . .0x0000014c
PfnDataBase. . . . . .0x00000004
PsLoadedModuleList . .0x00000018
PsActiveProcessHead. .0x0000009a
MachineImageType . . .NumberProcessors . . .7528
BugCheckCode . . . . .0x00001f65
BugCheckParameter1 . .0x000000a0
BugCheckParameter2 . .0x00000040
BugCheckParameter3 . .0x00001ca8
BugCheckParameter4 . .0x00d6df65
ExceptionCode. . . . .0xc0000005
ExceptionFlags . . . .0x00000000
ExceptionAddress . . .0x0045b4a9
ExceptionParam#0 . .0x00000000
ExceptionParam#0 . .0x00000000
At which point I got this dialog:
“Oh well,” I thought, “I’ll just head straight on to examining the dump, rather than checking it.” Unfortunately, DUMPEXAM.EXE wasn’t on my side. All I got was:
C:\...\RetailGateway>dumpexam user.dmp
unsupported processor type
Which is useless. Which also sums up the value of the binary dump from Dr Watson. And, in this case, of all the output from Dr Watson.
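For what it is worth, the DUMPCHK output does record an ExceptionCode and an ExceptionAddress before the tool gives up, and 0xc0000005 is the standard Windows code for an access violation. The sketch below is an illustration only: the helper names `read_field` and `describe_exception` are made up here, the field layout is assumed from the output shown earlier, and the table covers just a handful of well-known codes.

```python
import re

# A few well-known Windows exception codes (NTSTATUS values).
KNOWN_CODES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0x80000003: "STATUS_BREAKPOINT",
}

def read_field(output, name):
    """Return the hex value printed after a 'Name. . . .' label, or None."""
    pattern = rf"^{re.escape(name)}[ .]*(0x[0-9A-Fa-f]+)\s*$"
    m = re.search(pattern, output, re.MULTILINE)
    return int(m.group(1), 16) if m else None

def describe_exception(output):
    """Summarise the exception recorded in DUMPCHK-style output."""
    code = read_field(output, "ExceptionCode")
    addr = read_field(output, "ExceptionAddress")
    if code is None or addr is None:
        return None
    name = KNOWN_CODES.get(code, f"unknown code 0x{code:08X}")
    return f"{name} at 0x{addr:08X}"

# Three lines copied from the DUMPCHK output above.
DUMPCHK_SAMPLE = """\
ExceptionCode. . . . .0xc0000005
ExceptionFlags . . . .0x00000000
ExceptionAddress . . .0x0045b4a9
"""

print(describe_exception(DUMPCHK_SAMPLE))
```

That still says nothing about why the access violation happened, only what kind of fault it was and the address it was raised at.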