computer science goes bonk

debugging managed code with windbg

Using windbg to analyze crash dumps of managed code (e.g. C#) from remote machines can be a bit of a pain. This post explains one way to fix a common problem.

Problem

If you don't have versions of certain files that match those of the machine that generated the crash dump, you can get the dreaded data access error:

Failed to load data access DLL, 0x80004005
Verify that 1) you have a recent build of the debugger (6.2.14 or newer)
            2) the file mscordacwks.dll that matches your version of mscorwks.dll is 
                in the version directory
            3) or, if you are debugging a dump file, verify that the file 
                mscordacwks___.dll is on your symbol path.
            4) you are debugging on the same architecture as the dump file.
                For example, an IA64 dump file must be debugged on an IA64
                machine.

You can also run the debugger command .cordll to control the debugger's
load of mscordacwks.dll.  .cordll -ve -u -l will do a verbose reload.
If that succeeds, the SOS command should work on retry.

If you are debugging a minidump, you need to make sure that your executable
path is pointing to mscorwks.dll as well.

Solution

  1. Get the right versions of mscordacwks.dll, mscorwks.dll, sos.dll, and mscorlib.dll
  2. Put those files somewhere accessible to windbg
  3. Profit!

Explanation

Let's go through those steps in more detail. Say we load a crash dump in windbg and notice that a managed exception seems to be the culprit:

This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(3ba4.481c): CLR exception - code e0434f4d (first/second chance not available)
eax=1860f220 ebx=6d670000 ecx=00000006 edx=0000006c esi=17c1efa8 edi=00001054
eip=76eb6344 esp=17c1e590 ebp=17c1e5a0 iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
ntdll!KiFastSystemCallRet:
76eb6344 c3              ret
0:039> .loadby sos mscorwks
------------------------------------------------------------
sos.dll needs a full memory dump for complete functionality.
You can create one with .dump /ma 
------------------------------------------------------------
0:039> !pe
Failed to load data access DLL, 0x80004005
Verify that 1) you have a recent build of the debugger (6.2.14 or newer)
            2) the file mscordacwks.dll that matches your version of mscorwks.dll is 
                in the version directory
            3) or, if you are debugging a dump file, verify that the file 
                mscordacwks___.dll is on your symbol path.
            4) you are debugging on the same architecture as the dump file.
                For example, an IA64 dump file must be debugged on an IA64
                machine.

You can also run the debugger command .cordll to control the debugger's
load of mscordacwks.dll.  .cordll -ve -u -l will do a verbose reload.
If that succeeds, the SOS command should work on retry.

If you are debugging a minidump, you need to make sure that your executable
path is pointing to mscorwks.dll as well.

We try to load the SOS extension to do managed debugging (.loadby sos mscorwks) and then display the exception (!pe), but that doesn't work. We're missing the correct version of some files.

What version do we need?

0:039> lm vm mscorwks
start    end        module name
5f9f0000 5ff9a000   mscorwks T (pdb symbols) c:\symbols\mscorwks.pdb\36D14CEA6C094DB484DC48D9D2B53C732\mscorwks.pdb
    Loaded symbol image file: mscorwks.dll
    Image path: C:\Windows\Microsoft.NET\Framework\v2.0.50727\mscorwks.dll
    Image name: mscorwks.dll
    Timestamp:        Fri Mar 25 00:07:24 2011 (4D8C14FC)
    CheckSum:         005B2511
    ImageSize:        005AA000
    File version:     2.0.50727.4961
    Product version:  2.0.50727.4961
    File flags:       0 (Mask 3F)
    File OS:          4 Unknown Win32
    File type:        2.0 Dll
    File date:        00000000.00000000
    Translations:     0000.04b0 0000.04e4 0409.04b0 0409.04e4

Here we use the lm command (lm vm mscorwks) to view information in the dump about what version of mscorwks.dll was loaded. Note that the version it's looking for is 2.0.50727.4961, which doesn't match what we have installed locally.
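
If you do this often, the fields that matter can be pulled out of the lm output with a few lines of Python. This is only a sketch: parse_lm and LM_OUTPUT are names made up for illustration, and the regular expressions assume the output layout shown above.

```python
import re

# A few lines copied from the `lm vm mscorwks` output above.
LM_OUTPUT = """\
    Image name: mscorwks.dll
    Timestamp:        Fri Mar 25 00:07:24 2011 (4D8C14FC)
    ImageSize:        005AA000
    File version:     2.0.50727.4961
"""


def parse_lm(text):
    """Extract timestamp, image size, and file version from `lm vm` output."""
    # The raw timestamp is the 8-digit hex value in parentheses.
    timestamp = re.search(r"Timestamp:.*\(([0-9A-Fa-f]{8})\)", text).group(1)
    image_size = re.search(r"ImageSize:\s+([0-9A-Fa-f]+)", text).group(1)
    version = re.search(r"File version:\s+([\d.]+)", text).group(1)
    return {
        "timestamp": int(timestamp, 16),
        "image_size": int(image_size, 16),
        "file_version": version,
    }


if __name__ == "__main__":
    print(parse_lm(LM_OUTPUT)["file_version"])
```

The file version is the string to search for; the timestamp and image size come in handy later when working out where windbg expects to find the file.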

Our goal now is to find version 2.0.50727.4961 of mscordacwks.dll, mscorwks.dll, and sos.dll.

How do we get those files?

When in doubt, search for it. The first hit for "mscorwks 2.0.50727.4961" on Google is a security update for the .NET framework.

Following the links and downloading the x86 version of the hotfix (because our crash dump came from an x86 box) gives us a .msu file. Great. What do we do with a .msu file?

Extracting files from a .msu file

We may not want to apply this hotfix, or we may not be able to because of peculiarities of the machine we're on. Fortunately, the expand command can extract the contents of a .msu file from the shell without installing anything (note: MSIX can be useful for extracting the contents of a .msp file, if you get that instead of a .msu file):

C:\tmp> mkdir output
C:\tmp> expand -F:* Windows6.1-KB2518867-x86.msu output

That just gives us some .cab files, but fortunately we can use expand on .cab files as well. Eventually, we get a bunch of files and directories, including the following:

...
x86_netfx-mscordacwks_b03f5f7f11d50a3a_6.1.7600.16789_none_ffa869efc3305c62
x86_netfx-mscorwks_dll_b03f5f7f11d50a3a_6.1.7600.16789_none_06e44708eb2f508f
x86_netfx-sos_dll_b03f5f7f11d50a3a_6.1.7600.16789_none_e876b1b0b7253877
...

Those directories contain versions 2.0.50727.4961 of mscordacwks.dll, mscorwks.dll, and sos.dll. We can check the properties of those files to make sure they're the right ones:

[Screenshot: file properties of mscordacwks.dll]
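
The two-stage extraction (expand the .msu, then expand each .cab that falls out of it) could be scripted roughly like this. It's a sketch under a few assumptions: expand is on the PATH, one level of nested .cab files is enough, and the names expand_command and extract_all are invented here. The run parameter exists only so the function can be exercised without actually calling expand.

```python
import subprocess
from pathlib import Path


def expand_command(archive, dest):
    """Build the argument list for: expand -F:* <archive> <dest>."""
    return ["expand", "-F:*", str(archive), str(dest)]


def extract_all(msu, out_dir, run=subprocess.run):
    """Expand a .msu, then expand each .cab it produced into its own subdirectory."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    run(expand_command(msu, out_dir), check=True)
    # Only .cab files directly under out_dir are handled (one level of nesting).
    for cab in sorted(out_dir.glob("*.cab")):
        cab_dir = out_dir / cab.stem
        cab_dir.mkdir(exist_ok=True)
        run(expand_command(cab, cab_dir), check=True)


if __name__ == "__main__":
    extract_all("Windows6.1-KB2518867-x86.msu", "output")
```

If the .cab files contain further .cab files, the same function can be pointed at the subdirectories.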

(Aside: if the hotfix is an .exe instead of a .msu, you can probably start the installer for the patch without finishing it, figure out where it has temporarily extracted its payload -- perhaps in C:\3257234982623<some big number>\; procmon can show you where it writes stuff -- and then grab the appropriate files.)

What do we do with these files?

  • Make a directory (e.g. C:\dlls\) and add that directory to your symbol path in windbg (for example with .sympath+ C:\dlls).
  • For mscordacwks.dll, just rename it to include the architecture (which appears twice in the name) and the version, and put it in the base directory:
    C:\dlls\mscordacwks_x86_x86_2.0.50727.4961.dll
    
  • For mscorwks.dll and sos.dll, it's a little more complicated, because they go in a subdirectory that includes a hash based on mscorwks.dll:
    C:\dlls\mscorwks.dll\<hash>\
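
The renaming rule for mscordacwks.dll is easy to get wrong by hand, so here is a small sketch that builds the name. The function names are made up, and calling the two architecture slots "host" and "target" is an assumption on my part; in the example above both are x86.

```python
from pathlib import PureWindowsPath


def dac_filename(host_arch, target_arch, version):
    """Name windbg looks for, e.g. mscordacwks_x86_x86_2.0.50727.4961.dll."""
    # Assumption: first slot is the debugger's architecture, second the dump's.
    return f"mscordacwks_{host_arch}_{target_arch}_{version}.dll"


def dac_path(root, host_arch, target_arch, version):
    """Full destination path for the renamed DAC under the local DLL directory."""
    return PureWindowsPath(root) / dac_filename(host_arch, target_arch, version)


if __name__ == "__main__":
    print(dac_path("C:\\dlls", "x86", "x86", "2.0.50727.4961"))
```

PureWindowsPath only manipulates the path as text, so this runs on any OS; copying the file into place is left to you.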
    

What the heck is the right <hash>?

There are several ways to figure out what that <hash> is. You could use symstore or compute it directly: it's the module's timestamp followed by its image size, both in hex (compare the Timestamp and ImageSize fields in the lm output above). But a simpler way is just to look at where windbg is trying to load mscorwks.dll from:

0:039> !sym noisy
noisy mode - symbol prompts on
0:039> .reload /f mscorwks.dll
SYMSRV:  c:\symbols\mscorwks.dll\4D8C14FC5aa000\mscorwks.dll not found
SYMSRV:  mscorwks.dll from http://msdl.microsoft.com/downloads/symbols: 2321153 bytes - copied         
DBGHELP: C:\Program Files (x86)\Debugging Tools for Windows (x86)\sym\mscorwks.dll\4D8C14FC5aa000\mscorwks.dll - mismatched
DBGENG:  C:\Windows\Microsoft.NET\Framework\v2.0.50727\mscorwks.dll image header does not match memory image header.
DBGENG:  C:\Windows\Microsoft.NET\Framework\v2.0.50727\mscorwks.dll - Couldn't map image from disk.
Unable to load image C:\Windows\Microsoft.NET\Framework\v2.0.50727\mscorwks.dll, Win32 error 0n2
DBGENG:  mscorwks.dll - Partial symbol image load missing image info
DBGHELP: Module is not fully loaded into memory.
DBGHELP: Searching for symbols using debugger-provided data.

Here we tell windbg to be more verbose about where it's trying to load stuff from (!sym noisy) and then tell it to forcibly reload mscorwks.dll (.reload /f mscorwks.dll). The <hash> that we're looking for is that 14-digit hex number: 4D8C14FC5aa000.
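
That number isn't really a hash. Compared against the lm output from earlier, it is the Timestamp (4D8C14FC) followed by the ImageSize (5AA000, in lowercase and without the leading zeros). Assuming that is the whole rule, it can be computed directly; symstore_key is a made-up name, and the upper/lowercase split simply mirrors what windbg printed.

```python
def symstore_key(timestamp, image_size):
    """Directory name windbg uses for an image: timestamp then size, in hex."""
    # Timestamp: 8 uppercase hex digits. Size: lowercase hex, no zero padding.
    return f"{timestamp:08X}{image_size:x}"


if __name__ == "__main__":
    # Values taken from the `lm vm mscorwks` output.
    print(symstore_key(0x4D8C14FC, 0x005AA000))
```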

Now that we know what <hash> is, we know where to put mscorwks.dll and sos.dll:

C:\dlls\mscorwks.dll\4D8C14FC5aa000\mscorwks.dll
C:\dlls\mscorwks.dll\4D8C14FC5aa000\sos.dll

Second time's a charm

Now that we've got all that set up, let's restart windbg and tackle that crash dump again:

This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(3ba4.481c): CLR exception - code e0434f4d (first/second chance not available)
eax=1860f220 ebx=6d670000 ecx=00000006 edx=0000006c esi=17c1efa8 edi=00001054
eip=76eb6344 esp=17c1e590 ebp=17c1e5a0 iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
ntdll!KiFastSystemCallRet:
76eb6344 c3              ret
0:039> .loadby sos mscorwks
------------------------------------------------------------
sos.dll needs a full memory dump for complete functionality.
You can create one with .dump /ma 
------------------------------------------------------------
0:039> !pe
Not a valid exception object

Okay, that's progress -- we don't get that data access error anymore. But if we have a CLR exception, why is !pe misbehaving?

0:039> !sym noisy
noisy mode - symbol prompts on
0:039> !pe
...
SYMSRV:  c:\symbols\mscorlib.dll\4D8C159945a000\mscorlib.dll not found
SYMSRV:  mscorlib.dll from http://msdl.microsoft.com/downloads/symbols: 1343977 bytes - copied         
DBGHELP: C:\Program Files (x86)\Debugging Tools for Windows (x86)\sym\mscorlib.dll\4D8C159945a000\mscorlib.dll - mismatched
DBGENG:  C:\WINDOWS\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll image header does not match memory image header.
DBGENG:  C:\WINDOWS\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll - Couldn't map image from disk.
...
Not a valid exception object

It looks like windbg is having trouble finding the right version of mscorlib.dll. Fortunately, the version we want should be in that hotfix we just downloaded. We can use the same process as above to find version 2.0.50727.4961 of mscorlib.dll. Like mscorwks.dll, we'll stick it in a specially-named subdirectory that includes the hash we see above -- 4D8C159945a000:

C:\dlls\mscorlib.dll\4D8C159945a000\mscorlib.dll
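
Before copying a DLL into one of these directories, it can be worth checking that the file really is the build windbg asked for. Here is a sketch, assuming the directory name is the TimeDateStamp from the PE file header followed by SizeOfImage from the optional header (which is what the values in this post suggest). pe_key is a made-up helper name.

```python
import struct


def pe_key(data):
    """Compute <timestamp><size> from the raw bytes of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ header")
    # Offset 0x3C of the DOS header holds the file offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # TimeDateStamp sits 8 bytes after the start of the signature.
    (timestamp,) = struct.unpack_from("<I", data, e_lfanew + 8)
    # The optional header starts 24 bytes in; SizeOfImage is at offset 56 in it.
    (size_of_image,) = struct.unpack_from("<I", data, e_lfanew + 24 + 56)
    return f"{timestamp:08X}{size_of_image:x}"
```

Reading the extracted mscorlib.dll as bytes and passing it to pe_key should give 4D8C159945a000 if it's the right file.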

Third time's a charm

Now that we've got the right version of mscorlib.dll, let's try it again:

This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(3ba4.481c): CLR exception - code e0434f4d (first/second chance not available)
eax=1860f220 ebx=6d670000 ecx=00000006 edx=0000006c esi=17c1efa8 edi=00001054
eip=76eb6344 esp=17c1e590 ebp=17c1e5a0 iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
ntdll!KiFastSystemCallRet:
76eb6344 c3              ret
0:039> .loadby sos mscorwks
------------------------------------------------------------
sos.dll needs a full memory dump for complete functionality.
You can create one with .dump /ma 
------------------------------------------------------------
0:039> !pe
Exception object: 21a5bfb8
Exception type: System.NotSupportedException
Message: This type of CollectionView does not support changes to its SourceCollection from a thread different from the Dispatcher thread.
InnerException: 
StackTrace (generated):

StackTraceString: 
HResult: 80131515

Yay!