Visual Studio 2010 Beta 2 debugger may be confused by your symbol path

I’ve been evaluating Microsoft’s Visual Studio 2010 Beta 2 release recently on my Windows 7 x64 system. As can be expected, the beta has quite a few rough edges, but overall I like the new WPF-based IDE GUI and the refreshed code Editor in particular (I like how selecting a block of code now preserves syntax highlighting while the block is highlighted, for instance).

The new GUI does have some rough spots. I particularly disliked the poor anti-aliasing of tooltips in the editor. This can be seen, for example, when hovering over a function name like “printf”. The menu bar’s dark blue color scheme also appears rather peculiar.

Besides the “bling”, notable changes include the Visual C++ 2010 CRT reverting to the traditional deployment model used by the Visual C++ 2003 CRT. More specifically, the CRT DLLs no longer use SxS binding (“Fusion”) and are now simply deployed to the “system32” directory or to the application’s directory, as desired. Dropping SxS has some obvious disadvantages (SxS binding redirects would no longer be able to redirect applications that load a private copy of the CRT DLLs to updated versions with bug fixes and security updates) but presumably the pain of integrating SxS deployment into the setup process, which required either an MSI installation or pseudo-documented use of the SxS API, resulted in too much negative feedback and they chose to revert to the legacy approach.

Visual C++ projects are now built with MSBuild, like their C# and other .NET counterparts. This should have several benefits. One that comes to mind is that the Windows Installer XML Toolset’s Votive (its VS IDE integration component) should be able to support C++ projects as References in addition to .NET projects.

A more important update to the project system is support for multi-targeting. Most of Microsoft’s discussions on the subject mainly deal with said support for .NET projects, with the new Visual Studio being able to target .NET 2.0 through .NET 4.0 on a per-project basis. However, similar support is offered for native multi-targeting. A per-project setting specifies the “toolset” with which it is to be built. The product comes built-in with toolset definitions for the VC++ 10 and the VC++ 9 compilers, but since toolset definitions are simple XML files describing tool paths, older compilers and custom definitions are easy to define. Indeed, a toolset definition can be found for the Windows 7 SDK build environment. I foresee using this functionality to build user-space applications with headers and tools from the Windows Driver Kit build environment, resulting in being able to link with the OS CRT (msvcrt.dll) in a clean way, without modifying global Visual C++ directory settings, but rather keeping the changes contained to specific projects.

My enthusiasm for testing the product was struck a severe blow when my first attempt to run the Visual C++ debugger on a “Hello, World!” console application went awry. The IDE hung for a good 15-20 seconds: it sat frozen for quite a while after I pressed F10 to start the debugging session, finally presented the console window for the test application, and then froze for some more time. After a lengthy wait, the session was finally ready.

But the worst part was that when the debugging session was finally ready, debugging symbols for all modules except the application .exe and the VC 10 CRT were NOT loaded! It appeared as though the lengthy wait was all for naught.

The experience was sufficiently poor for me to report it to Microsoft through one of the feedback channels. I was eventually contacted by helpful folks from the Visual Studio Debugger team and we analyzed the problem in an e-mail exchange. The performance issue is the result of problematic contents of the symbol path I configured for the debugger.

As I mentioned, I’m evaluating the beta on a Windows 7 machine. For this reason, the Windows 7 symbol packages, available from Microsoft’s public symbols download page, are deployed in my system and are a part of the symbol path defined by the _NT_SYMBOL_PATH environment variable. Since this is an x64 machine and I find myself debugging 32-bit processes quite often, both the 32-bit and 64-bit symbol packages are installed. My initial symbol path was:

C:\Symbols;C:\Symbols32;CACHE*C:\websymbols;SRV*http://msdl.microsoft.com/download/symbols

The first issue with this symbol path is that the x64 symbols package (extracted to C:\Symbols) and the x86 symbols package (extracted to C:\Symbols32) are specified as directories in the symbol path, rather than symbol stores. This is what you’d expect from symbol packages designed for local deployment, but it turns out that the Windows 7 symbol packages, unlike the PDB packages for previous versions of Windows, come in the symbol store directory layout rather than the flat directory layout. This means, for example, that the symbols for ntdll.dll are in a path like C:\Symbols\ntdll.pdb\CFF40300FD804691B73E12CF2A150EE02\ntdll.pdb rather than the simpler C:\Symbols\dll\ntdll.pdb.
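To make the difference between the two layouts concrete, here is a minimal Python sketch that builds both candidate paths for a PDB. The function names are hypothetical, and the split of the hash directory into a GUID followed by an age value in hexadecimal is my assumption about how the store index is composed; the post itself only shows the resulting directory name.

```python
import ntpath  # Windows path rules, usable on any host OS


def flat_pdb_path(root, pdb_name, image_ext="dll"):
    """Flat layout: <root>\\<image extension>\\<name>.pdb"""
    return ntpath.join(root, image_ext, pdb_name)


def store_pdb_path(root, pdb_name, guid_hex, age):
    """Symbol store layout: <root>\\<name>.pdb\\<index>\\<name>.pdb

    Assumption: the index is the PDB GUID (32 hex digits, no dashes)
    immediately followed by the PDB age in hexadecimal.
    """
    index = f"{guid_hex.upper()}{age:X}"
    return ntpath.join(root, pdb_name, index, pdb_name)


if __name__ == "__main__":
    print(flat_pdb_path(r"C:\Symbols", "ntdll.pdb"))
    print(store_pdb_path(r"C:\Symbols", "ntdll.pdb",
                         "CFF40300FD804691B73E12CF2A150EE0", 2))
```

A debugger that only probes the first form will never find a PDB that was deployed in the second form, which is the situation described here.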

I did not notice this issue before installing Visual Studio 2010 Beta 2, however, because Windbg apparently doesn’t mind when stores are specified in this syntax; it seems to apply some sort of heuristic to determine the layout of a symbol directory. Visual Studio 2010 Beta 2 is not as liberal. Examining its I/O with Sysinternals Process Monitor showed that it only looked for PDBs directly under the symbol directories or in a “dll” subdirectory, rather than under the appropriate hash as it would in a symbol store. The resolution for this issue is simple enough: refer to the Windows 7 symbol packages with SRV* syntax in the symbol path. The symbol path is therefore updated to something like:

SRV*C:\Symbols;SRV*C:\Symbols32;CACHE*C:\websymbols;SRV*http://msdl.microsoft.com/download/symbols

With this change in place, the Visual Studio 2010 Beta 2 debugger was able to pick up symbols for system DLLs from the local stores, and now the debugging session started instantly. But the question remained: even if the debugger didn’t know how to look for symbols under C:\Symbols and C:\Symbols32 when they were not specified with a srv* directive, why did it download symbols from the HTTP public store, only to end up starting with symbols not being loaded for any of the system DLLs in the debugged process?

To get to the bottom of this, the local symbol stores were removed from the symbol path. At this point, it was:

CACHE*C:\websymbols;SRV*http://msdl.microsoft.com/download/symbols

Running the debugger with this stripped down symbol path reproduced the poor debugger startup experience and the worse issue of symbols being downloaded only to end up not being used. At this point, Sysinternals Process Monitor was used to examine the actions of the debugger. Two curious facts were revealed.

The first was that the Visual Studio debugger was literally examining a directory called “cache*C:\websymbols” under its path in a vain attempt to find symbols. Since the “cache*” string made it to a file open request, obviously the cache* directive in the _NT_SYMBOL_PATH variable was not being correctly parsed or understood by the debugger.
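For illustration, this is roughly how a symbol path element would have to be split for the cache* directive to be recognized rather than treated as a directory name. It is a simplified Python sketch of my own, not the parser used by either debugger, and it only knows about the srv* and cache* keywords discussed in this post.

```python
def parse_symbol_path(symbol_path):
    """Split an _NT_SYMBOL_PATH-style string into (kind, fields) tuples.

    kind is "srv", "cache" or "dir". Simplified sketch: other
    directives are not handled and end up classified as "dir".
    """
    elements = []
    for raw in symbol_path.split(";"):
        raw = raw.strip()
        if not raw:
            continue
        head, star, rest = raw.partition("*")
        keyword = head.lower()
        if star and keyword in ("srv", "cache"):
            # Directive: the remaining fields are separated by '*'.
            elements.append((keyword, rest.split("*")))
        else:
            # Anything else is taken to be a plain directory.
            elements.append(("dir", [raw]))
    return elements


if __name__ == "__main__":
    path = r"CACHE*C:\websymbols;SRV*http://msdl.microsoft.com/download/symbols"
    for kind, fields in parse_symbol_path(path):
        print(kind, fields)
```

A parser that skips the keyword check falls through to the “dir” branch for every element, which would match the file open requests seen in Process Monitor.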

Because the cache* directive is not understood, the Visual Studio debugger would presumably fall back to some default local cache directory for the downloaded symbols, instead of the one explicitly specified by the cache* directive. Therefore, the same behavior would be expected with the following symbol path:

SRV*http://msdl.microsoft.com/download/symbols

And indeed, a quick check revealed that the same peculiarity reproduced with this symbol path setting: symbols were being downloaded from the HTTP server, but at the end of the process, symbols were not loaded for any of the system modules in the debugged process. Whatever the default directory is, attempts to download symbols there resulted in their loss into oblivion.

Specifying the local symbol cache using the SRV* directive is also possible. This is the legacy approach, from before Windbg began recommending CACHE* instead. A symbol path of this form is:

SRV*C:\websymbols*http://msdl.microsoft.com/download/symbols

With this symbol path in place, the Visual Studio 2010 Beta 2 debugger both downloaded symbols from the HTTP store and actually ended up using them. Specifying the cache directory with CACHE* or not at all triggered the bug, while specifying it the old fashioned way in the SRV* directive satisfied the debugger.

As a result, my guidance for Visual Studio 2010 Beta 2 debugger users who experience performance issues or other symbol problems with a symbol path that works fine in Windbg is:

  1. For each directory in your _NT_SYMBOL_PATH, determine whether it is in the “flat” format or in symbol store format. Prefix symbol stores with SRV*, changing “C:\Symbols” to “SRV*C:\Symbols”. Windows 7 users in particular should be aware that the symbol packages for their platforms should be specified with SRV* syntax for Visual Studio 2010 Beta 2.
  2. Specify local cache directories for remote symbol stores (SMB or HTTP) directly in the SRV* directive, to have Visual Studio 2010 Beta 2 pick them up. It is OK to keep the CACHE* directive in your symbol path as well, but for the time being, Visual Studio 2010 Beta 2 does not seem to use it correctly.
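The two steps above can be sketched as a small Python helper that rewrites a symbol path. Everything here is hypothetical and of my own making: the function names, the heuristic that a symbol store has “<name>.pdb” subdirectories at its root, and the test for what counts as a remote store. Treat it as a starting point rather than a tool.

```python
import os


def looks_like_symbol_store(directory):
    """Heuristic (an assumption): a store has <name>.pdb subdirectories at its root."""
    try:
        entries = os.listdir(directory)
    except OSError:
        return False
    return any(
        name.lower().endswith(".pdb") and os.path.isdir(os.path.join(directory, name))
        for name in entries
    )


def is_remote(location):
    """HTTP(S) URLs and UNC (SMB) paths are treated as remote stores."""
    return location.lower().startswith(("http://", "https://", "\\\\"))


def rewrite_symbol_path(symbol_path, is_store=looks_like_symbol_store):
    """Apply both guidelines to an _NT_SYMBOL_PATH-style string."""
    out = []
    cache_dir = None
    for raw in (element.strip() for element in symbol_path.split(";")):
        if not raw:
            continue
        lower = raw.lower()
        if lower.startswith("cache*"):
            # Remember the cache directory; keeping the directive is harmless.
            cache_dir = raw[len("cache*"):]
            out.append(raw)
        elif lower.startswith("srv*"):
            fields = raw[len("srv*"):].split("*")
            # Guideline 2: fold the cache directory into a bare remote SRV* entry.
            if len(fields) == 1 and is_remote(fields[0]) and cache_dir:
                raw = f"SRV*{cache_dir}*{fields[0]}"
            out.append(raw)
        elif is_store(raw):
            # Guideline 1: plain directory that is really a symbol store.
            out.append("SRV*" + raw)
        else:
            out.append(raw)
    return ";".join(out)


if __name__ == "__main__":
    original = (r"C:\Symbols;C:\Symbols32;CACHE*C:\websymbols;"
                "SRV*http://msdl.microsoft.com/download/symbols")
    # Pretend every plain directory is a store, as on my machine.
    print(rewrite_symbol_path(original, is_store=lambda d: True))
```

The store check is passed in as a parameter so the rewrite can be tried without touching the disk; on a real system the default heuristic would be used instead.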

The Visual Studio Debugger team is addressing issues revealed by the investigation of this behavior for the forthcoming RTM release of Visual Studio 2010. However, several significant deficiencies in debugger symbol support will not be addressed in the 2010 release. One is that symbol loading is done synchronously rather than asynchronously to the debugging session. The other is that, unless manual symbol loading is used, neither a progress indication nor a cancellation UI is presented while symbols are being transferred from a remote store. Therefore, the perceived performance of symbol support in the debugger will leave something to be desired for the time being.

IDA v5.4 supports Windbg as a debugger backend

A new version of IDA, the Interactive Disassembler, was recently released, featuring new debugger integration capabilities. IDA’s existing built-in debugger often proved lackluster, but IDA’s static analysis and navigation features are, of course, unrivaled by anything else. I always wished IDA would address the weakness of its debugging features and now they have done so in the v5.4 release. The new version can drive a gdb debugging server (as embedded platforms often provide), a Bochs virtual machine (great for BIOS and boot loader debugging) and, most importantly, DbgEng, the Microsoft debugging engine used by Windbg. Since Windbg sessions often involve heavy use of PDBs, IDA v5.4 has improved its support for importing data from PDBs and now uses more of their embedded type information (previously the third party Determina PDB plugin attempted to improve IDA’s PDB support). To top things off, the Python plugin is now bundled with IDA as well.

I haven’t had the chance to use the new version yet, but Hex-Rays have a great demo video posted here. The only notable thing that appears to be missing is a nice UI for examining the stack trace, but if push comes to shove the Windbg command line can be used to invoke “k”, as demonstrated.

Windbg 6.10.3.233 released

Version 6.10.3.233 of the Debugging Tools for Windows package has been available for a few days now.

The big news on this one is lots of FireWire debugging changes. An updated host driver is provided and claims of greater reliability and controller compatibility are made. Additional changes include better WOW64 support, claims of performance improvements to the various debuggers in the package (it remains to be seen how significant those are), many improvements to the almighty !analyze extension command and extensive updates to the debugger documentation. Absent from this release is NT 4.0 support. I suppose not many will miss it at this point.

Developers using the debugger API may be interested to know that symbols for dbghelp.dll v6.10.3.233 are available from the public symbol store, while symbols for dbgeng.dll v6.10.3.233 are once again missing. This continues a sorry tradition of making life hard for consumers of the debugger interface, one that goes back to the 6.8 debugger release, if I recall correctly.

Oops! Microsoft private symbols accidentally leaked in Visual Studio 2010 CTP VM image

I downloaded Microsoft’s newly released Visual Studio 2010 CTP virtual machine disk image hoping for a few surprises, but I certainly didn’t expect this…

The Visual Studio 2010 CTP is a huge multi-gigabyte VM running Windows Server 2008. The first thing I did with it was to start up Visual C++ 2010, create a Win32 console application and run it in the debugger. I looked at the stack trace and saw the following:

vc10app.exe!wmain(int argc=1, wchar_t * * argv=0x000d1470) Line 8
vc10app.exe!__tmainCRTStartup() Line 564 + 0x19 bytes
vc10app.exe!wmainCRTStartup() Line 392
kernel32.dll!BaseThreadInitThunk(unsigned long RunProcessInit=0, long (void *)* StartAddress=0x00000000, void * Argument=0x7ffdf000) Line 66 + 0x5 bytes
ntdll.dll!__RtlUserThreadStart(long (void *)* StartAddress=0x013b1073, void * Argument=0x7ffdf000) Line 2740
ntdll.dll!_RtlUserThreadStart(long (void *)* StartAddress=0x013b1073, void * Argument=0x7ffdf000) Line 2672 + 0xb bytes

vc10app is the name of my test console application. I go over stack traces on a daily basis so the special thing about this one immediately caught my attention. Notice that the wmain() function of my console application has full debugging information (as expected for something I wrote) and the parameter names argc and argv are visible in the stack trace. Under normal circumstances, only public debugging symbols are available for Microsoft OS components like kernel32 and ntdll. In this CTP VM, however, the StartAddress and Argument parameters were visible as well.

Public debugging symbols are stripped versions of the original private symbols generated by the build process. They do not contain parameter information and do not contain the names of local variables in functions. Note however that for C++ functions, name mangling results in parameter types being visible in public symbols as well. Normally, when running the debugger in a system configured to use Microsoft’s public symbols, names for internal functions are visible in stack traces, but the names of arguments and locals never are.

I opened the Modules tab of the Visual Studio debugger to determine where the debugger is picking up these symbols for kernel32 and ntdll. The debugger was using C:\ppa\symstore as the symbol store. I opened the C:\ppa directory and saw that a Visual Studio Profiler session for a matrix multiplication application was stored there.

Apparently someone with access to Microsoft’s internal symbol store ran a profiling session on this matrix multiplication application, perhaps to ensure profiling is functional on the CTP VM. The private symbols retrieved for the session were persisted in the CTP’s disk and made their way to the public release. To ensure my hypothesis was correct, I installed Windbg on the machine, opened ntdll.dll as a crash dump and loaded symbols from the store directory:

Microsoft (R) Windows Debugger Version 6.9.0003.113 X86
Copyright (c) Microsoft Corporation. All rights reserved.
Loading Dump File [C:\Windows\System32\ntdll.dll]
Symbol search path is: SRV*c:\ppa\symcache;.
Executable search path is:
ModLoad: 77ed0000 77ff7000 C:\Windows\System32\ntdll.dll
eax=00000000 ebx=00000000 ecx=00000000 edx=00000000 esi=00000000 edi=00000000
eip=77ed0000 esp=00000000 ebp=00000000 iopl=0 nv up di pl nz na po nc
cs=0000 ss=0000 ds=0000 es=0000 fs=0000 gs=0000 efl=00000000
ntdll!`string'
(ntdll+0x0):
77ed0000 4d dec ebp
0:000> .reload /f
.
Loading unloaded module list
0:000> lm
start end module name
77ed0000 77ff7000 ntdll (private pdb symbols) c:\ppa\symcache\ntdll.pdb\B958B2F91A5A46B889DAFAB4D140CF252\ntdll.pdb
0:000> x ntdll!RtlAllocateHeap
77f358a6 ntdll!RtlAllocateHeap (void *, unsigned long, unsigned long)
0:000> dv /f ntdll!RtlAllocateHeap
@ebp+0x08 HeapHandle
@ebp+0x0c Flags
@ebp+0x10 Size
@ebp+0x08 ExtraSize
@ebp-0x04 AllocationSize
@ebp-0x08 Interceptor
@ebp-0x58 ExceptionRecord

The private PDB for ntdll.dll found in this CTP VM image notes how HeapHandle, Flags, Size and ExtraSize are the parameter names for RtlAllocateHeap. Furthermore, AllocationSize, Interceptor and ExceptionRecord are used as local names in this API.

Private PDBs also feature source information. This is also visible in this case:

0:000> ln ntdll!RtlAllocateHeap
d:\rtm\base\ntos\rtl\heap.c(1508)
Source Depot: basedepot.sys-ntgroup.ntdev.microsoft.com:2003 //depot/longhorn_rtm/base/ntos/rtl/heap.c#1
(77f358a6) ntdll!RtlAllocateHeap | (77f35997) ntdll!RtlpLowFragHeapFree
Exact matches:
ntdll!RtlAllocateHeap (void *, unsigned long, unsigned long)

The PDB features references to the source file from the Windows source tree for RtlAllocateHeap and the other APIs. Additionally, it appears to contain a custom reference to Microsoft’s internal source control system, Source Depot, presumably to facilitate the debugger retrieving up to date sources automatically when those are not available locally.

It’s interesting how scattered bits of information in a debugging symbols file provide a fascinating insight into Windows. Hope you enjoyed the surprise as much as I did…

Debugging user-mode BootExecute native applications with kd

Debugging code executing during system startup always poses a unique challenge. One may need to debug a custom or built-in Windows service right from the start, when attaching to it after it has initialized proves insufficient or inappropriate. When developing a GINA hook or GINA stub, the need to debug the Winlogon process before the logon process is performed arises. The inability of the Visual Studio Debugger to be useful in these situations is one of the reasons people turn to Windbg.

For debugging Windows services or the Winlogon process during startup, Image File Execution Options provides a workable solution. As soon as a process of the name specified under the Image File Execution Options registry key is created, the debugger command-line specified in the Debugger value is executed in lieu of the original command-line, which is appended to the debugger command-line. The debugger started might be Visual Studio’s, if appropriate, an interactive Windbg in other cases or an NTSD remote debugging server when you will not or cannot do things like make the service process interactive.
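As a rough illustration of the mechanism, the Python sketch below composes the command line that ends up being executed and generates .reg file text for the Debugger value. The debugger path and the service image name are hypothetical placeholders, and the helper names are mine; the key location is the standard Image File Execution Options key under HKEY_LOCAL_MACHINE.

```python
IFEO_KEY = (r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT"
            r"\CurrentVersion\Image File Execution Options")


def effective_command_line(debugger_cmd, original_cmd):
    """The original command line is appended to the debugger command line."""
    return f"{debugger_cmd} {original_cmd}"


def ifeo_reg_text(image_name, debugger_cmd):
    """Build .reg file text that sets the Debugger value for image_name."""
    # .reg string values need backslashes and double quotes escaped.
    escaped = debugger_cmd.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{IFEO_KEY}\\{image_name}]\n"
        f'"Debugger"="{escaped}"\n'
    )


if __name__ == "__main__":
    # Hypothetical debugger location and service image name.
    debugger = r"C:\Debuggers\windbg.exe"
    print(effective_command_line(debugger, r"C:\svc\myservice.exe -run"))
    print(ifeo_reg_text("myservice.exe", debugger))
```

Note that the subkey is named after the image file name only, not its full path, so the setting applies to any process created from an executable with that name.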

For the vast majority of startup applications, the aforementioned technique is both quite sufficient and convenient. However, there is another, perhaps esoteric, category of startup processes. These run at a very early stage of the boot process. They are the BootExecute applications.

BootExecute applications are started by the Session Manager (smss.exe) before invoking the “initial command” (Winlogon in XP) and before the various subsystems are started. As far as user-mode goes, it doesn’t get much earlier than this. Because of their early nature, a significant constraint is in place for BootExecute applications: they are native applications.

Do not confuse this usage of “native” with native code vs. .NET managed code. In this context, native means that only the Windows NT Native API, resident in ntdll.dll, is available. At this stage, the Win32 subsystem, composed of the kernel-mode win32k.sys component and the user-mode client/server runtime, CSRSS, has not yet been started by SMSS. Not even the Kernel32 library is usable by BootExecute applications.

What are these useful for? Those special tasks that must be performed before everything else has started in the system, yet remain in the domain of user-mode work. Consider these two typical examples:

  • AutoCheck, the BootExecute variant of the CHKDSK tool, used to examine the boot volume before it is locked and to fix critical file-system errors.
  • Sysinternals PageDefrag, a BootExecute utility that defragments the Paging File, registry hives and other files inaccessible to defragging by the normal Win32 Disk Defragmentation tool.

We can confirm that AutoCheck is indeed a native application by examining it with Visual C++’s DUMPBIN utility:

C:\WINDOWS\system32>dumpbin /headers autochk.exe
Microsoft (R) COFF/PE Dumper Version 9.00.21022.08
Copyright (C) Microsoft Corporation. All rights reserved.

Dump of file autochk.exe

PE signature found

File Type: EXECUTABLE IMAGE

FILE HEADER VALUES
14C machine (x86)
4 number of sections
48025203 time date stamp Sun Apr 13 21:33:39 2008
0 file pointer to symbol table
0 number of symbols
E0 size of optional header
10E characteristics
Executable
Line numbers stripped
Symbols stripped
32 bit word machine

OPTIONAL HEADER VALUES
10B magic # (PE32)
7.10 linker version
5B800 size of code
34200 size of initialized data
0 size of uninitialized data
D6B9 entry point (0100D6B9) _NtProcessStartupForGS@4
1000 base of code
5D000 base of data
1000000 image base (01000000 to 01091FFF)
1000 section alignment
200 file alignment
5.01 operating system version
5.01 image version
5.01 subsystem version
0 Win32 version
92000 size of image
400 size of headers
96A6F checksum
1 subsystem (Native)
0 DLL characteristics
40000 size of stack reserve
1000 size of stack commit
100000 size of heap reserve
1000 size of heap commit
0 loader flags
10 number of directories
... snipped ...

Notice the subsystem specified for AutoChk is the native subsystem. Notice further that the application’s entrypoint is NtProcessStartup (in its /GS compiler stack buffer overflow protection stub form).
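If DUMPBIN is not at hand, the same check can be approximated with a few lines of Python that read the Subsystem field straight out of the PE optional header. This is a minimal sketch with a hypothetical function name; it relies on the standard PE layout, where the field sits 68 bytes into the optional header for both PE32 and PE32+ images, and it does no validation beyond the two signatures.

```python
import struct

# A few well-known values of the optional header's Subsystem field.
SUBSYSTEM_NAMES = {1: "Native", 2: "Windows GUI", 3: "Windows CUI"}


def pe_subsystem(data):
    """Return the Subsystem value from the raw bytes of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # Skip the 4-byte signature and the 20-byte COFF file header.
    optional_header = e_lfanew + 4 + 20
    (subsystem,) = struct.unpack_from("<H", data, optional_header + 68)
    return subsystem


def describe_subsystem(path):
    """Read a file and return a readable subsystem name."""
    with open(path, "rb") as f:
        value = pe_subsystem(f.read())
    return SUBSYSTEM_NAMES.get(value, f"unknown ({value})")
```

Calling describe_subsystem on a copy of autochk.exe should agree with the “1 subsystem (Native)” line in the DUMPBIN output above.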

As for PageDefrag, it takes advantage of the Session Manager running its BootExecute application before it has enabled use of the Paging File.

You may find reasons of your own to develop a BootExecute native application, or you may find yourself in a situation requiring debugging of an existing BootExecute application. For instance, you may wish to debug the interactions of AutoChk’s volume locking attempts with your file system filter driver.

Unfortunately, these native applications pose a special difficulty to the user-mode debugger. NTSD is a Win32 application and must be invoked only after the Win32 subsystem has been initialized. Therefore, invocation of NTSD for debugging BootExecute applications is out of the question. Indeed, it is quite likely the Image File Execution Options registry key is not even consulted for BootExecute invocations, as that would be quite pointless.

Theoretically, this problem could be addressed by the development of a native subsystem user mode debugger, in lieu of the Win32-based NTSD. Alex Ionescu, most recently contributing to the eagerly awaited 5th edition of the Windows Internals book, has discussed the specifics of the NT Native Debugging API (DbgUi, etc.) in a series of articles titled Windows Native Debugging Internals.

At the moment, however, I am unaware of any available native subsystem user mode debugger. Such a tool may or may not be available internally in Microsoft. Presumably the Windows developers would benefit from such functionality, but they might also be content with using the kernel debugger for those purposes.

Be that as it may, the rest of us must turn to the kernel debugger for resolution. The kernel debugger can be used for source-level debugging of user-mode applications, including native subsystem applications. The special difficulty with using it is getting it to break in the right place at the right time. In lieu of an Image File Execution Options-style apparatus, an alternative approach is required.

When modifying the native BootExecute application in question is feasible, the simple approach of adding an invocation of ntdll’s DbgBreakPoint API to the top of the NtProcessStartup process entrypoint is probably the quickest way to get the desired effect. In the absence of a user-mode debugger, the debug break will make its way to the kernel debugger. The debugger will notice the presence of the user-mode module, load symbols and source and the usual debugger functions will be accessible. If source is not available, in many cases the image can be patched to contain either an invocation of DbgBreakPoint or just an inline INT 3, as appropriate.

Such an approach, however, may not be feasible at all times and has the significant disadvantage of making the modified native application hang when a kernel debugger is not attached to the system at boot. Ideally, we’d like to break at process startup without modifying the native application at all.

When using the user-mode debugger, “sxe ld” can break when user-mode modules are mapped by the loader, as documented in Controlling Exceptions and Events. Normally, the kernel debugger does not provide that capability. However, it turns out that it can do so, once appropriately configured.

Before booting with the kernel debugger, turn on the “Enable loading of kernel debugger symbols” Global Flag, using the GFlags utility bundled with the Debugging Tools for Windows:

C:\Program Files\Debugging Tools for Windows>gflags /r +ksl
Current Boot Registry Settings are: 00040000
ksl - Enable loading of kernel debugger symbols
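The value printed by GFlags is a bit mask. As a small illustration, the Python sketch below sets and tests the “ksl” bit on such a mask; the constant value is taken from the output above, while the constant and function names are just labels I chose for it. My understanding is that “gflags /r” persists the mask in the GlobalFlag value under the Session Manager registry key, but the sketch does not touch the registry.

```python
# Bit value as printed by "gflags /r +ksl" above.
FLG_ENABLE_KDEBUG_SYMBOL_LOAD = 0x00040000  # "ksl"


def with_ksl(global_flag):
    """Return the mask with the ksl bit turned on."""
    return global_flag | FLG_ENABLE_KDEBUG_SYMBOL_LOAD


def has_ksl(global_flag):
    """Report whether the ksl bit is set in the mask."""
    return bool(global_flag & FLG_ENABLE_KDEBUG_SYMBOL_LOAD)


if __name__ == "__main__":
    flags = with_ksl(0)
    print(f"{flags:08X}", has_ksl(flags))
```

Setting the bit with an OR rather than overwriting the whole value preserves any other global flags that are already enabled on the target.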

Although the name and description of this Global Flag appear to have nothing to do with user-mode module load events in the kernel debugger, the flag achieves the desired effect. Once it is enabled, we can reboot with the kernel debugger attached and ask the kernel debugger to break once the desired native application is mapped:

Microsoft (R) Windows Debugger Version 6.9.0003.113 X86
Copyright (c) Microsoft Corporation. All rights reserved.

Connected to Windows XP 2600 x86 compatible target, ptr64 FALSE
Kernel Debugger connection established. (Initial Breakpoint requested)
Symbol search path is: C:\WINDOWS\Symbols;SRV*E:\SymStore*http://referencesource.microsoft.com/symbols;SRV*E:\SymStore*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows XP Kernel Version 2600 MP (1 procs) Free x86 compatible
Built by: 2600.xpsp.080413-2111
Kernel base = 0x804dc000 PsLoadedModuleList = 0x805684c0
System Uptime: not available
Break instruction exception - code 80000003 (first chance)
... snipped breakpoint warning message ...
nt!RtlpBreakWithStatusInstruction:
804e7a42 cc int 3
kd> sxe ld:autochk
kd> g
nt!DebugService2+0x10:
8050ae56 cc int 3

Setting the kernel debugger to break on the load of the AutoChk native BootExecute application resulted in our desired break. Let us consider the context of this break:

0: kd> kb
ChildEBP RetAddr Args to Child
f738d9fc 8050b2f9 f738da40 f738da10 00000003 nt!DebugService2+0x10
f738da20 805c533a f738da40 01000000 82953020 nt!DbgLoadImageSymbols+0x42
f738da70 805c51f0 82ab9c28 01000000 82953020 nt!MiLoadUserSymbols+0x169
f738dab4 8058d013 82ab9c28 01000000 f738db5c nt!MiMapViewOfImageSection+0x4b6
f738db10 80504e27 00000004 82953110 f738db5c nt!MmMapViewOfSection+0x13c
f738db6c 80590520 e165ec14 00000000 e1412398 nt!MmInitializeProcessAddressSpace+0x33d
f738dcbc 8059082f 0015f870 001f0fff 0015f7d8 nt!PspCreateProcess+0x333
f738dd10 805b54b2 0015f870 001f0fff 0015f7d8 nt!NtCreateProcessEx+0x7e
f738dd3c 804e298f 0015f870 001f0fff 0015f7d8 nt!NtCreateProcess+0x3d
f738dd3c 7c90e4f4 0015f870 001f0fff 0015f7d8 nt!KiFastCallEntry+0xfc
WARNING: Frame IP not in any known module. Following frames may be wrong.
0015f830 00000000 00000000 00000000 00000000 0x7c90e4f4
0: kd> !process 0 0
**** NT ACTIVE PROCESS DUMP ****
PROCESS 82bc9830 SessionId: none Cid: 0004 Peb: 00000000 ParentCid: 0000
DirBase: 02f40000 ObjectTable: e1002e40 HandleCount: 46.
Image: System

PROCESS 82935128 SessionId: none Cid: 0218 Peb: 7ffd9000 ParentCid: 0004
DirBase: 0899b000 ObjectTable: e13fbc68 HandleCount: 7.
Image: smss.exe

Although AutoChk has been mapped into memory, the AutoChk process is still in the process of being created. Indeed, the AutoChk process is as of yet absent from the system process list displayed by the !process debugger extension command.

However, AutoChk’s pseudo-created state does not prevent us from taking this opportunity to set up a debug breakpoint at the top of user code:

1: kd> lm m autochk
start end module name
01000000 01092000 autochk (deferred)
1: kd> bp autochk!NtProcessStartup
1: kd> bl
0 e 0100dd3d 0001 (0001) autochk!NtProcessStartup

Beware that if you perform a symbol reload with the .reload command after the module load event for autochk has fired off, you may find that it has disappeared from the debugger’s loaded module list… Just make sure you set up your breakpoint immediately after the event break.

It is easy enough to set a breakpoint at the application’s NtProcessStartup entrypoint before the EPROCESS is available, but we may wish to set early breakpoints in process context elsewhere. To that end, we may proceed from the module load event break to the return from the process creation API, at which point the process is listed in the system process list:

1: kd> k
ChildEBP RetAddr
f7b619fc 8050b2f9 nt!DebugService2+0x10
f7b61a20 805c533a nt!DbgLoadImageSymbols+0x42
f7b61a70 805c51f0 nt!MiLoadUserSymbols+0x169
f7b61ab4 8058d013 nt!MiMapViewOfImageSection+0x4b6
f7b61b10 80504e27 nt!MmMapViewOfSection+0x13c
f7b61b6c 80590520 nt!MmInitializeProcessAddressSpace+0x33d
f7b61cbc 8059082f nt!PspCreateProcess+0x333
f7b61d10 805b54b2 nt!NtCreateProcessEx+0x7e
f7b61d3c 804e298f nt!NtCreateProcess+0x3d
f7b61d3c 7c90e4f4 nt!KiFastCallEntry+0xfc
0015f830 00000000 ntdll!KiFastSystemCallRet
1: kd> gu; gu; gu; gu; gu; gu; gu
nt!NtCreateProcessEx+0x7e:
8059082f e87a76f5ff call nt!_SEH_epilog (804e7eae)
1: kd> !process 0 0
**** NT ACTIVE PROCESS DUMP ****
PROCESS 82bc9830 SessionId: none Cid: 0004 Peb: 00000000 ParentCid: 0000
DirBase: 02f40000 ObjectTable: e1002e40 HandleCount: 46.
Image: System

PROCESS 829b4128 SessionId: none Cid: 0228 Peb: 7ffd7000 ParentCid: 0004
DirBase: 08bbb000 ObjectTable: e1468f58 HandleCount: 8.
Image: smss.exe

PROCESS 8294a3d0 SessionId: none Cid: 0238 Peb: 7ffd6000 ParentCid: 0228
DirBase: 08d40000 ObjectTable: e13fd408 HandleCount: 0.
Image: autochk.exe

By examining our location in the call stack after the module load event fires, we can see that returning from the Process Manager’s process creation routine PspCreateProcess would require going up 7 times. With that routine’s execution completed, the EPROCESS for autochk is now listed in the system process list and its value can be used as a context parameter for breakpoint commands, etc.
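Counting frames by hand is error-prone, so here is a small Python sketch that derives the “gu” chain from pasted “k” output. The helper is hypothetical and does only naive text parsing of the format shown above: it takes the last column of each frame line as the symbol and assumes the frame of interest appears exactly once.

```python
def gu_command_for(stack_text, function):
    """Return a 'gu; gu; ...' command that returns from `function`.

    stack_text is the output of the debugger's "k" command, innermost
    frame first. Frame N (0-based) needs N + 1 "go up" steps to return.
    """
    frames = []
    for line in stack_text.splitlines():
        parts = line.split()
        # Frame lines look like: ChildEBP RetAddr module!symbol+offset
        if len(parts) >= 3 and "!" in parts[-1]:
            frames.append(parts[-1].split("+")[0])
    index = frames.index(function)  # raises ValueError if absent
    return "; ".join(["gu"] * (index + 1))


if __name__ == "__main__":
    k_output = """\
ChildEBP RetAddr
f7b619fc 8050b2f9 nt!DebugService2+0x10
f7b61a20 805c533a nt!DbgLoadImageSymbols+0x42
f7b61a70 805c51f0 nt!MiLoadUserSymbols+0x169
f7b61ab4 8058d013 nt!MiMapViewOfImageSection+0x4b6
f7b61b10 80504e27 nt!MmMapViewOfSection+0x13c
f7b61b6c 80590520 nt!MmInitializeProcessAddressSpace+0x33d
f7b61cbc 8059082f nt!PspCreateProcess+0x333
f7b61d10 805b54b2 nt!NtCreateProcessEx+0x7e
"""
    print(gu_command_for(k_output, "nt!PspCreateProcess"))
```

For the stack above, PspCreateProcess is the seventh frame from the top, which is where the seven “gu” commands come from.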

With the breakpoint on the native entrypoint in place, we can resume system execution and have the kernel debugger land right where we want it:

1: kd> g
Breakpoint 0 hit
autochk!NtProcessStartup:
001b:0100dd3d 8bff mov edi,edi
1: kd> kb
ChildEBP RetAddr Args to Child
0006fff4 00000000 7ffde000 000000c8 0000010a autochk!NtProcessStartup
1: kd> .process
Implicit process is now 82935020
1: kd> .thread
Implicit thread is now 8293f020

From this point, convenient source debugging of the native application is also possible if it’s your own custom written application. The various features such as Locals, Watches, single stepping, etc., work as expected. Some quirks of kernel debugging of a user process should be taken into consideration (make sure breakpoints have an EPROCESS and ETHREAD context specified when appropriate to avoid venturing into other processes by accident, etc.) and the inaccessibility of some user-mode debugger extension commands may prove inconvenient.

Sure beats DbgPrints, though!

Replacing boot load drivers with the Windows Boot Debugger

Recently, I’ve been assigned to work on fixing several bugs in a Windows file system filter driver. Debugging native code has always been characterized by the tedious and cumbersome modify, compile and link, copy, run, repeat… cycle, but in the case of kernel-mode development, the overhead of that cycle is even more acute.

I’ve found that booting the target system or virtual machine every time you want to replace a driver file with an updated build and then rebooting to have the new driver loaded significantly prolongs the cycle. Therefore, I was happy to discover Windbg’s .kdfiles command.

The .kdfiles command configures the kernel debugger’s driver replacement map. Whenever the NT Memory Manager attempts to load a driver image, it consults the kernel debugger, if attached, asking it for an alternative driver image. If the debugger has one, it is transmitted over the kernel debugging connection from the host to the target, and used in lieu of the target’s local driver image.

Using the driver replacement map makes it easier to replace a driver with an updated version. However, in its usual form, the replacement map feature has a significant limitation – it cannot replace boot load drivers.

To understand the logic behind this restriction, one must consider the nature of boot driver loading. While demand-start drivers are started by the user-mode Service Control Manager (SCM) and system-start drivers are loaded by NTOSKRNL’s IoInitSystem function, boot drivers are, as their name suggests, required for the system to boot and are therefore loaded by osloader, a part of ntldr (this description is for pre-Vista systems).

By the time the NT kernel is up and its Memory Manager consults the kernel debugger and its driver replacement map, it is far too late to do anything about those drivers which have been pre-loaded by the OS loader. The initial breakpoint offered by the kernel debugger is simply too late.
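
To make the distinction concrete, here is a minimal sketch that models which component loads a driver based on its service start type. The numeric values are the standard ones stored in the Start entry of a driver’s Services registry key; the loader descriptions for boot, system and demand start follow the text above, while the auto-start entry and the helper name needs_boot_debugger are my own additions for illustration.

```python
from enum import IntEnum


class StartType(IntEnum):
    """Values of the 'Start' entry under a driver's Services registry key."""
    BOOT = 0      # SERVICE_BOOT_START
    SYSTEM = 1    # SERVICE_SYSTEM_START
    AUTO = 2      # SERVICE_AUTO_START
    DEMAND = 3    # SERVICE_DEMAND_START
    DISABLED = 4  # SERVICE_DISABLED


# Who loads the image on a pre-Vista system, per the description above.
# The AUTO entry is an addition for completeness, not taken from the text.
LOADED_BY = {
    StartType.BOOT: "osloader (part of ntldr)",
    StartType.SYSTEM: "NTOSKRNL IoInitSystem",
    StartType.AUTO: "Service Control Manager",
    StartType.DEMAND: "Service Control Manager",
}


def needs_boot_debugger(start: StartType) -> bool:
    """True if the image is loaded before the kernel debugger's replacement
    map is consulted, so only the boot debugger can substitute it."""
    return start == StartType.BOOT


for start in (StartType.BOOT, StartType.SYSTEM, StartType.DEMAND):
    print(start.name, "->", LOADED_BY[start],
          "| boot debugger needed:", needs_boot_debugger(start))
```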

Fortunately, Microsoft recognized the importance of providing a driver replacement map for boot load drivers and provides a somewhat esoteric solution in the form of the debug version of NTLDR.

The debug version of NTLDR expects the kernel debugger to attach to it during system startup. Unlike the kernel debugger, it is not configured with the boot.ini file and is always configured to a 115,200 baud connection on the COM1 serial port.

The documentation for .kdfiles points out that the Windows Driver Kit (WDK) bundles a debug version of NTLDR in the debug subdirectory. However, such a file is nowhere to be found there, probably because the WDK now contains the Vista checked kernel in its debug directory and the modern Vista boot loader is distinct from NTLDR. More on Windows Vista later, but for now let’s concentrate on Windows XP.

Failing to locate the debug NTLDR in the WDK, I turned back in time to the Windows Server 2003 SP1 IFS Kit, a variant of the Windows Server 2003 SP1 DDK for file system and file system filter developers. I was glad to find the ntldr_dbg file in its debug subdirectory.

However, my happiness quickly turned to disappointment when I replaced the original NTLDR with ntldr_dbg in a Windows XP virtual machine. The system refused to boot, claiming that NTLDR was corrupt. Since the debug directory in the IFS kit contains checked kernel binaries for Windows Server 2003 SP1, I figured that the provided version of ntldr_dbg is a match for that version, as well.

I turned to the archives, so to speak, and dusted off old MSDN Subscription CDs. I eventually turned up the rather antiquated Windows XP SP1 DDK. In there, I found another version of ntldr_dbg. I placed it as required and this time the system booted successfully.

It is unfortunate that one has to dig up the DDK of yore to locate the boot debugger. It really ought to be more accessible.

With the debug version of NTLDR in place, when you boot the system, right before the OS loader menu appears, you see the following message:
Boot Debugger Using: COM1 (Baud Rate 115200)

Once the message is displayed, NTLDR blocks waiting for a kernel debugger to connect. I start the kernel debugger the way I’d usually start it:
windbg -b -k com:pipe,port=\\.\pipe\com_1

Soon enough, however, it is evident that this is no ordinary kernel debugging session:

Microsoft (R) Windows Debugger Version 6.9.0003.113 X86
Copyright (c) Microsoft Corporation. All rights reserved.


Opened \\.\pipe\com_1
Waiting to reconnect...
BD: Boot Debugger Initialized
Connected to Windows Boot Debugger 2600 x86 compatible target, ptr64 FALSE
Kernel Debugger connection established. (Initial Breakpoint requested)
Symbol search path is: C:\WINDOWS\Symbols;SRV*E:\SymStore*http://referencesource.microsoft.com/symbols;SRV*E:\SymStore*http://msdl.microsoft.com/download/symbols
Executable search path is:
Module List address is NULL - debugger not initialized properly.
WARNING: .reload failed, module list may be incomplete
KdDebuggerData.KernBase < SystemRangeStart
Windows Boot Debugger Kernel Version 2600 UP Checked x86 compatible
Primary image base = 0x00000000 Loaded module list = 0x00000000
System Uptime: not available
Break instruction exception - code 80000003 (first chance)
0041cf70 cc int 3
kd>

Windbg has attached to the Windows Boot Debugger, a debugging environment provided by the debug version of NTLDR at a very early stage of system startup, well before the NT kernel has been loaded. Indeed, the initial breakpoint at the boot debugger occurs before an OS to start has been selected at the loader boot menu.

With the boot debugger at its initial breakpoint, we can set up the driver replacement map as desired. For instance, we can replace NTFS and NDIS with their counterparts from the checked build of Windows XP:

kd> .kdfiles -m \WINDOWS\system32\drivers\Ntfs.sys C:\Stuff\xpsp3checked\Ntfs.sys
Added mapping for '\WINDOWS\system32\drivers\Ntfs.sys'
kd> .kdfiles -m \WINDOWS\system32\drivers\Ndis.sys C:\Stuff\xpsp3checked\Ndis.sys
Added mapping for '\WINDOWS\system32\drivers\Ndis.sys'
kd> g
BD: osloader.exe base address 00400000
BD: \WINDOWS\system32\NTKRNLMP.CHK base address 80A02000
BD: \WINDOWS\system32\HALMACPI.CHK base address 80100000
BD: \WINDOWS\system32\KDCOM.DLL base address 80010000
BD: \WINDOWS\system32\BOOTVID.dll base address 80001000
BD: \WINDOWS\system32\DRIVERS\ACPI.sys base address 8014C000
BD: \WINDOWS\system32\DRIVERS\WMILIB.SYS base address 80007000
BD: \WINDOWS\system32\DRIVERS\pci.sys base address 80062000
BD: \WINDOWS\system32\DRIVERS\isapnp.sys base address 80012000
BD: \WINDOWS\system32\DRIVERS\compbatt.sys base address 80009000
BD: \WINDOWS\system32\DRIVERS\BATTC.SYS base address 8000C000
BD: \WINDOWS\system32\DRIVERS\intelide.sys base address 8001C000
BD: \WINDOWS\system32\DRIVERS\PCIIDEX.SYS base address 8017A000
BD: \WINDOWS\System32\Drivers\MountMgr.sys base address 80181000
BD: \WINDOWS\system32\DRIVERS\ftdisk.sys base address 8018C000
BD: \WINDOWS\System32\drivers\dmload.sys base address 8001E000
BD: \WINDOWS\System32\drivers\dmio.sys base address 801AB000
BD: \WINDOWS\System32\Drivers\PartMgr.sys base address 801D1000
BD: \WINDOWS\System32\Drivers\VolSnap.sys base address 801D6000
BD: \WINDOWS\system32\DRIVERS\atapi.sys base address 801E3000
BD: \WINDOWS\system32\DRIVERS\vmscsi.sys base address 80073000
BD: \WINDOWS\system32\DRIVERS\SCSIPORT.SYS base address 801FB000
BD: \WINDOWS\system32\DRIVERS\disk.sys base address 80213000
BD: \WINDOWS\system32\DRIVERS\CLASSPNP.SYS base address 8021C000
BD: \WINDOWS\system32\drivers\fltmgr.sys base address 80229000
BD: \WINDOWS\system32\DRIVERS\sr.sys base address 802A7000
BD: \WINDOWS\System32\Drivers\KSecDD.sys base address 802B9000
KD: Accessing 'C:\Stuff\xpsp3checked\Ntfs.sys' (\WINDOWS\System32\Drivers\Ntfs.sys)
File size 814K.... ....BD: Loaded remote file \WINDOWS\System32\Drivers\Ntfs.sys

BlLoadImageEx: Pulled \WINDOWS\System32\Drivers\Ntfs.sys from Kernel Debugger
BD: \WINDOWS\System32\Drivers\Ntfs.sys base address 802D0000
KD: Accessing 'C:\Stuff\xpsp3checked\Ndis.sys' (\WINDOWS\System32\Drivers\NDIS.sys)
File size 424K.... ....BD: Loaded remote file \WINDOWS\System32\Drivers\NDIS.sys

BlLoadImageEx: Pulled \WINDOWS\System32\Drivers\NDIS.sys from Kernel Debugger
BD: \WINDOWS\System32\Drivers\NDIS.sys base address 804DC000
Shutdown occurred...unloading all symbol tables.
Waiting to reconnect.

We can see that the boot debugger picked up our driver replacements and transferred them from the host to the target through the kernel debugger connection. Alas, this can be a lengthy process for an obese driver over the 115,200 baud link…
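
To put a number on “lengthy”, here is a back-of-the-envelope estimate of the transfer time for the two files in the log above. It is only a lower bound under stated assumptions: that “K” in the debugger output means 1024 bytes, that the serial line uses 8N1 framing (10 bits on the wire per byte), and that the debugger protocol adds no overhead, which it certainly does.

```python
def serial_transfer_seconds(size_bytes: int, baud: int = 115200,
                            bits_per_byte: int = 10) -> float:
    """Lower bound on transfer time over a serial link.

    bits_per_byte=10 assumes 8N1 framing: 1 start bit, 8 data bits, 1 stop bit.
    Protocol overhead and retransmissions are ignored.
    """
    return size_bytes * bits_per_byte / baud


ntfs_seconds = serial_transfer_seconds(814 * 1024)  # "File size 814K"
ndis_seconds = serial_transfer_seconds(424 * 1024)  # "File size 424K"

print(f"Ntfs.sys: at least {ntfs_seconds:.0f} seconds")
print(f"NDIS.sys: at least {ndis_seconds:.0f} seconds")
```

Under these assumptions the checked Ntfs.sys alone takes over a minute per boot, and NDIS.sys adds more than half a minute on top of that.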

Beyond being useful for replacing your own drivers, which is what I had in mind when I looked into this feature, the boot debugger can be used to easily go back and forth between Windows free build and checked build operating system components, as illustrated above. However, such use is not without its problems.

For one, replacing the kernel and the HAL with their checked counterparts through the driver replacement map does not work. An error citing kernel corruption results from such an attempt. The traditional way of using a checked kernel, by placing an appropriate entry in boot.ini, is still required.

When testing a file system filter driver, apart from using the checked version of the I/O Manager through the use of the checked NT kernel, it is advantageous to use checked versions of underlying file system drivers such as NTFS. The checked versions can assert when you pass on requests to them in a way which violates the file system’s locking hierarchy and which may lead to deadlocks. Replacing the NTFS driver with the driver replacement map feature worked as expected, apart from causing NDIS to bugcheck during system boot with some sort of paging error. The issue was resolved by replacing NDIS with its checked counterpart through the driver replacement map, as well.

However, for a reason I do not understand, there was no such luck when substituting the checked build of the Filter Manager, which is useful for debugging file system minifilters. After transferring the checked Filter Manager, the boot loader complained that the NTFS driver was corrupt. I disabled System File Protection and replaced the free drivers with the checked drivers on disk, the traditional way, and the system booted successfully with the checked NTFS and Filter Manager. So it appears that the boot-time driver replacement map feature can be a bit flaky…

It is probably best to place checked operating system components the traditional way and only replace your own, frequently modified drivers with the boot debugger and the driver replacement map.
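
For that workflow, typing the mappings by hand at every boot gets old quickly. The following sketch generates the .kdfiles -m commands in the same form used in the session above, ready to be pasted at the boot debugger’s initial breakpoint. The driver name and build output directory are hypothetical, and the helper makes no attempt to handle paths containing spaces.

```python
def kdfiles_commands(mappings):
    """Build one '.kdfiles -m' command per (target path, host path) pair."""
    return [f".kdfiles -m {target} {host}" for target, host in mappings]


# Hypothetical driver and build output location, for illustration only.
my_drivers = [
    (r"\WINDOWS\system32\drivers\myfilter.sys",
     r"E:\Projects\myfilter\objchk\i386\myfilter.sys"),
]

for command in kdfiles_commands(my_drivers):
    print(command)
```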

So much for Windows XP and the legacy NTLDR. But what about Windows Vista?

At first, the situation looked promising. In Windows Vista, the boot debugger is built-in. It can, for instance, be enabled for an existing boot entry with the Boot Configuration Database editor from an elevated command prompt:

C:\Windows\system32>bcdedit /enum

Windows Boot Manager
--------------------
identifier {bootmgr}
device partition=C:
description Windows Boot Manager
locale en-US
inherit {globalsettings}
default {current}
displayorder {current}
{5761b19a-1e8a-11dd-bcd4-000c29797dc6}
toolsdisplayorder {memdiag}
timeout 30

Windows Boot Loader
-------------------
identifier {current}
device partition=C:
path \Windows\system32\winload.exe
description Microsoft Windows Vista
locale en-US
inherit {bootloadersettings}
osdevice partition=C:
systemroot \Windows
resumeobject {694d30db-e737-11dc-814f-e01223f3682a}
nx OptIn

Windows Boot Loader
-------------------
identifier {5761b19a-1e8a-11dd-bcd4-000c29797dc6}
device partition=C:
path \Windows\system32\winload.exe
description Debugging
locale en-US
inherit {bootloadersettings}
osdevice partition=C:
systemroot \Windows
resumeobject {694d30db-e737-11dc-814f-e01223f3682a}
nx OptIn
debug Yes

C:\Windows\system32>bcdedit /bootdebug {5761b19a-1e8a-11dd-bcd4-000c29797dc6} ON

The operation completed successfully.

Unlike the XP boot debugger, the Vista boot debugger is set for a specific boot loader menu entry. Once we reboot and pick the entry for which boot debugging is enabled, we can attach:

Microsoft (R) Windows Debugger Version 6.9.0003.113 X86
Copyright (c) Microsoft Corporation. All rights reserved.

Opened \\.\pipe\com_1
Waiting to reconnect...
BD: Boot Debugger Initialized
Connected to Windows Boot Debugger 6001 x86 compatible target, ptr64 FALSE
Kernel Debugger connection established. (Initial Breakpoint requested)
Symbol search path is: C:\WINDOWS\Symbols;SRV*E:\SymStore*http://referencesource.microsoft.com/symbols;SRV*E:\SymStore*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows Boot Debugger Kernel Version 6001 UP Free x86 compatible
Primary image base = 0x00584000 Loaded module list = 0x00684e78
System Uptime: not available
Break instruction exception - code 80000003 (first chance)
winload!RtlpBreakWithStatusInstruction:
005bce88 cc int 3
kd> k
ChildEBP RetAddr
00120c6c 005b0862 winload!RtlpBreakWithStatusInstruction
00120e84 005b0760 winload!vDbgPrintExWithPrefixInternal+0x100
00120e94 0058bdaf winload!DbgPrint+0x11
00120eb0 0058bf6d winload!BlBdStart+0x81
00120f48 005a2f88 winload!BlBdInitialize+0x172
00120f64 005a28c2 winload!InitializeLibrary+0x168
00120f7c 0058513a winload!BlInitializeLibrary+0x42
00120fe8 0044646a winload!OslMain+0x13a
WARNING: Frame IP not in any known module. Following frames may be wrong.
00000000 f000ff53 0x44646a
00000000 00000000 0xf000ff53

We can see that in Vista, the boot debugger’s initial break is in the new winload.exe, replacing the osloader.exe embedded in ntldr of yesteryear. At this point the boot load drivers have yet to be loaded, so it would be perfect to set the .kdfiles driver replacement map at this point.

Alas, no such luck. It turns out the boot load driver replacement map feature is MIA in Windows Vista. This is confirmed by Microsoft’s Doron Holan in a reply to a post (free registration required) in OSR’s WINDBG mailing list. It is unclear what the point is of bundling the boot debugger with the regular operating system, unlike in the case of the hard to find ntldr_dbg for XP, only for it to be completely useless… Anyone using the boot debugger for purposes other than boot load driver replacement is probably working for Microsoft, so why should the boot debugger be a part of the OS if it is now missing what seems to be its most important functionality?

Hopefully the boot load driver replacement map will make a comeback in the Windows 7 boot debugger…

Remote Procedure Call debugging

Recently, I discussed how one would go about finding the other end of an LPC (Local inter-Process Communication, rather than Local Procedure Call, apparently) port. LPC is used directly through the native API for some Windows components such as LSA, but is more frequently used by third parties in the form of the “ncalrpc” RPC transport. When dealing with those cases, or cases where the higher level RPC runtime is used in general (e.g., with the named pipes or TCP transports), we must turn to a whole other family of techniques.

While in the case of LPC analysis we turned to the aid of the kernel debugger, in the case of RPC we can utilize built-in instrumentation found in the Windows RPC runtime library. Since RPC debugging may come to involve a variety of distributed scenarios, rather than opting for a plain registry setting enabling instrumentation, Microsoft chose to provide control through the group policy facility.

Enabling the runtime’s debugging aid is a prerequisite to any useful analysis work. Follow the instructions in the MSDN page “Enabling RPC State Information” and restart the system. Usually you’ll be able to make do with the “Server” setting.

For illustration purposes, we shall consider the HELLO RPC sample available with the Microsoft Windows SDK. The HELLO sample includes an IDL file specifying a trivial illustrative interface providing the HelloProc remote call that passes a string to the server side and the Shutdown remote call that instructs the server to shut down. Let’s run the HELLO server process.

In order to diagnose a product using RPC we must figure out the server endpoint of interest. Our primary tool will be the “dbgrpc” utility distributed with the Debugging Tools for Windows. With RPC state information enabled, we begin by enumerating RPC endpoints:

C:\Program Files\Debugging Tools for Windows>dbgrpc -e
Searching for endpoint info ...
PID CELL ID ST PROTSEQ ENDPOINT
-------------------------------------------------------------
0274 0000.0001 01 LRPC IUserProfile
0274 0000.0003 01 LRPC sclogonrpc
0274 0000.0005 01 NMP \PIPE\InitShutdown
0274 0000.0007 01 NMP \PIPE\SfcApi
0274 0000.000a 01 NMP \pipe\winlogonrpc
0274 0000.000e 01 LRPC OLEFEB89B1D900E460783A2A6ABA
02a0 0000.0001 01 LRPC ntsvcs
02a0 0000.0003 01 NMP \pipe\ntsvcs
02a0 0000.0006 01 NMP \PIPE\scerpc
02ac 0000.0001 01 NMP \PIPE\lsass
02ac 0000.0003 01 LRPC audit
02ac 0000.0005 01 LRPC securityevent
02ac 0000.0007 01 LRPC protected_storage
02ac 0000.0009 01 NMP \PIPE\protected_storage
034c 0000.0001 01 LRPC actkernel
034c 0000.0005 01 LRPC IcaApi
034c 0000.0007 01 NMP \pipe\Ctx_WinStation_API_ser
03a4 0000.0001 01 LRPC epmapper
03a4 0000.0003 01 TCP 135
03a4 0000.000a 01 NMP \pipe\epmapper
0414 0000.0001 01 LRPC dhcpcsvc
0414 0000.0003 01 LRPC wzcsvc
0414 0000.0005 01 LRPC OLEA390A47C8A6F4EA78EA712E62
0414 0000.0009 01 NMP \PIPE\atsvc
0414 0000.000e 01 LRPC AudioSrv
0414 0000.0010 01 NMP \PIPE\wkssvc
0414 0000.0011 01 NMP \pipe\keysvc
0414 0000.0012 01 LRPC keysvc
0414 0000.0014 01 LRPC SECLOGON
0414 0000.0016 01 NMP \pipe\trkwks
0414 0000.0017 01 LRPC trkwks
0414 0000.001a 01 NMP \PIPE\srvsvc
0414 0000.001d 01 LRPC srrpc
0414 0000.001f 01 LRPC senssvc
0414 0000.0021 01 NMP \PIPE\W32TIME
04ec 0000.0001 01 LRPC DNSResolver
0548 0000.0001 01 NMP \PIPE\DAV RPC SERVICE
0548 0000.0003 01 NMP \PIPE\winreg
0548 0000.0004 01 LRPC LRPC00000548.00000001
05e4 0000.0001 01 NMP \pipe\spoolss
05e4 0000.0003 01 LRPC spoolss
05e4 0000.0006 01 LRPC OLE8BC761BE0AFF4D9CA9603B53B
0684 0000.0001 01 LRPC OLE872E70B024824F8894A85E384
00ac 0000.0001 01 LRPC OLEAA4283CA4B51483E95665C439
0204 0000.0001 01 LRPC OLEDBAAFA32AEBF41AD808B50A1B
0594 0000.0001 01 LRPC OLEA0D6A971EC424B7DB839E9308
0314 0000.0001 01 LRPC hello

Endpoint enumeration gives you an idea of available RPC services in a server system. Since the HELLO server process was the last one launched, it is conveniently found at the bottom of the output.

Without repeating too much of the RPC debugging primer in the Windbg documentation, I’ll just point out the important fact that RPC state information is organized into “cells” in each process. Through the use of a simple endpoint enumeration command, we’ve already concluded that the HELLO server process is PID 0x314. Not an impressive feat for a process we just launched, but consider that this could easily be a third-party RPC server started as a service or on demand in an unknown executable.
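
When the endpoint list is long, picking out the owning process can be scripted. Below is a minimal sketch that parses rows in the layout shown in the dbgrpc -e output above and returns the PIDs serving a given endpoint name. The column layout is inferred from that output alone, so treat the parser as an assumption rather than a description of a stable format.

```python
SAMPLE_ENDPOINTS = r"""
PID CELL ID ST PROTSEQ ENDPOINT
-------------------------------------------------------------
03a4 0000.0003 01 TCP 135
0548 0000.0001 01 NMP \PIPE\DAV RPC SERVICE
0314 0000.0001 01 LRPC hello
"""


def parse_endpoints(text):
    """Parse 'dbgrpc -e' style rows: PID, cell ID, state, protseq, endpoint."""
    rows = []
    for line in text.splitlines():
        # Split at most 4 times so endpoint names containing spaces stay whole.
        parts = line.split(None, 4)
        if len(parts) != 5:
            continue  # blank line or separator
        pid, cell, state, protseq, endpoint = parts
        try:
            pid_value = int(pid, 16)  # PIDs are printed in hexadecimal
        except ValueError:
            continue  # header or status line
        rows.append({"pid": pid_value, "cell": cell, "state": state,
                     "protseq": protseq, "endpoint": endpoint.strip()})
    return rows


def pids_for_endpoint(rows, name):
    """Return the PIDs of all processes exposing the named endpoint."""
    return [row["pid"] for row in rows if row["endpoint"] == name]


endpoint_rows = parse_endpoints(SAMPLE_ENDPOINTS)
for pid in pids_for_endpoint(endpoint_rows, "hello"):
    print(f"hello is served by PID 0x{pid:x}")
```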

Most of the time, we can associate the endpoint name with the application of interest since a descriptive string is being used. However, in other cases, we may know the server application of interest, but the endpoint name is unknown, random or auto-generated. When there’s just one endpoint, we can just find the process of interest in the dbgrpc endpoint enumeration output. In any case, we can examine the call used by the server application to the RPC runtime to determine which endpoint name is in use:


0:000> bp rpcrt4!RpcServerUseProtseqEpA
0:000> g
Breakpoint 0 hit
eax=00452000 ebx=7ffd5000 ecx=00452008 edx=00000014 esi=00d5f55c edi=7c911970
eip=77e97a0b esp=0012ff3c ebp=0012ff6c iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000206
RPCRT4!RpcServerUseProtseqEpA:
77e97a0b 8bff mov edi,edi
0:000> kb
ChildEBP RetAddr Args to Child
0012ff38 00401046 00452000 00000014 00452008 RPCRT4!RpcServerUseProtseqEpA
0012ff6c 00401e37 00000001 003330a0 00333120 hellos!main+0x46 [e:\projects\hello\hellos.c @ 21]
0012ffb8 00401d0f 0012fff0 7c816ff7 7c911970 hellos!__tmainCRTStartup+0x117 [f:\dd\vctools\crt_bld\self_x86\crt\src\crt0.c @ 266]
0012ffc0 7c816ff7 7c911970 00d5f55c 7ffd5000 hellos!mainCRTStartup+0xf [f:\dd\vctools\crt_bld\self_x86\crt\src\crt0.c @ 182]
0012fff0 00000000 00401d00 00000000 78746341 kernel32!BaseProcessStart+0x23

We note that the third argument to RpcServerUseProtseqEp specifies the server endpoint name:

0:000> da 00452008
00452008 "hello"

Note that more complex varieties of RPC servers may use alternative approaches for endpoint name selection that do not utilize the aforementioned API.

When debugging a remote call, narrowing the server side down to a process may prove to be insufficient. Fortunately, we can continue and extract thread information. Consider an endpoint list entry for a running HELLO server:

0314 0000.0001 01 LRPC hello

Let’s examine thread information for this RPC server process:

C:\Program Files\Debugging Tools for Windows>dbgrpc -t -P 314
Searching for thread info ...
PID  CELL ID   ST TID       ENDPOINT LASTTIME
---------------------------------------------
0314 0000.0002 03 000000f0 0000.0001 003ffcad

We can see that a thread associated with cell ID 2 is associated with the endpoint at cell ID 1. If this were a server process serving multiple endpoints, we’d be able to filter the threads of interest by ignoring those associated with other endpoints.
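
The filtering described above can be sketched as follows. The first sample row is the one from the listing above; the second row, belonging to a different endpoint cell, is made up to show the filter at work. The six-column layout is again inferred from the dbgrpc -t output rather than from any specification.

```python
SAMPLE_THREADS = """
PID CELL ID ST TID ENDPOINT LASTTIME
---------------------------------------------
0314 0000.0002 03 000000f0 0000.0001 003ffcad
0314 0000.0006 03 00000a10 0000.0005 003ffcad
"""
# The second row above is hypothetical; it does not come from the real session.


def threads_for_endpoint(text, endpoint_cell):
    """Return thread IDs (as integers) associated with the given endpoint cell."""
    tids = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly six fields; the header splits into seven.
        if len(parts) != 6 or parts[4] != endpoint_cell:
            continue
        tids.append(int(parts[3], 16))  # TID column is hexadecimal
    return tids


for tid in threads_for_endpoint(SAMPLE_THREADS, "0000.0001"):
    # Print in the debugger's pid.tid notation, e.g. 314.f0
    print(f"314.{tid:x}")
```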

We can use the thread ID returned by dbgrpc to find the thread in the debugger:

C:\Program Files\Debugging Tools for Windows>cdb -p 0x314
Microsoft (R) Windows Debugger Version 6.9.0003.113 X86
Copyright (c) Microsoft Corporation. All rights reserved.
*** wait with pending attach
Symbol search path is: SRV*C:\websymbols*\\.host\Shared Folders\SymStore*http://
msdl.microsoft.com/download/symbols
Executable search path is:
ModLoad: 00400000 00455000 C:\Documents and Settings\AdminUser\Desktop\hellos.
exe
ModLoad: 7c900000 7c9b0000 C:\WINDOWS\system32\ntdll.dll
ModLoad: 7c800000 7c8f5000 C:\WINDOWS\system32\kernel32.dll
ModLoad: 77e70000 77f01000 C:\WINDOWS\system32\RPCRT4.dll
ModLoad: 77dd0000 77e6b000 C:\WINDOWS\system32\ADVAPI32.dll
(314.674): Break instruction exception - code 80000003 (first chance)
eax=7ffde000 ebx=00000001 ecx=00000002 edx=00000003 esi=00000004 edi=00000005
eip=7c901230 esp=0036ffcc ebp=0036fff4 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00000246
ntdll!DbgBreakPoint:
7c901230 cc int 3
0:002> ~
0 Id: 314.534 Suspend: 1 Teb: 7ffdd000 Unfrozen
1 Id: 314.f0 Suspend: 1 Teb: 7ffdc000 Unfrozen
. 2 Id: 314.674 Suspend: 1 Teb: 7ffdb000 Unfrozen
0:002> ~1 s
eax=00350020 ebx=00000000 ecx=00144530 edx=ffffffff esi=00144878 edi=00144a80
eip=7c90eb94 esp=0055fe18 ebp=0055ff80 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
ntdll!KiFastSystemCallRet:
7c90eb94 c3 ret
0:001> kb
ChildEBP RetAddr Args to Child
0055fe14 7c90e399 77e765d3 000007c8 0055ff74 ntdll!KiFastSystemCallRet
0055fe18 77e765d3 000007c8 0055ff74 00000000 ntdll!NtReplyWaitReceivePortEx+0xc
0055ff80 77e76c9f 0055ffa8 77e76ac1 00144878 RPCRT4!LRPC_ADDRESS::ReceiveLotsaCa
lls+0x12a
0055ff88 77e76ac1 00144878 7c90ee18 0012faf8 RPCRT4!RecvLotsaCallsWrapper+0xd
0055ffa8 77e76c87 00144218 0055ffec 7c80b6a3 RPCRT4!BaseCachedThreadRoutine+0x79
0055ffb4 7c80b6a3 00144a80 7c90ee18 0012faf8 RPCRT4!ThreadStartRoutine+0x1a
0055ffec 00000000 77e76c6d 00144a80 00000000 kernel32!BaseThreadStart+0x37
0:001>

Now, let’s add a breakpoint in the server-side implementation of the HelloProc remote call, run the HELLO client and see the context:

0:001> bp hellos!HelloProc
0:001> g
Breakpoint 0 hit
eax=004010f0 ebx=0055fd0c ecx=00000000 edx=00144c00 esi=0055f908 edi=0055f8e4
eip=004010f0 esp=0055f8e4 ebp=0055f8f8 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
hellos!HelloProc:
004010f0 55 push ebp
0:001> k
ChildEBP RetAddr
0055f8e0 77e799dc hellos!HelloProc
0055f8f8 77ef321a RPCRT4!Invoke+0x30
0055fcf4 77ef36ee RPCRT4!NdrStubCall2+0x297
0055fd10 77e794a5 RPCRT4!NdrServerCall2+0x19
0055fd44 77e7940a RPCRT4!DispatchToStubInC+0x38
0055fd98 77e79336 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0x113
0055fdbc 77e7be3c RPCRT4!RPC_INTERFACE::DispatchToStub+0x84
0055fdf8 77e7bc99 RPCRT4!LRPC_SCALL::DealWithRequestMessage+0x2db
0055fe1c 77e7bbdd RPCRT4!LRPC_ADDRESS::DealWithLRPCRequest+0x16d
0055ff80 77e76c9f RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0x310
0055ff88 77e76ac1 RPCRT4!RecvLotsaCallsWrapper+0xd
0055ffa8 77e76c87 RPCRT4!BaseCachedThreadRoutine+0x79
0055ffb4 7c80b6a3 RPCRT4!ThreadStartRoutine+0x1a
0055ffec 00000000 kernel32!BaseThreadStart+0x37
0:001>

As expected, thread 1 is the one servicing the remote procedure call received at the endpoint. So even if we didn’t know the specific function being called on the server side, we could have followed the worker thread’s execution flow into the indirect call in NdrStubCall2 until arriving at the function of interest.
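
Conceptually, that indirect call is a lookup in a per-interface table of server routines indexed by procedure number. The toy model below illustrates the idea only; it says nothing about how RPCRT4 is actually implemented, and the ordering of the two HELLO routines in the table is an assumption based on the order in which the interface was described earlier.

```python
def hello_proc(message):
    """Stand-in for the server-side HelloProc implementation."""
    return f"HelloProc received: {message}"


def shutdown():
    """Stand-in for the server-side Shutdown implementation."""
    return "Shutdown requested"


# One slot per remote procedure in the interface (assumed order).
DISPATCH_TABLE = [hello_proc, shutdown]


def dispatch(proc_num, *args):
    """Invoke the server routine stored in the given slot."""
    return DISPATCH_TABLE[proc_num](*args)


print(dispatch(0, "hello, world"))
print(dispatch(1))
```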

Another RPC behavior we can notice at this point is the spawning of an additional worker thread by the RPC runtime, since the current one is busy servicing the HelloProc call. While HelloProc is broken into, we note the dbgrpc thread list:

C:\Program Files\Debugging Tools for Windows>dbgrpc -t -P 0x314
Searching for thread info ...
PID CELL ID ST TID ENDPOINT LASTTIME
---------------------------------------------
0314 0000.0002 01 000000f0 0000.0001 0045c6f9
0314 0000.0003 03 00000218 0000.0001 0045c6f9

Notice how two threads are now associated with our endpoint. We can examine the new thread in the debugger:

0:001> ~
0 Id: 314.534 Suspend: 1 Teb: 7ffdd000 Unfrozen
. 1 Id: 314.f0 Suspend: 1 Teb: 7ffdc000 Unfrozen
2 Id: 314.218 Suspend: 1 Teb: 7ffdb000 Unfrozen
0:001> ~2 k
ChildEBP RetAddr
0065fe14 7c90e399 ntdll!KiFastSystemCallRet
0065fe18 77e765d3 ntdll!NtReplyWaitReceivePortEx+0xc
0065ff80 77e76c9f RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0x12a
0065ff88 77e76ac1 RPCRT4!RecvLotsaCallsWrapper+0xd
0065ffa8 77e76c87 RPCRT4!BaseCachedThreadRoutine+0x79
0065ffb4 7c80b6a3 RPCRT4!ThreadStartRoutine+0x1a
0065ffec 00000000 kernel32!BaseThreadStart+0x37

The stack trace is consistent with another RPC worker thread on the endpoint. It’s nice of the RPC runtime to provide these thread management services for us.

In a situation where a process has multiple RPC worker threads servicing an endpoint, it can be difficult to figure out which worker thread will pick up the call, unlike in the degenerate case discussed above. In the more complicated cases, we can utilize server call (“SCALL”) information provided by dbgrpc. With the server process at a break and the client process having performed a remote call, we enumerate the server’s calls:

C:\Program Files\Debugging Tools for Windows>dbgrpc -c -P 314
Searching for call info ...
PID CELL ID ST PNO IFSTART THRDCELL CALLFLAG CALLID LASTTIME CONN/CLN
----------------------------------------------------------------------------
0314 0000.0004 02 000 7a98c250 0000.0002 00000009 00000000 0045c6f9 05d8.00d0

This is pretty awesome. The listing notes that the SCALL has cell identifier 0.4. We can get a more verbose view of that cell by passing its identifier to dbgrpc:

C:\Program Files\Debugging Tools for Windows>dbgrpc -l -P 314 -L 0.4
Getting cell info ...
Call
Status: Dispatched
Procedure Number: 0
Interface UUID start (first DWORD only): 7A98C250
Call ID: 0x0 (0)
Servicing thread identifier: 0x0.2
Call Flags: cached, LRPC
Last update time (in seconds since boot):4572.921 (0x11DC.399)
Caller (PID/TID) is: 5d8.d0 (1496.208)

While we used endpoint enumeration and thread cell enumeration to find the server side, we can use SCALL enumeration to find our clients. Let’s see what’s going on at process 0x5d8 in thread d0:

C:\Program Files\Debugging Tools for Windows>cdb -p 0x5d8
Microsoft (R) Windows Debugger Version 6.9.0003.113 X86
Copyright (c) Microsoft Corporation. All rights reserved.
*** wait with pending attach
Symbol search path is: SRV*C:\websymbols*\\.host\Shared Folders\SymStore*http://
msdl.microsoft.com/download/symbols
Executable search path is:
ModLoad: 00400000 00455000 C:\Documents and Settings\AdminUser\Desktop\helloc.
exe
ModLoad: 7c900000 7c9b0000 C:\WINDOWS\system32\ntdll.dll
ModLoad: 7c800000 7c8f5000 C:\WINDOWS\system32\kernel32.dll
ModLoad: 77e70000 77f01000 C:\WINDOWS\system32\RPCRT4.dll
ModLoad: 77dd0000 77e6b000 C:\WINDOWS\system32\ADVAPI32.dll
(5d8.3a0): Break instruction exception - code 80000003 (first chance)
eax=7ffd7000 ebx=00000001 ecx=00000002 edx=00000003 esi=00000004 edi=00000005
eip=7c901230 esp=0035ffcc ebp=0035fff4 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00000246
ntdll!DbgBreakPoint:
7c901230 cc int 3
0:001> ~
0 Id: 5d8.d0 Suspend: 1 Teb: 7ffdf000 Unfrozen
. 1 Id: 5d8.3a0 Suspend: 1 Teb: 7ffde000 Unfrozen
0:001> ~0 s
eax=77ea19bb ebx=00145618 ecx=00144a78 edx=00000000 esi=0012fb68 edi=0012fb3c
eip=7c90eb94 esp=0012fab4 ebp=0012fb00 iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000206
ntdll!KiFastSystemCallRet:
7c90eb94 c3 ret
0:000> kb
ChildEBP RetAddr Args to Child
0012fab0 7c90e3ed 77e7ca99 000007c4 00145820 ntdll!KiFastSystemCallRet
0012fab4 77e7ca99 000007c4 00145820 00145820 ntdll!ZwRequestWaitReplyPort+0xc
0012fb00 77e7a326 00145858 0012fb20 77e7a357 RPCRT4!LRPC_CCALL::SendReceive+0x22
8
0012fb0c 77e7a357 0012fb3c 00444290 0012ff18 RPCRT4!I_RpcSendReceive+0x24
0012fb20 77ef3675 0012fb68 00145871 08efa12c RPCRT4!NdrSendReceive+0x2b
*** WARNING: Unable to verify checksum for C:\Documents and Settings\AdminUser\D
esktop\helloc.exe
0012fefc 004011b6 00444290 00444246 0012ff18 RPCRT4!NdrClientCall2+0x222
0012ff10 004010c5 00452010 058dc64f 08efa12c helloc!HelloProc+0x16
0012ff6c 004020d7 00000001 00332fe0 00333050 helloc!main+0xc5
0012ffb8 00401faf 0012fff0 7c816ff7 08efa12c helloc!__tmainCRTStartup+0x117
0012ffc0 7c816ff7 08efa12c 01c8c807 7ffd7000 helloc!mainCRTStartup+0xf
0012fff0 00000000 00401fa0 00000000 78746341 kernel32!BaseProcessStart+0x23
0:000>

We can clearly see the HelloProc client-side stub invoking NdrClientCall2 to perform the remote procedure call to our server process.

Note that the SCALL information also includes the beginning of the RPC interface GUID (IfStart) and the slot (ProcNum) in the interface being invoked (think of the RPC interface as a C++ vtable) – this can be useful if we are looking for the server side implementation of an unknown interface and multiple interfaces are being exported by the server process.
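
As a worked example, the SCALL row from the dbgrpc -c listing above can be decoded field by field. All numeric columns are treated as hexadecimal here, which is an assumption for the procedure number. Reading LASTTIME as milliseconds since boot is an inference from the verbose cell view, where 0x0045c6f9 lines up with 4572.921 seconds.

```python
SCALL_ROW = ("0314 0000.0004 02 000 7a98c250 0000.0002 "
             "00000009 00000000 0045c6f9 05d8.00d0")


def decode_scall(row):
    """Decode one 'dbgrpc -c' row into named, numeric fields."""
    (pid, cell, status, proc_num, if_start,
     thread_cell, call_flags, call_id, last_time, caller) = row.split()
    caller_pid, caller_tid = (int(part, 16) for part in caller.split("."))
    return {
        "server_pid": int(pid, 16),
        "cell": cell,
        "proc_num": int(proc_num, 16),        # slot in the interface
        "if_start": int(if_start, 16),        # first DWORD of the interface UUID
        "thread_cell": thread_cell,           # cell of the servicing thread
        "last_time_seconds": int(last_time, 16) / 1000,  # assumed milliseconds
        "caller_pid": caller_pid,
        "caller_tid": caller_tid,
    }


scall = decode_scall(SCALL_ROW)
print(f"caller: {scall['caller_pid']:x}.{scall['caller_tid']:x} "
      f"({scall['caller_pid']}.{scall['caller_tid']})")
print(f"procedure number: {scall['proc_num']}")
print(f"last update: {scall['last_time_seconds']} seconds since boot")
```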

You can figure out more techniques for using dbgrpc and the Windbg RPC debugging extension by going over Windbg’s RPC debugging documentation. I found the need for the above primer since the documentation is not exactly organized in tutorial form and can be daunting for the uninitiated.

There is another RPC debugging trick up our sleeve. I shall make an exception to my usual habit and discuss the “other” debugger, Visual Studio’s. The Visual Studio Debugger has an extremely powerful feature, unfortunately missing from Windbg, for RPC and COM debugging. Take a look at the documentation for the Native Debugging options dialog for where to turn it on. It is available as far back as Visual C++ 6.0, though you probably want to use a modern version of the Visual C++ debugger that would be able to use modern PDB symbol files (VC++ 6.0 chokes on XP SP2’s newer PDBs, etc.).

With this debugger feature enabled, you just perform a usual Step Into on the client side call during the debugging session, and instead of being led into the low-level marshaling code generated by MIDL for the interface in question, another session of the debugger is automagically attached to the server-side process and the server-side thread is broken into at the call site of the server-side function implementation (O… M… G…) – pretty neat, don’t you think? COM folks, take notice – this stuff even works with full-fledged COM objects.

Unfortunately, Microsoft had to blow it by severely crippling this amazing debugger feature in Windows Vista. As if more excuses to dislike it were required, the debugger will no longer automatically locate the server process and attach to it on that OS. You’ll have to preattach the debugger to the server process by hand and only then will the server call be broken into when appropriate. On Vista, you can use the dbgrpc techniques discussed above to figure out which server process you should attach the Visual C++ debugger to. I also noticed the lack of the wonderful auto-attach behavior in one of my debugging sessions on an XP x64 system, although this is not mentioned in the Visual Studio documentation. What a waste!

Now on to RPC debugging across machine boundaries. Obviously, the RPC runtime will not provide us with process and thread identifiers if the call has crossed a machine boundary. For exploring this scenario, we shall modify the HELLO sample to use the named pipes transport (ncacn_np) to the remote HELLO server.

With CCALL information enabled (i.e., “Full” rather than just “Server” state information) we can see where outgoing RPC calls are headed. Unfortunately, there is a timing problem. On one hand, as soon as the server side responds, the call is completed and the CCALL entry is gone. On the other hand, if we set up a breakpoint on the client-side stub (e.g., helloc!HelloProc), the RPC runtime doesn’t yet know that a remote call is about to be made.

If we know which server the outgoing call is headed to, we can break the server-side and thus make the client-side call block while waiting for the server to respond. In this state, we can examine CCALL information. First let’s set up the break on the server:

C:\Program Files\Debugging Tools for Windows>cdb E:\Projects\hello\hellos.exe
Microsoft (R) Windows Debugger Version 6.9.0003.113 X86
Copyright (c) Microsoft Corporation. All rights reserved.
CommandLine: E:\Projects\hello\hellos.exe
Symbol search path is: C:\WINDOWS\Symbols;SRV*E:\SymStore*http://referencesource.microsoft.com/symbols;SRV*E:\SymStore*http://msdl.microsoft.com/download/symbols
Executable search path is:
ModLoad: 00400000 00455000 hellos.exe
ModLoad: 7c900000 7c9af000 ntdll.dll
ModLoad: 7c800000 7c8f6000 C:\WINDOWS\system32\kernel32.dll
ModLoad: 77e70000 77f02000 C:\WINDOWS\system32\RPCRT4.dll
ModLoad: 77dd0000 77e6b000 C:\WINDOWS\system32\ADVAPI32.dll
ModLoad: 77fe0000 77ff1000 C:\WINDOWS\system32\Secur32.dll
(1320.1304): Break instruction exception - code 80000003 (first chance)
eax=00241eb4 ebx=7ffdf000 ecx=00000007 edx=00000080 esi=00241f48 edi=00241eb4
eip=7c90120e esp=0012fb20 ebp=0012fc94 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
ntdll!DbgBreakPoint:
7c90120e cc int 3
0:000> bp hellos!HelloProc
*** WARNING: Unable to verify checksum for hellos.exe
0:000> g

Then let’s have the call block on the client side by simply running the client. Now let’s examine the client call list:

C:\Program Files\Debugging Tools for Windows>dbgrpc -a
Searching for call info ...
PID CELL ID PNO IFSTART TIDNUMBER CALLID LASTTIME PS CLTNUMBER ENDPOINT
------------------------------------------------------------------------------
0428 0000.003f 0009 4b112204 0000.0000 ffffffff 00019238 09 0000.003d LRPC000004e4
0710 0000.0001 0000 7a98c250 0000.0000 00000001 00843c7a 0f 0000.0002 \pipe\hello
C:\Program Files\Debugging Tools for Windows>dbgrpc -l -P 710 -L 0.1
Getting cell info ...
Client call info
Procedure number: 0
Interface UUID start (first DWORD only): 7A98C250
Call ID: 0x1 (1)
Calling thread identifier: 0x0.0
Call target identifier: 0x0.2
Call target endpoint: \pipe\hello
C:\Program Files\Debugging Tools for Windows>dbgrpc -l -P 710 -L 0.2
Getting cell info ...
Call target info
Protocol Sequence: NMP
Last update time (in seconds since boot):8666.234 (0x21DA.EA)
Target server is: darkstar
C:\Program Files\Debugging Tools for Windows>

Notice how the CCALL information cell is associated with a target information cell containing the name of the remote host servicing the call. If we were unsure which remote calls were being made, we could extract the actual interface calls from the CCALL information entry (alternatively, a network protocol analyzer understanding MSRPC, such as Wireshark or Microsoft Network Monitor, could be used).
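To make the mapping from the listing to the follow-up commands explicit, here is a minimal Python sketch. It is not part of dbgrpc, and the helper names are made up for illustration. It takes one row of the client call listing (with the console line wrap undone), splits it into the columns named in the header, and rebuilds the two `dbgrpc -l` invocations used above. Two things are assumptions drawn only from the numbers in this transcript: that cell IDs are hexadecimal, and that LASTTIME is a hexadecimal count of milliseconds since boot.

```python
# Column names copied from the "dbgrpc -a" header shown above.
CCALL_COLUMNS = ["PID", "CELL ID", "PNO", "IFSTART", "TIDNUMBER",
                 "CALLID", "LASTTIME", "PS", "CLTNUMBER", "ENDPOINT"]


def parse_ccall_row(row):
    """Split one unwrapped row of the client call listing into a dict."""
    return dict(zip(CCALL_COLUMNS, row.split()))


def cell_arg(cell):
    """Turn a cell ID such as '0000.0002' into the '0.2' form passed to -L.

    Assumption: both halves of the cell ID are hexadecimal.
    """
    page, index = cell.split(".")
    return "%x.%x" % (int(page, 16), int(index, 16))


def follow_up_commands(call):
    """Build the 'dbgrpc -l' commands for the call cell and its target cell."""
    pid = "%x" % int(call["PID"], 16)
    return ["dbgrpc -l -P %s -L %s" % (pid, cell_arg(call[key]))
            for key in ("CELL ID", "CLTNUMBER")]


def last_time_seconds(call):
    """Assumption: LASTTIME is a hex millisecond count since boot."""
    return int(call["LASTTIME"], 16) / 1000.0


row = r"0710 0000.0001 0000 7a98c250 0000.0000 00000001 00843c7a 0f 0000.0002 \pipe\hello"
call = parse_ccall_row(row)
for command in follow_up_commands(call):
    print(command)
print(last_time_seconds(call))
```

For the row above, the LASTTIME value 00843c7a comes out as 8666.234 seconds, which agrees with the "Last update time" that dbgrpc prints for the target cell.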

Now let’s see how a call from a remote client appears on the server end. We’ll wait for our breakpoint on the server-side stub to fire. At this point we’d have a SCALL entry to consider:

0:000> g
Breakpoint 0 hit
eax=004010f0 ebx=0055fd54 ecx=00000000 edx=00145700 esi=0055f950 edi=0055f92c
eip=004010f0 esp=0055f92c ebp=0055f940 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
hellos!HelloProc:
004010f0 55 push ebp
0:001> |
. 0 id: 6dc create name: hellos.exe
C:\Program Files\Debugging Tools for Windows>dbgrpc -c -P 6dc
Searching for call info ...
PID CELL ID ST PNO IFSTART THRDCELL CALLFLAG CALLID LASTTIME CONN/CLN
----------------------------------------------------------------------------
06dc 0000.0004 02 000 7a98c250 0000.0002 00000001 00000001 009b0f9e 0000.0003

Notice how instead of the traditional process and thread identifiers, we have what appears to be a cell ID as the caller. Let’s see what information cells 3 and 4 contain:

C:\Program Files\Debugging Tools for Windows>dbgrpc -l -P 6dc -L 0.4
Getting cell info ...
Call
Status: Dispatched
Procedure Number: 0
Interface UUID start (first DWORD only): 7A98C250
Call ID: 0x1 (1)
Servicing thread identifier: 0x0.2
Call Flags: cached
Last update time (in seconds since boot):10162.78 (0x27B2.4E)
Owning connection identifier: 0x0.3
C:\Program Files\Debugging Tools for Windows>dbgrpc -l -P 6dc -L 0.3
Getting cell info ...
Connection
Connection flags: Exclusive
Authentication Level: Default
Authentication Service: None
Last Transmit Fragment Size: 49 (0x1002050)
Endpoint for the connection: 0x0.1
Last send time (in seconds since boot):10162.78 (0x27B2.4E)
Last receive time (in seconds since boot):10162.78 (0x27B2.4E)
Getting endpoint info ...
Process object for caller is 0xA14

Notice that the connection cell contains the remote PID of the caller, 0xA14.

0:001> |
. 0 id: a14 attach name: E:\Projects\hello\helloc.exe
0:001>

Unfortunately, the thread identifier is missing so you’ll have to use CCALL information on the client for that. Even more tragically, dbgrpc fails to report the name of the remote caller’s machine! You know it’s PID 0xA14, you just don’t know on what machine… You’ll have to make an educated guess, perhaps with the assistance of a network protocol analyzer.

Occasionally we won’t be in a situation that allows for breaking the server-side to facilitate blocking the client-side call for CCALL information examination. In such cases, we’ll want to break the client-side right after debug information for the call has been registered, but before the call has been sent to the server for completion. The various RPC transports utilize the CCALL::SetDebugClientCallInformation function for this purpose. Let’s see what happens when we break on it, let it do the registration and examine the CCALL table:

0:000> bp rpcrt4!CCALL::SetDebugClientCallInformation
0:000> g
Breakpoint 0 hit
eax=0012faa8 ebx=00000000 ecx=001450a8 edx=00000000 esi=001450a8 edi=0012fb3c
eip=77ec44de esp=0012fa68 ebp=0012fab8 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
RPCRT4!CCALL::SetDebugClientCallInformation:
77ec44de 8bff mov edi,edi
0:000> k
ChildEBP RetAddr
0012fa64 77ea7b73 RPCRT4!CCALL::SetDebugClientCallInformation
0012fab8 77e808d0 RPCRT4!OSF_CCALL::FastSendReceive+0x72
0012fad4 77e80e1f RPCRT4!OSF_CCALL::SendReceiveHelper+0x58
0012fb00 77e7a326 RPCRT4!OSF_CCALL::SendReceive+0x41
0012fb0c 77e7a357 RPCRT4!I_RpcSendReceive+0x24
0012fb20 77ef3675 RPCRT4!NdrSendReceive+0x2b
*** WARNING: Unable to verify checksum for helloc.exe
0012fefc 004011b6 RPCRT4!NdrClientCall2+0x222
0012ff10 004010c5 helloc!HelloProc+0x16
0012ff6c 004020d7 helloc!main+0xc5
0012ffb8 00401faf helloc!__tmainCRTStartup+0x117
0012ffc0 7c816ff7 helloc!mainCRTStartup+0xf
0012fff0 00000000 kernel32!BaseProcessStart+0x23
0:000> gu
eax=00000000 ebx=00000000 ecx=00000002 edx=0000b10e esi=001450a8 edi=0012fb3c
eip=77ea7b73 esp=0012fa88 ebp=0012fab8 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
RPCRT4!OSF_CCALL::FastSendReceive+0x72:
77ea7b73 3bc3 cmp eax,ebx
0:000>
C:\Program Files\Debugging Tools for Windows>dbgrpc -a
Searching for call info ...
PID CELL ID PNO IFSTART TIDNUMBER CALLID LASTTIME PS CLTNUMBER ENDPOINT
------------------------------------------------------------------------------
0428 0000.003f 0009 4b112204 0000.0000 ffffffff 00019238 09 0000.003d LRPC000004e4
0504 0000.0001 0000 7a98c250 0000.0000 001440c8 00b10eb8 00 0000.0002

Oops… notice how the name of the endpoint is missing from the CCALL entry at this point! With some disassembly (left as an exercise for the reader) it is clear the caller copies the endpoint name into the debug information buffer right after setting up the entry:
0:000> bp rpcrt4!CCALL::SetDebugClientCallInformation
0:000> g
Breakpoint 0 hit
eax=0012faa8 ebx=00000000 ecx=001450a8 edx=00000000 esi=001450a8 edi=0012fb3c
eip=77ec44de esp=0012fa68 ebp=0012fab8 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
RPCRT4!CCALL::SetDebugClientCallInformation:
77ec44de 8bff mov edi,edi
0:000> gu
eax=00000000 ebx=00000000 ecx=00000002 edx=0000bb8b esi=001450a8 edi=0012fb3c
eip=77ea7b73 esp=0012fa88 ebp=0012fab8 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
RPCRT4!OSF_CCALL::FastSendReceive+0x72:
77ea7b73 3bc3 cmp eax,ebx
0:000> bp rpcrt4!strncpy
0:000> g
Breakpoint 1 hit
eax=00350034 ebx=00000001 ecx=0012fa79 edx=00000000 esi=001450a8 edi=0000000c
eip=77e952a0 esp=0012fa6c ebp=0012fab8 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
RPCRT4!strncpy:
77e952a0 ff252813e777 jmp dword ptr [RPCRT4!_imp__strncpy (77e71328)] ds:0023:77e71328={ntdll!strncpy (7c902c80)}
0:000> t
eax=00350034 ebx=00000001 ecx=0012fa79 edx=00000000 esi=001450a8 edi=0000000c
eip=7c902c80 esp=0012fa6c ebp=0012fab8 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
ntdll!strncpy:
7c902c80 8b4c240c mov ecx,dword ptr [esp+0Ch] ss:0023:0012fa78=0000000c
0:000> gu
eax=00350034 ebx=00000001 ecx=00000000 edx=006f6c6c esi=001450a8 edi=0000000c
eip=77ea7be8 esp=0012fa70 ebp=0012fab8 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
RPCRT4!OSF_CCALL::FastSendReceive+0xe7:
77ea7be8 83c40c add esp,0Ch
0:000>
C:\Program Files\Debugging Tools for Windows>dbgrpc -a
Searching for call info ...
PID CELL ID PNO IFSTART TIDNUMBER CALLID LASTTIME PS CLTNUMBER ENDPOINT
------------------------------------------------------------------------------
0428 0000.003f 0009 4b112204 0000.0000 ffffffff 00019238 09 0000.003d LRPC000004e4
07b4 0000.0001 0000 7a98c250 0000.0000 00000001 00bb8b2b 00 0000.0002 \pipe\hello

Ahh, that’s better. But if we examine the server name in the CCALL cell, we see it hasn’t yet been initialized. We need another round of strncpy for that. If we dig further into the transport code, we figure out that it would be better to break right before the function call dispatching the data to the server side. For instance, in the case of the named pipe transport, this would be the call to RPCRT4!OSF_CCALL::SendNextFragment from RPCRT4!OSF_CCALL::FastSendReceive. If we are using the LPC transport instead, other transport functions will be involved. To summarize – breaking the call after CCALL information has been completely registered but before it has been sent to the server is not so easy and is highly transport dependent. However, it is indeed quite possible if your scenario requires it.

And so the RPC debugging primer comes to a conclusion. It is a messy ordeal, yet so much cooler than stepping through yet another SOAP web service in Visual Studio, isn’t it? :-)

Windbg 6.9.3.113 released

A new version of the Debugging Tools for Windows appeared, quietly as usual, on Microsoft’s web site a few days ago.

Unfortunately, the debugging symbols package for Windows XP SP3 is still MIA, presumably being delayed along with widespread SP3 availability on the Download Center and Windows Update. My local symbol store has grown quite obese with all the SP2 patches over the years, so I’m looking forward to cleaning things up once that’s available.

Nothing too exciting in the RELNOTES.TXT for this release. Integrated managed debugging remains dysfunctional, so the trusty 6.7.5.0 remains in place for that. I can’t even get SOS to break on an application’s Main method. The most exciting change is a set of enhancements to the “dt” command.

Yawn.

Deploying the Visual C++ libraries with an NSIS installer

Beginning with Visual C++ 2005 and continuing into Visual C++ 2008 and the foreseeable future, Microsoft’s various runtime libraries (CRT, ATL, MFC, etc.) are no longer installed into the system32 directory on Windows XP and later, but are rather “side-by-side assemblies” that need to be installed into the side-by-side store, “WinSxS”, in order to be available to all applications.

I’ve discussed the SxS store and the API Microsoft has documented for managing it in a previous post. Nevertheless, at the request of the NSIS maintainer, kichik, I’ll provide some guidance on the issue of runtime deployment and concrete examples to authors of NSIS-based installations. Do keep in mind that I am not adept at authoring NSIS installers and questions beyond the realm of the matter at hand are best targeted at the NSIS forum.

Unlike in the Linux world, where the C runtime library is considered an operating system component and versions of it are never installed by applications (at worst, some proprietary application is linked against an antique version of libc and requires the system administrator to install a compatibility package provided by the distribution), the CRT situation on Windows is more complicated. In the days of yore, Windows NT provided the now long defunct CRTDLL.DLL. Later, the newer variant MSVCRT.DLL shipped with Visual C++, up to and including the 6.0 release. However, in addition to serving as the runtime of a specific Visual C++ version, MSVCRT.DLL doubles as the “OS CRT”, the version of the C++ runtime deployed with the OS as far back as NT 4.0 and going into Windows Vista. Components included with the operating system itself, such as Notepad and Calculator, are linked against this CRT dynamically. Do not let the identical moniker fool you: the CRTs included with the various NT releases diverge significantly, sporting, for example, a brand new exception handling runtime in Windows Vista, aligned with newer Visual C++ compilers.

The existence of several MSVCRT.DLL variants and the associated versioning issues are probably what led Microsoft to adopt a policy of strongly versioned CRTs beginning with the Visual C++ .NET (2002) release. MSVCR70.DLL was the runtime required by the output of that product, and later versions would require deploying MSVCR71.DLL, MSVCR80.DLL and most recently MSVCR90.DLL. In addition to the CRT itself, there are also the various peripheral libraries that some applications may depend on, such as ATL and MFC.

I’ve discussed in the past an approach utilizing the Windows Driver Kit build environment that allows combining a modern C++ compiler with targeting the Visual C++ 6.0 / OS CRT, MSVCRT.DLL. It is for the brave who don’t mind getting their hands dirty and whose desire to target the broadly deployed runtime exceeds the fear of the plethora of potential version compatibility issues such an application configuration can cause.

For the more conservative lot, the question remains: how do I get the new C++ runtimes onto my end-user’s machine? The first approach is static linking. It should be avoided at nearly all costs, as it results in obese executables that are unable to share the runtime’s memory pages with other running processes, and it is completely unserviceable by Microsoft when a security update or another bug fix needs to be broadly deployed to all users of the runtime libraries.

We therefore turn our attention to approaches based on dynamic linking. First of all, the reader should review the official guidance provided by the Visual C++ team on the matter, although he or she may not like what they read. To summarize, Microsoft officially supports the following deployment methods:

  • Use an MSI-based (Windows Installer) installation procedure and utilize their MSM merge modules to include whatever runtime components you require with your application. The MSMs are black box magic that will get those runtime libraries into the “winsxs” store without asking too many questions. If you don’t like those massively complicated MSI installers and the WiX XML schemas make your head spin, that’s too bad.
  • Use the obese VCRedist.exe for the target architecture, without the benefit of picking and choosing only those runtime components that are of interest for your specific application.
  • Deploy the runtime libraries as files in your application’s directory, or “private assemblies” in SxS nomenclature, wasting the end-user’s hard disk space with multiple copies. This is not as bad as it seems, since at least SxS redirection policies can make an updated, security-patched version from the “winsxs” store be loaded in place of the out-of-date version deployed privately with the application, unlike with classic non-SxS local DLLs or with static linking.

As the popularity of NSIS as an installation apparatus shows, not everyone is willing to be strong-armed into an MSI-based installation just yet. So how do CRT-deploying installers address this acute issue? I was disappointed, but not surprised, to see that VLC, DivX and various other applications with NSIS-based installers opt for the “private assembly” approach, simplifying life for the installation author but needlessly wasting end-user disk real estate.

The now documented SxS API provides an alternative approach, presumably supported by Microsoft for deploying SxS assemblies in general (such as your own) but not specifically by the Visual C++ folks for theirs. The motivation for this lack of support is unclear, since the end result is as serviceable by them as using Windows Installer merge modules. Nevertheless, it is something that those who follow this path should be aware of.

OK, so let’s get on with it. Unlike with system32, we can’t just waltz into winsxs and drop our assembly’s files there. The directory structure is complicated, differs between XP and Vista, and in fact the ACL on the directory in Vista won’t allow anyone but TrustedInstaller (i.e., MSI) to touch it. Therefore we are required to perform the installation through the SxS API, which provides a COM-based interface for manipulating the store.

For illustration purposes, I shall use the Visual C++ 2005 (8.0) Debug CRT. Note that this is not the CRT you want to deploy to your end users, and in any case is explicitly NOT redistributable by Microsoft’s license terms. I use it for illustrative convenience since my XP virtual machine doesn’t have this assembly. We’ll use an NSIS installer script to drive the wonderful though peculiar System plug-in and get it to invoke the SxS API. Note that elaborate error handling is omitted for brevity. So here we go:


Name "NSIS SxS Test"
OutFile "nsissxs.exe"
SetPluginUnload alwaysoff
ShowInstDetails show
XPStyle on
SetCompressor /SOLID lzma
InstallDir $PROGRAMFILES\NSISSxS


!define FUSION_REFCOUNT_UNINSTALL_SUBKEY_GUID {8cedc215-ac4b-488b-93c0-a50a49cb2fb8}

Section "Uninstall"
DeleteRegKey "HKLM" "Software\Microsoft\Windows\CurrentVersion\Uninstall\nsissxs"
Delete $INSTDIR\uninst.exe
Delete $INSTDIR\dummy.txt
RMDir $INSTDIR
DetailPrint "Removing DebugCRT assembly..."
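# CreateAssemblyCache(&pCache, 0): $0 receives the IAssemblyCache pointer, $1 the HRESULT
System::Call "sxs::CreateAssemblyCache(*i .r0, i 0) i.r1"
StrCmp $1 0 0 fail
# Same FUSION_INSTALL_REFERENCE as the one built in the install section
System::Call "*(i 32, i 0, i 2364391957, i 1217113163, i 178634899, i 3090139977, w 'nsissxs', w '') i.s"
Pop $2
# IAssemblyCache::UninstallAssembly(0, strongName, fir, &disposition) at vtable slot 3
System::Call "$0->3(i 0, w 'Microsoft.VC80.DebugCRT,version=$\"8.0.50727.762$\",type=$\"win32$\",processorArchitecture=$\"x86$\",publicKeyToken=$\"1fc8b3b9a1e18e3b$\"', i r2, *i . r3) i.r1"
# Free the reference structure, as the install section does
System::Free $2
StrCmp $1 0 0 fail2
DetailPrint "Disposition returned is $3"
# IUnknown::Release at vtable slot 2
System::Call "$0->2()"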
Goto end
fail:
DetailPrint "CreateAssemblyCache failed."
DetailPrint $1
Goto end
fail2:
DetailPrint "UninstallAssembly failed."
DetailPrint $1
Goto end
end:
SectionEnd

Section
SetOutPath $INSTDIR
File "dummy.txt"
WriteUninstaller $INSTDIR\uninst.exe
WriteRegStr "HKLM" "Software\Microsoft\Windows\CurrentVersion\Uninstall\nsissxs" "DisplayName" "NSIS SxS Test"
WriteRegStr "HKLM" "Software\Microsoft\Windows\CurrentVersion\Uninstall\nsissxs" "UninstallString" "$INSTDIR\uninst.exe"
InitPluginsDir
SetOutPath $PLUGINSDIR
File "msvcm80d.dll"
File "msvcp80d.dll"
File "msvcr80d.dll"
File "x86_Microsoft.VC80.DebugCRT_1fc8b3b9a1e18e3b_8.0.50727.762_x-ww_5490cd9f.cat"
File "x86_Microsoft.VC80.DebugCRT_1fc8b3b9a1e18e3b_8.0.50727.762_x-ww_5490cd9f.manifest"

DetailPrint "Installing DebugCRT assembly..."
System::Call "sxs::CreateAssemblyCache(*i .r0, i 0) i.r1"
StrCmp $1 0 0 fail
# Fill a FUSION_INSTALL_REFERENCE.
# fir.cbSize = sizeof(FUSION_INSTALL_REFERENCE) == 32
# fir.dwFlags = 0
# fir.guidScheme = FUSION_REFCOUNT_UNINSTALL_SUBKEY_GUID
# fir.szIdentifier = "nsissxs"
# fir.szNonCanonicalData = 0
System::Call "*(i 32, i 0, i 2364391957, i 1217113163, i 178634899, i 3090139977, w 'nsissxs', w '') i.s"
Pop $2
# IAssemblyCache::InstallAssembly(0, manifestPath, fir)
System::Call "$0->7(i 0, w '$PLUGINSDIR\x86_Microsoft.VC80.DebugCRT_1fc8b3b9a1e18e3b_8.0.50727.762_x-ww_5490cd9f.manifest', i r2) i.r1"
System::Free $2
StrCmp $1 0 0 fail2
System::Call "$0->2()"
Goto end
fail:
DetailPrint "CreateAssemblyCache failed."
DetailPrint $1
Goto end
fail2:
DetailPrint "InstallAssembly failed."
DetailPrint $1
Goto end
end:
SectionEnd

If you are not familiar with NSIS script syntax, now would be a good time to get acquainted. Let us review the contents of the 2nd section, which is the install section. The dummy file is a placeholder for the actual files your installer wants to deploy. Next, we set up an Uninstall entry in the registry as one usually would. Now on to the interesting part.

In order to deploy a SxS assembly, we must place its DLLs together in a temporary directory created by the installer. Note that if an assembly contains several DLLs, we cannot pick and choose only those that our application links with. The assembly is deployed, versioned and bound as a whole. We can figure out which files are part of the assembly in question by reviewing the assembly manifest, which we’ll find installed into the Manifests subdirectory of the WinSxS store on a Windows XP system. If we review the Debug CRT’s manifest, we can see <file> nodes under the <assembly> node, each referencing one of the files that must be deployed with the assembly. You can find the actual assembly files under the subdirectory with the assembly’s strong name in the WinSxS store.
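If you would rather not eyeball the manifest, a few lines of Python can list the files it references. This is only a sketch: the sample manifest below is a trimmed stand-in modeled on the Debug CRT manifest described above (the real one carries additional attributes and hash data), and the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

# XML namespace used by side-by-side assembly manifests.
ASM_NS = "urn:schemas-microsoft-com:asm.v1"

# Trimmed, illustrative stand-in for the Debug CRT manifest.
SAMPLE_MANIFEST = """
<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
  <assemblyIdentity type="win32" name="Microsoft.VC80.DebugCRT"
      version="8.0.50727.762" processorArchitecture="x86"
      publicKeyToken="1fc8b3b9a1e18e3b"/>
  <file name="msvcr80d.dll"/>
  <file name="msvcp80d.dll"/>
  <file name="msvcm80d.dll"/>
</assembly>
"""


def assembly_files(manifest_xml):
    """Return the names of the <file> nodes directly under <assembly>."""
    root = ET.fromstring(manifest_xml.strip())
    return [node.get("name") for node in root.findall("{%s}file" % ASM_NS)]


for name in assembly_files(SAMPLE_MANIFEST):
    print(name)
```

Every file listed this way has to be placed next to the manifest and catalog before the install call is made.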

In addition to the DLL files themselves, the assembly manifest and the assembly signing catalog are an integral part of the assembly. The catalog ensures the integrity of the assembly and is a welcome feature over traditional DLL deployment.

With the DLLs, assembly manifest and catalog in place, we are ready to invoke the SxS API for assembly installation. First, we call CreateAssemblyCache to retrieve the IAssemblyCache interface for managing the SxS store. Note that in the context of an NSIS installer, COM has already been initialized (for STA use) at this point, but if you are making a custom installer in another environment you may have to take care of that before reaching this point.

Assuming all goes well, the next phase is setting up the FUSION_INSTALL_REFERENCE structure that will describe our assembly installation. Typically, you’ll want the reference to be associated with the registry Uninstall key for your application. Besides, other reference types do not seem to work too well and the documentation doesn’t err on the side of verbosity.
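The cbSize value of 32 used in the script can be checked with a little arithmetic. The sketch below describes the structure's fields with Python's struct format codes, assuming the layout implied by the script's comments: two DWORDs, a 16-byte GUID and two string pointers. The 64-bit figure is only what the same arithmetic gives with 8-byte pointers; the installer in this post is a 32-bit process.

```python
import struct

# cbSize, dwFlags, guidScheme (16 bytes), szIdentifier, szNonCanonicalData.
# "I" is a 32-bit DWORD. The two string members are pointers, so their
# width depends on the target: "I" for 32-bit, "Q" for 64-bit.
LAYOUT_32 = "<II16sII"
LAYOUT_64 = "<II16sQQ"

print(struct.calcsize(LAYOUT_32))
print(struct.calcsize(LAYOUT_64))
```

The first number is the 32 that the script stores in the first field of the structure.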

The not-so-seasoned NSIS scripter that I am, I couldn’t figure out a more legible way to specify the GUID argument to the InstallAssembly invocation so I broke down its components by hand. Counting vtable indices including the IUnknown and IAssemblyCache interfaces, InstallAssembly is at vtable slot 7. After the install reference structure is set up and the method invoked, we hope for the best.
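For anyone who wants to reproduce the four integers without a calculator, here is a small Python sketch. It assumes the usual Windows in-memory GUID layout (the first three fields little-endian, the last eight bytes as written) and reads the 16 bytes back as four little-endian 32-bit values, which is how the script feeds them to the System plug-in. The function name is made up for illustration.

```python
import struct
import uuid


def guid_to_dwords(guid_text):
    """Return a GUID's in-memory bytes as four little-endian 32-bit integers."""
    raw = uuid.UUID(guid_text).bytes_le  # Windows in-memory GUID layout
    return struct.unpack("<4I", raw)


# FUSION_REFCOUNT_UNINSTALL_SUBKEY_GUID as defined at the top of the script.
print(guid_to_dwords("{8cedc215-ac4b-488b-93c0-a50a49cb2fb8}"))
```

The result is the same sequence that appears in the script: 2364391957, 1217113163, 178634899, 3090139977.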

Assuming a successful install transaction, we proceed to call the IUnknown method Release (vtable slot 2) to free the SxS cache manager and deem our install sequence completed.

We now turn our attention to the reverse sequence in the Uninstall section of the illustration installer. Being good citizens of the Windows ecosystem, we remove our reference to the shared assembly when the end-user removes our application from their system. WinSxS manages assembly reference counting and will figure out whether the assembly files should actually be removed from the disk.

We create an IAssemblyCache interface instance as before but this time call UninstallAssembly to remove our reference. This is the first method of the interface but is preceded by the IUnknown members and is thus at vtable slot 3. Following a successful invocation we can examine the returned Disposition value if it is of interest and proceed to free the instance.
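The slot counting is easy to get wrong by one, so here it is spelled out as a Python sketch. The three IUnknown methods come first, followed by the IAssemblyCache methods in the order shown by the vtable dump later in this post. The name "Reserved" for slot 6 is a placeholder of mine, since the dump resolves that slot to an unrelated symbol.

```python
# IUnknown methods always occupy the first three vtable slots.
IUNKNOWN = ["QueryInterface", "AddRef", "Release"]

# IAssemblyCache methods in vtable order ("Reserved" is a placeholder name).
IASSEMBLYCACHE = ["UninstallAssembly", "QueryAssemblyInfo",
                  "CreateAssemblyCacheItem", "Reserved", "InstallAssembly"]

VTABLE = IUNKNOWN + IASSEMBLYCACHE


def slot(method):
    """Zero-based vtable slot, as used in the System::Call '$0->N(...)' syntax."""
    return VTABLE.index(method)


for name in ("Release", "UninstallAssembly", "InstallAssembly"):
    print(name, slot(name))
```

These are the 2, 3 and 7 that appear in the `$0->2()`, `$0->3(...)` and `$0->7(...)` calls of the script.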

Note that we remove an assembly by its full, strong name and not by path. You can figure out the assembly’s strong name from its manifest.
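As a sketch of how the name string is put together, the following Python helper (a made-up name, not part of any SDK) formats the identity attributes in the same order and spelling that the uninstall call in the script uses. The attribute values are the ones from the Debug CRT example.

```python
def strong_name(name, version, arch, public_key_token, type_="win32"):
    """Format an assembly identity the way the uninstall call spells it."""
    attributes = [("version", version),
                  ("type", type_),
                  ("processorArchitecture", arch),
                  ("publicKeyToken", public_key_token)]
    return name + "".join(',%s="%s"' % pair for pair in attributes)


print(strong_name("Microsoft.VC80.DebugCRT", "8.0.50727.762",
                  "x86", "1fc8b3b9a1e18e3b"))
```

In the NSIS script each double quote inside the string has to be written as `$\"`, which is why the call there looks noisier than the plain string.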

OK, installing the VC8 Debug CRT was easy enough. Note that other libraries (ATL, etc.) you’ll want to install may have dependencies on other assemblies, so make sure you get your install sequence in order.

Installing the Visual C++ 2005 runtime is nice and all, but somehow it just feels wrong installing obsolete software. I turned my attention to the Visual C++ 2008 libraries, encountering disappointing results.

I gave it a few shots but installing the Visual C++ 2008 Debug CRT always fails, InstallAssembly promptly returning an HRESULT containing ERROR_SXS_PROTECTION_CATALOG_NOT_VALID. A malformed catalog? In one of Microsoft’s very own assemblies? Say it ain’t so!
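When the failure shows up in the installer log as a bare number, it helps to split the HRESULT into its fields before looking the code up in winerror.h. The Python sketch below does that, assuming the installer prints the value as a signed 32-bit decimal. The sample value is the well-known access-denied HRESULT, used purely as a stand-in; the numeric value of the catalog error is deliberately not hard-coded here.

```python
FACILITY_WIN32 = 7  # facility used by HRESULTs that wrap a Win32 error code


def decode_hresult(value):
    """Split a (possibly negative) 32-bit HRESULT into its standard fields."""
    value &= 0xFFFFFFFF  # reinterpret a signed decimal as unsigned
    return {
        "hex": "0x%08X" % value,
        "failed": bool(value >> 31),         # severity bit
        "facility": (value >> 16) & 0x1FFF,  # 7 means a wrapped Win32 error
        "code": value & 0xFFFF,              # the Win32 error code itself
    }


# Stand-in example: E_ACCESSDENIED, i.e. ERROR_ACCESS_DENIED (5) wrapped
# in FACILITY_WIN32.
print(decode_hresult(-2147024891))
```

If the facility comes back as 7, the "code" field is the Win32 error to search for among the ERROR_SXS_* definitions.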

If you have the hots for deploying the newer runtime, you’ll have to figure that one out on your own, folks. I made sure the catalog for the Visual C++ 9.0 Debug CRT I picked up from the WinSxS store matches the catalog file found in the MSI merge module (MSM) at C:\Program Files\Common Files\Merge Modules\Microsoft_VC90_DebugCRT_x86.msm by extracting the MSM’s files with the useful MSIX extractor. The catalog files matched, and in any case the SHA-1 hashes of the assembly files matched the catalog that InstallAssembly rejected. Mysterious.
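For the file comparison itself, something along these lines is enough. This is a generic Python sketch, not the exact procedure used above: it hashes two files with SHA-1 and reports whether their contents are identical, and it says nothing about whether a catalog's signature is valid. The function names are made up for illustration.

```python
import hashlib


def sha1_of(path, chunk_size=65536):
    """Return the SHA-1 digest of a file as a hex string."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so large DLLs do not have to fit in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both files hash to the same SHA-1 value."""
    return sha1_of(path_a) == sha1_of(path_b)
```

A call such as `same_content(store_copy, msm_copy)`, with the two paths pointing at the catalog taken from the WinSxS store and the one extracted from the MSM, would confirm that the two copies are byte-for-byte the same.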

Reviewing the Windows event log following this error didn’t help too much. The System log was now decorated with SideBySide Event ID 20, stating: “The manifest C:\WINDOWS\WinSxS\InstallTemp\160585\Manifests\x86_Microsoft.VC90.DebugCRT_1fc8b3b9a1e18e3b_9.0.21022.8_x-ww_597c3456.Manifest does not match its source catalog or the catalog is missing.” … No newsflash there.

I figured I’d stick to VC8 support and leave the VC9 troubleshooting for later. If push comes to shove, one can always figure out how to write WiX installers. :)

Instead, I opted to review whether this NSIS-based installation approach is compatible with Windows Vista. I was worried that with the restrictive ACLs on the WinSxS store in that OS, without being an MSI and running from the context of the mighty TrustedInstaller.exe process, the installation would surely fail.

I was therefore pleasantly surprised when the test installer worked on Vista. I knew the elevated installer executable ran as Administrator, yet even an Administrator is not supposed to be able to copy files into WinSxS:

C:\Users\User\Desktop>cacls C:\Windows\WinSxS
C:\Windows\winsxs NT SERVICE\TrustedInstaller:(OI)(CI)F
BUILTIN\Administrators:(OI)(CI)R
NT AUTHORITY\SYSTEM:(OI)(CI)R
BUILTIN\Users:(OI)(CI)R

There was no denying it. Users and Administrators alike have read-only access to the store, and only the TrustedInstaller service can actually modify it. I opted to run the installer once again, this time in Windbg, tracing the operation of the SxS API to figure out what was happening behind the scenes.


0:000> sxe ld:sxs
0:000> g
ModLoad: 75500000 7555f000 C:\Windows\system32\sxs.dll
eax=1000162a ebx=00000000 ecx=15bf8bb6 edx=00000007 esi=7ffdd000 edi=20000000
eip=76f99a94 esp=0278f620 ebp=0278f664 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
ntdll!KiFastSystemCallRet:
76f99a94 c3 ret
0:002> bp sxs!CreateAssemblyCache
0:002> g
Breakpoint 0 hit
eax=00000000 ebx=002ceda8 ecx=002cedc8 edx=002ce990 esi=002ceda8 edi=00000000
eip=7554a3aa esp=0278fd54 ebp=0278fd6c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
sxs!CreateAssemblyCache:
7554a3aa 8bff mov edi,edi
0:002> kb 2
*** WARNING: Unable to verify checksum for C:\Users\User\AppData\Local\Temp\nspBB71.tmp\System.dll
ChildEBP RetAddr Args to Child
0278fd50 100024b5 00291a98 00000000 1000162a sxs!CreateAssemblyCache
0278fd6c 1000168d 002ceda8 00000000 75bbc780 System+0x24b5

We know from MSDN that CreateAssemblyCache returns the IAssemblyCache pointer through the first, out parameter. We expect a well-behaved caller to pass in storage initialized to zero, and the storage to contain the newly instantiated interface after the function returns:

0:002> dps 00291a98 L1
00291a98 00000000
0:002> gu
eax=00000000 ebx=002ceda8 ecx=00000000 edx=00000008 esi=002ceda8 edi=00000000
eip=100024b5 esp=0278fd60 ebp=0278fd6c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
System+0x24b5:
100024b5 a340400010 mov dword ptr [System+0x4040 (10004040)],eax ds:0023:10004040={sxs!CreateAssemblyCache (7554a3aa)}
0:002> dps 00291a98 L1
00291a98 00291c88

The first pointer-sized value in the storage that an interface pointer refers to is a pointer to the member function vtable. We verify this and follow the vtable to examine where the implementations of the various members reside:

0:002> dps 00291c88 L1
00291c88 755036f0 sxs!CAssemblyCache::`vftable'
0:002> dps 755036f0 L10
755036f0 75549ac4 sxs!CAssemblyCache::QueryInterface
755036f4 7550de6b sxs!CAssemblyCache::AddRef
755036f8 7554a355 sxs!CAssemblyCache::Release
755036fc 75549d35 sxs!CAssemblyCache::UninstallAssembly
75503700 75549b15 sxs!CAssemblyCache::QueryAssemblyInfo
75503704 7554a219 sxs!CAssemblyCache::CreateAssemblyCacheItem
75503708 755542f1 sxs!XMLParser::SetFlags
7550370c 75549e91 sxs!CAssemblyCache::InstallAssembly
75503710 7554a4d4 sxs!CAssemblyName::QueryInterface
75503714 7550de6b sxs!CAssemblyCache::AddRef
75503718 7554ad69 sxs!CAssemblyName::Release
7550371c 7554a525 sxs!CAssemblyName::SetProperty
75503720 7554a64d sxs!CAssemblyName::GetProperty
75503724 7554a4ba sxs!CAssemblyName::Finalize
75503728 7554aaac sxs!CAssemblyName::GetDisplayName
7550372c 7554a4ad sxs!CAssemblyName::Reserved

It is clear the implementation of the InstallAssembly method is sxs!CAssemblyCache::InstallAssembly. We set up a breakpoint, proceed there and perform a high-level trace:
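The dump can be cross-checked against the slot numbers used in the NSIS script with simple pointer arithmetic. A minimal Python sketch, assuming 4-byte pointers (this is a 32-bit process) and using the vtable base address from this particular session, which will differ on other machines and runs:

```python
POINTER_SIZE = 4          # 32-bit process, as in the session above
VTABLE_BASE = 0x755036f0  # sxs!CAssemblyCache::`vftable' in this session


def slot_address(slot, base=VTABLE_BASE, pointer_size=POINTER_SIZE):
    """Address of the vtable entry for a given zero-based slot."""
    return base + slot * pointer_size


print("%08x" % slot_address(3))  # entry for UninstallAssembly
print("%08x" % slot_address(7))  # entry for InstallAssembly
```

The two addresses printed are the rows of the dump that resolve to UninstallAssembly and InstallAssembly.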

0:002> bp sxs!CAssemblyCache::InstallAssembly
0:002> g
Breakpoint 1 hit
eax=00000000 ebx=002d2668 ecx=002d2688 edx=002cdc10 esi=002d2668 edi=00000000
eip=75549e91 esp=0278fd4c ebp=0278fd6c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
sxs!CAssemblyCache::InstallAssembly:
75549e91 8bff mov edi,edi
0:002> wt -l 2 -m sxs
Tracing sxs!CAssemblyCache::InstallAssembly to return address 100024b5
16 0 [ 0] sxs!CAssemblyCache::InstallAssembly
13 0 [ 1] sxs!CFrame::CFrame
20 13 [ 0] sxs!CAssemblyCache::InstallAssembly
3 0 [ 1] sxs!CFrame::BaseEnter
11 0 [ 2] sxs!FusionpRtlPushFrame
6 11 [ 1] sxs!CFrame::BaseEnter
54 30 [ 0] sxs!CAssemblyCache::InstallAssembly
6 0 [ 1] sxs!CFrame::ClearLastError
58 36 [ 0] sxs!CAssemblyCache::InstallAssembly
10 0 [ 1] sxs!SxspTranslateReferenceFrom
13 0 [ 2] sxs!CFrame::CFrame
14 13 [ 1] sxs!SxspTranslateReferenceFrom
6 0 [ 2] sxs!CFrame::BaseEnter
70 19 [ 1] sxs!SxspTranslateReferenceFrom
9 0 [ 2] sxs!CFnTracerWin32::~CFnTracerWin32
74 28 [ 1] sxs!SxspTranslateReferenceFrom
63 138 [ 0] sxs!CAssemblyCache::InstallAssembly
4 0 [ 1] sxs!CFrame::ClearLastError
66 142 [ 0] sxs!CAssemblyCache::InstallAssembly
16 0 [ 1] sxs!SxsInstallW
13 0 [ 2] sxs!CFrame::CFrame
20 13 [ 1] sxs!SxsInstallW
6 0 [ 2] sxs!CFrame::BaseEnter
25 19 [ 1] sxs!SxsInstallW
13 0 [ 2] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
28 32 [ 1] sxs!SxsInstallW
13 0 [ 2] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
30 45 [ 1] sxs!SxsInstallW
13 0 [ 2] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
68 58 [ 1] sxs!SxsInstallW
4 0 [ 2] sxs!CFrame::ClearLastError
713 62 [ 1] sxs!SxsInstallW
72 0 [ 2] sxs!SxspExpandRelativePathToFull
797 134 [ 1] sxs!SxsInstallW
4 0 [ 2] sxs!CFrame::ClearLastError
800 138 [ 1] sxs!SxsInstallW
ModLoad: 741e0000 741ea000 C:\Windows\system32\sxsstore.dll
eax=ffffffff ebx=00000000 ecx=002e6a60 edx=00000001 esi=7ffdd000 edi=20000000
eip=76f99a94 esp=0278ed48 ebp=0278ed8c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
ntdll!KiFastSystemCallRet:
76f99a94 c3 ret

OK, so it looks like this implementation lets sxs!SxsInstallW do the actual work. We rerun the installer and this time perform a trace from that point:

0:000> sxe ld:sxs
0:000> g
ModLoad: 75500000 7555f000 C:\Windows\system32\sxs.dll
eax=1000162a ebx=00000000 ecx=0f78c21d edx=00000007 esi=7ffdc000 edi=20000000
eip=76f99a94 esp=026ef620 ebp=026ef664 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
ntdll!KiFastSystemCallRet:
76f99a94 c3 ret
0:002> bp sxs!SxsInstallW
0:002> g
Breakpoint 0 hit
eax=026efce8 ebx=0032db90 ecx=026efd08 edx=0032dbac esi=00332a08 edi=026efd44
eip=755475ad esp=026efcd4 ebp=026efd48 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
sxs!SxsInstallW:
755475ad 8bff mov edi,edi
0:002> wt -l 2 -m sxs
Tracing sxs!SxsInstallW to return address 75549f41
16 0 [ 0] sxs!SxsInstallW
13 0 [ 1] sxs!CFrame::CFrame
20 13 [ 0] sxs!SxsInstallW
3 0 [ 1] sxs!CFrame::BaseEnter
11 0 [ 2] sxs!FusionpRtlPushFrame
6 11 [ 1] sxs!CFrame::BaseEnter
25 30 [ 0] sxs!SxsInstallW
10 0 [ 1] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
12 0 [ 2] sxs!CGenericBaseStringBuffer::InitializeInlineBuffer
13 12 [ 1] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
28 55 [ 0] sxs!SxsInstallW
10 0 [ 1] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
12 0 [ 2] sxs!CGenericBaseStringBuffer::InitializeInlineBuffer
13 12 [ 1] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
30 80 [ 0] sxs!SxsInstallW
10 0 [ 1] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
12 0 [ 2] sxs!CGenericBaseStringBuffer::InitializeInlineBuffer
13 12 [ 1] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
68 105 [ 0] sxs!SxsInstallW
4 0 [ 1] sxs!CFrame::ClearLastError
713 109 [ 0] sxs!SxsInstallW
15 0 [ 1] sxs!SxspExpandRelativePathToFull
13 0 [ 2] sxs!CFrame::CFrame
19 13 [ 1] sxs!SxspExpandRelativePathToFull
6 0 [ 2] sxs!CFrame::BaseEnter
22 19 [ 1] sxs!SxspExpandRelativePathToFull
29 0 [ 2] sxs!CGenericStringBufferAccessor::Attach
23 48 [ 1] sxs!SxspExpandRelativePathToFull
4 0 [ 2] sxs!CFrame::ClearLastError
29 52 [ 1] sxs!SxspExpandRelativePathToFull
13 0 [ 2] kernel32!GetFullPathNameW
36 65 [ 1] sxs!SxspExpandRelativePathToFull
31 0 [ 2] sxs!CGenericStringBufferAccessor::Detach
37 96 [ 1] sxs!SxspExpandRelativePathToFull
4 0 [ 2] sxs!CFrame::ClearLastError
41 100 [ 1] sxs!SxspExpandRelativePathToFull
68 0 [ 2] sxs!CGenericBaseStringBuffer::Win32ResizeBuffer
46 168 [ 1] sxs!SxspExpandRelativePathToFull
29 0 [ 2] sxs!CGenericStringBufferAccessor::Attach
47 197 [ 1] sxs!SxspExpandRelativePathToFull
4 0 [ 2] sxs!CFrame::ClearLastError
52 201 [ 1] sxs!SxspExpandRelativePathToFull
13 0 [ 2] kernel32!GetFullPathNameW
64 214 [ 1] sxs!SxspExpandRelativePathToFull
9 0 [ 2] sxs!CFnTracerWin32::~CFnTracerWin32
66 223 [ 1] sxs!SxspExpandRelativePathToFull
20 0 [ 2] sxs!CGenericStringBufferAccessor::~CGenericStringBufferAccessor
72 243 [ 1] sxs!SxspExpandRelativePathToFull
797 424 [ 0] sxs!SxsInstallW
4 0 [ 1] sxs!CFrame::ClearLastError
800 428 [ 0] sxs!SxsInstallW
28 0 [ 1] sxs!SxspGetRemoteStore
5 0 [ 2] sxs!SxspEnsureUserIsAdmin
34 5 [ 1] sxs!SxspGetRemoteStore
9 0 [ 2] kernel32!LoadLibraryW
40 14 [ 1] sxs!SxspGetRemoteStore
18 0 [ 2] ShimEng!StubGetProcAddress
50 32 [ 1] sxs!SxspGetRemoteStore
24 0 [ 2] ole32!CoCreateInstance
>> No match on ret
24 0 [ 2] ole32!CoCreateInstance
5 0 [ 2] RPCRT4!NdrpGetRpcHelper
>> No match on ret
5 0 [ 2] RPCRT4!NdrpGetRpcHelper
17 0 [ 2] RPCRT4!NdrpGetIIDFromBuffer
>> No match on ret
17 0 [ 2] RPCRT4!NdrpGetIIDFromBuffer
71 0 [ 2] RPCRT4!NdrpInterfacePointerUnmarshall
>> No match on ret
71 0 [ 2] RPCRT4!NdrpInterfacePointerUnmarshall
11 0 [ 2] RPCRT4!NdrpPointerUnmarshall
>> No match on ret
11 0 [ 2] RPCRT4!NdrpPointerUnmarshall
4 0 [ 2] RPCRT4!NdrPointerUnmarshall
>> No match on ret
4 0 [ 2] RPCRT4!NdrPointerUnmarshall
19 0 [ 2] RPCRT4!NdrpPointerUnmarshall
>> No match on ret
19 0 [ 2] RPCRT4!NdrpPointerUnmarshall
4 0 [ 2] RPCRT4!NdrPointerUnmarshall
>> No match on ret
4 0 [ 2] RPCRT4!NdrPointerUnmarshall
65 0 [ 2] RPCRT4!NdrpClientUnMarshal
>> No match on ret
65 0 [ 2] RPCRT4!NdrpClientUnMarshal
16 0 [ 2] RPCRT4!NdrClientCall2
>> No match on ret
16 0 [ 2] RPCRT4!NdrClientCall2
8 0 [ 2] RPCRT4!ObjectStublessClient
>> No match on ret
8 0 [ 2] RPCRT4!ObjectStublessClient
4 0 [ 2] RPCRT4!ObjectStubless
65 0 [ 2] ole32!CRpcResolver::CreateInstance
>> No match on ret
65 0 [ 2] ole32!CRpcResolver::CreateInstance
10 0 [ 2] ole32!CClientContextActivator::CreateInstance
>> No match on ret
10 0 [ 2] ole32!CClientContextActivator::CreateInstance
8 0 [ 2] ole32!ActivationPropertiesIn::DelegateCreateInstance
>> No match on ret
8 0 [ 2] ole32!ActivationPropertiesIn::DelegateCreateInstance
53 0 [ 2] ole32!ICoCreateInstanceEx
>> No match on ret
53 0 [ 2] ole32!ICoCreateInstanceEx
21 0 [ 2] ole32!CComActivator::DoCreateInstance
>> No match on ret
21 0 [ 2] ole32!CComActivator::DoCreateInstance
2 0 [ 2] ole32!CoCreateInstanceEx
>> No match on ret
2 0 [ 2] ole32!CoCreateInstanceEx
5 0 [ 2] ole32!CoCreateInstance
62 444 [ 1] sxs!SxspGetRemoteStore
34 0 [ 2] ole32!CStdIdentity::CInternalUnk::QueryInterface
>> No match on ret
34 0 [ 2] ole32!CStdIdentity::CInternalUnk::QueryInterface
14 0 [ 2] ole32!CreateIdentityHandler
>> No match on ret
14 0 [ 2] ole32!CreateIdentityHandler
21 0 [ 2] ole32!UnmarshalInternalObjRef
>> No match on ret
21 0 [ 2] ole32!UnmarshalInternalObjRef
27 0 [ 2] ole32!OXIDEntry::UnmarshalRemUnk
>> No match on ret
27 0 [ 2] ole32!OXIDEntry::UnmarshalRemUnk
18 0 [ 2] ole32!OXIDEntry::MakeRemUnk
>> No match on ret
18 0 [ 2] ole32!OXIDEntry::MakeRemUnk
7 0 [ 2] ole32!OXIDEntry::GetRemUnk
>> No match on ret
7 0 [ 2] ole32!OXIDEntry::GetRemUnk
2 0 [ 2] ole32!CStdMarshal::GetSecureRemUnk
>> No match on ret
2 0 [ 2] ole32!CStdMarshal::GetSecureRemUnk
23 0 [ 2] ole32!CStdMarshal::Begin_RemQIAndUnmarshal1
>> No match on ret
23 0 [ 2] ole32!CStdMarshal::Begin_RemQIAndUnmarshal1
5 0 [ 2] ole32!CStdMarshal::Begin_QueryRemoteInterfaces
>> No match on ret
5 0 [ 2] ole32!CStdMarshal::Begin_QueryRemoteInterfaces
ModLoad: 741e0000 741ea000 C:\Windows\system32\sxsstore.dll

Whoa, that’s verbose. There’s a lot of noise in this trace, but the SxspGetRemoteStore function stands out, and the OLE32 invocations that follow it make it obvious that COM is at work here. Examining sxs!SxspGetRemoteStore reveals that it instantiates the COM object identified by CLSID_SxsStore (left as an exercise for the reader).
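If you save a trace like this to a text file, a few lines of script can cut through the noise. The following is a minimal sketch, not part of the original investigation: it assumes each wt line has the shape "instructions, child instructions, [depth], module!function", and the names summarize_wt and SAMPLE are made up for illustration.

```python
import re

# One wt line: <instructions> <child instructions> [<depth>] <module!function>
WT_LINE = re.compile(r"^\s*(\d+)\s+(\d+)\s+\[\s*(\d+)\]\s+(\S+)\s*$")


def summarize_wt(trace, module="sxs", max_depth=1):
    """List functions of `module` at depth <= max_depth, in order of first appearance."""
    seen = []
    for line in trace.splitlines():
        match = WT_LINE.match(line)
        if not match:
            # Skips ">> No match on ret", ModLoad lines and register dumps.
            continue
        depth, name = int(match.group(3)), match.group(4)
        if depth <= max_depth and name.startswith(module + "!") and name not in seen:
            seen.append(name)
    return seen


# A shortened excerpt of the trace above (hypothetical sample input).
SAMPLE = """\
  16     0 [  0] sxs!SxsInstallW
  13     0 [  1]   sxs!CFrame::CFrame
 800   428 [  0] sxs!SxsInstallW
  28     0 [  1]   sxs!SxspGetRemoteStore
   5     0 [  2]     sxs!SxspEnsureUserIsAdmin
  24     0 [  2]     ole32!CoCreateInstance
>> No match on ret
"""

if __name__ == "__main__":
    for function in summarize_wt(SAMPLE):
        print(function)
```

Raising max_depth to 2 would also surface sxs!SxspEnsureUserIsAdmin, while the ole32 and RPCRT4 frames stay filtered out because of the module prefix check.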

Let’s have a look at the object’s registration information. First, extract the CLSID:

0:002> x sxs!CLSID_SxsStore
7554c454 sxs!CLSID_SxsStore =
0:002> dt nt!_GUID 7554c454
ntdll!_GUID
{3c6859ce-230b-48a4-be6c-932c0c202048}
+0x000 Data1 : 0x3c6859ce
+0x004 Data2 : 0x230b
+0x006 Data3 : 0x48a4
+0x008 Data4 : [8] "???"
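The debugger prints Data4 as "???", presumably because it treats the eight bytes as characters and they are not printable text. As a rough cross-check, the sketch below rebuilds the GUID string from the 16 bytes as they would sit in memory on a little-endian x86 system. The byte string is my reconstruction from the fields and the GUID printed above, not a dump taken from the session, and guid_from_memory is a made-up helper name.

```python
import uuid


def guid_from_memory(raw):
    """Format 16 raw bytes of a Windows GUID (little-endian Data1..Data3) as a registry-style string."""
    return "{%s}" % uuid.UUID(bytes_le=raw)


# Data1, Data2 and Data3 are stored little-endian; Data4 is stored byte for byte.
RAW = bytes.fromhex(
    "ce59683c"          # Data1 = 0x3c6859ce
    "0b23"              # Data2 = 0x230b
    "a448"              # Data3 = 0x48a4
    "be6c932c0c202048"  # Data4, the part shown as "???"
)

if __name__ == "__main__":
    print(guid_from_memory(RAW))
```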

Now, we’ll use the command line to see what’s special about this object’s registration:

C:\Users\User\Desktop>reg query HKCR\CLSID\{3c6859ce-230b-48a4-be6c-932c0c202048} /s


HKEY_CLASSES_ROOT\CLSID\{3c6859ce-230b-48a4-be6c-932c0c202048}
(Default) REG_SZ Sxs Store Class
AppID REG_SZ {752073A2-23F2-4396-85F0-8FDB879ED0ED}

HKEY_CLASSES_ROOT\CLSID\{3c6859ce-230b-48a4-be6c-932c0c202048}\LocalServer32
(Default) REG_EXPAND_SZ %systemroot%\servicing\TrustedInstaller.exe
ThreadingModel REG_SZ Both

OK, that explains it. The SxS API asks an out-of-process COM server, running in the context of the TrustedInstaller service, to do its bidding, which explains how installation works despite the restrictive ACL on the store.
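The telltale sign in the registration is the LocalServer32 subkey: a COM class registered that way is served by an EXE in another process, whereas an InprocServer32 subkey would point to a DLL loaded into the caller. A small sketch of that check, run against saved "reg query" output, could look like this. The function name com_server_kind and the sample text are illustrative assumptions, and the parsing only looks at key names.

```python
def com_server_kind(reg_output):
    """Classify a COM class from the text printed by `reg query <CLSID key> /s`."""
    kinds = set()
    for line in reg_output.splitlines():
        key = line.strip()
        if not key.upper().startswith("HKEY_"):
            continue  # value lines and blank lines are ignored
        leaf = key.rsplit("\\", 1)[-1].lower()
        if leaf == "localserver32":
            kinds.add("out-of-process (EXE server)")
        elif leaf == "inprocserver32":
            kinds.add("in-process (DLL server)")
    return sorted(kinds)


# Trimmed copy of the registration shown above (sample input).
SAMPLE_REG = r"""
HKEY_CLASSES_ROOT\CLSID\{3c6859ce-230b-48a4-be6c-932c0c202048}
    (Default)    REG_SZ    Sxs Store Class

HKEY_CLASSES_ROOT\CLSID\{3c6859ce-230b-48a4-be6c-932c0c202048}\LocalServer32
    (Default)    REG_EXPAND_SZ    %systemroot%\servicing\TrustedInstaller.exe
"""

if __name__ == "__main__":
    print(com_server_kind(SAMPLE_REG))
```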

Hope you enjoyed that digression, but now back to the original business at hand. The installation process works just fine on Vista, but the plot thickens when we examine the uninstallation process.

Strangely, and contrary to the documentation, UninstallAssembly always returns success on Vista but with a disposition value of 0, and the assembly files remain in place in the WinSxS store no matter what. The bottom line: if you use this approach to deploy the libraries to a Vista system, you may leave behind unused assembly files after your application is uninstalled, cluttering the user’s system. Take this to heart when considering whether this approach, and the avoidance of an MSI installer, is appropriate for your scenario.

Both the issue of Visual C++ 9.0 assembly deployment using the SxS API and the weird referencing behavior encountered during assembly uninstallation on Vista remain unresolved as of yet. If anyone is game for figuring those out, I’d be glad to hear about it.