Bridging the gap between native functions and Active Scripting with a COM-based FFI wrapper

May 9, 2008 by Koby Kahane

A few weeks ago I was following the excitement as WebKit, Safari’s browser engine, incrementally passed more and more of the Acid 3 standards test. Wondering if the Gecko (Mozilla Firefox’s rendering engine) folks are also busy with that, I followed both the Planet WebKit and Planet Mozilla feeds for a few weeks.

Sometime in April I stumbled upon this post in Planet Mozilla. It discussed recent improvements to JSctypes. It was the first time I had heard of this project. JSctypes is an XPCOM component for Mozilla that allows calling native (or “foreign”) functions from privileged JavaScript code. Both the interface and name are inspired by the Python ctypes module, included with the standard distribution since version 2.5.

If you haven’t heard of ctypes, take a minute to get acquainted. It’s a great library that allows you to call native C functions dynamically from Python code. Its interface really feels at home in a dynamic language. Most of the time, you can just call functions without specifying the number and types of the arguments they receive. DLL modules can be accessed as attributes of the module attribute matching their calling convention (e.g., ctypes.windll.kernel32 or ctypes.cdll.msvcrt) and script functions can be passed as callbacks to the native APIs being invoked.

JSctypes takes Python’s ctypes concept into Mozilla’s JavaScript implementation. Mozilla has a COM-like architecture at the base of its object model which is called XPCOM. Usually, calling native functionality from JavaScript is achieved by exposing an XPCOM component to script. However, such an approach has clear disadvantages as every conceivable native functionality needs to be wrapped on a case by case basis by a compiled XPCOM component. Now, with JSctypes, Mozilla’s JavaScript code, when privileged (obviously a native call interface is not appropriate in the context of untrusted web content), can call most native functions with relative ease and without a compiled component, aside from JSctypes itself.

A native function interface for a dynamic language needs to deal with the relatively complex task of setting up the call stack frame for an arbitrary native API, according to argument counts, types and alignment requirements deduced dynamically at script execution time. As the interface layer seeks to support a broader and broader variety of argument types (basic data types, then structures, arrays, then callback functions, etc.) the task becomes increasingly complicated and difficult.

I reviewed both JSctypes and Python’s ctypes source code in their respective source code repositories and learned that they both share a common implementation of the lowest component in such a native interface layer. It is called libffi, the Foreign Function Interface library and seems to originate from the gcc project. Since libffi is designed to be compiled with a UNIX-style toolchain (has AT&T syntax assembly files, for instance) and Python needs to compile with Visual C++, the author of ctypes, Thomas Heller, ported an old revision of the library to Visual C++.

Usage of libffi is pretty simple. You initialize an ffi_cif (call information?) structure with the ABI type, return value type, argument count and argument types of the native function to be invoked by using the ffi_prep_cif function. Later, and repeatedly as needed, ffi_call is used to call the actual function with a specific set of argument values, passed in as an array and to retrieve the value returned from the native function.

I thought JSctypes is really cool and it then occurred to me it should not be prohibitively difficult to implement a similar adaptation layer for Microsoft’s JScript and possibly other Active Scripting languages.

In my mind’s eye, I envisioned an in-process COM server accessible to Active Scripting clients (implements IDispatch and associated with a ProgID) providing a call interface to arbitrary native functions.

I created an ATL COM DLL and gave the coclass the ProgID “FunctionWrapper.FunctionWrapper.1″. I knew you could call JScript functions with less or more arguments than they expect in their definition and figured pulling off the same in a native method I’ll expose to the script would be ideal. After a short investigation I learned of the IDL vararg attribute, which accomplishes just what I had in mind. At this point, the exposed interface looks like this:

[
object,
uuid(EBA4A11F-969B-4413-9D4E-FB5CB21039FC),
dual,
nonextensible,
helpstring("IFunctionWrapper Interface"),
pointer_default(unique)
]
interface IFunctionWrapper : IDispatch {
[id(1), helpstring("method CallFunction"), vararg] HRESULT CallFunction([in] SAFEARRAY(VARIANT) args, [out, retval] VARIANT* retVal);
};

The CallFunction method of the FunctionWrapper object is callable by JScript clients with arguments of arbitrary count and type of their choosing. As a simplistic start, I had the first argument specify a string identifying the native function, in the Windbg-inspired syntax of “module!export”, e.g. “user32!MessageBoxW”. The rest of the arguments would be passed to the native function.

I proceeded to implement CFunctionWrapper::CallFunction. The steps taken by the method would be:

  1. Ensure at least the first argument (function to invoke) was given.
  2. Ensure the first argument specifies a module and an export, load the module and retrieve the address of the export.
  3. Thunk the VARIANT arguments received by the method to libffi-style argument and types arrays.
  4. Invoke ffi_prep_cif to prepare the call and call the native function with ffi_call
  5. Thunk the return value of the function into a VARIANT usable by script.

Much of the work here is concise but stage 3 consists of relatively mundane boilerplate, translating two varieties of dynamically typed data, Microsoft’s VARIANT and libffi’s ffi_type. I’ll illustrate with a short snippet:

for (ULONG i = 1; i < arguments.GetCount(); i++)
{
ffi_type* argumentTypes = …; // Dynamically allocated by argument count
void* ffiArgs = …;


VARIANT& arg = arguments[i];
switch (V_VT(&arg)) {
case VT_UI1:
argumentTypes[i - 1] = &ffi_type_uint8;
ffiArgs[i - 1] = &(V_UI1(&arg));
break;

case VT_UI4:
argumentTypes[i - 1] = &ffi_type_uint32;
ffiArgs[i - 1] = &(V_UI4(&arg));
break;
}
}

Similar work is needed for other integer and floating-point types, strings and pointers.

Initially, I hard-coded a return value type of unsigned 32-bit integer and the stdcall calling convention to avoid providing an interface for selecting those parameters. I registered the DLL and tested the following script with WSH:

var functionWrapper = new ActiveXObject(”FunctionWrapper.FunctionWrapper”);
var retVal = functionWrapper.CallFunction(”user32!MessageBoxW”, 0, “text”, “caption”, 1);
WScript.Echo(retVal);

1 is also the value of the MB_OKCANCEL parameter to MessageBox. I used the W variety of the API since I implemented hardcoded UTF16 marshalling for VT_BSTR type variants, which is the form strings come in from JScript.

I was quite content when the test script not only failed to crash the WSH process, but also successfully presented a message box and provided the API’s return value successfully back to JScript.

At this point I considered what would it take to extend this solution beyond the basic value types. Arrays first came to mind. Such support, I imagined, would consist of copying an incoming SAFEARRAY argument into a native array and supplying the native array pointer to the native function. If “out” array argument support is desired, copying back into the SAFEARRAY would be required post-invocation, right after ffi_call.

Next in line were structs. These would be less straightforward. The problem with filling a JScript “object” (read, hash table) with a struct’s fields is that ordering would not be preserved as the order in the struct’s data layout. Using the hash as a JScript array would provide ordering, although it wouldn’t be very nice looking.

The final type of argument I considered, and arguably the most important, is callbacks. Many APIs take function pointers as arguments. Consider EnumWindows which invokes EnumWindowsProc on every window found. A native call interface should provide a capability to implement the callback as JScript function and pass it as seamlessly as possible during the native invocation.

Fortunately, libffi provides built-in support for callbacks, calling them “closures” in its terminology. An ffi_cif structure is initialized to describe the prototype of the callback function, in native eyes, as it if it were going to be called with ffi_call. ffi_prep_closure takes such a prototype description, a function pointer and a closure “trampoline buffer”, as I call it. The trampoline buffer, expected to be allocated in writable, executable memory (native code would later jump into its address) takes care of calling the provided function pointer. The twist is that the function pointer, instead of being called with a dynamic prototype, always receives its arguments in the form of libffi argument arrays.

The native callback function wrapped by the closure trampoline buffer would presumably fill a SAFEARRAY of variants with the arguments and invoke a script function. A wrapper callback coclass could be provided to the script and allow for more elaborate stuff like out parameters and the like. An instance of the callback object would wrap a JScript function object and invoke its apply method using the IDispatch interface as calls come in through the closure. It is unclear what a generic solution that doesn’t rely on functions being objects and having the apply method would look like, so at this point this wrapper callback concept is only suitable for JScript.

Right now I only got as far as implementing just the basic value types, and even that with code of such poor quality I avoid uploading it for the time being. The devil is in the details and supporting describing complex argument types would require quite a bit of work. Hopefully someday I or perhaps an enthusiastic reader would get around to coding and publishing a full-fledged implementation of a native call interface. Embedding such an interface in an Active Scripting host in scenarios where the hosted scripts enjoy full trust could provide endless extensibility possibilities for the script author.

Hey, cooler than P/Invoke…

Windbg 6.9.3.113 released

May 3, 2008 by Koby Kahane

A new version of the Debugging Tools for Windows appeared, quietly as usual, on Microsoft’s web site a few days ago.

Unfortunately the debugging symbols package for Windows XP SP3 is still MIA, presumably being delayed along with widespread SP3 availability on the Download Center and Windows Update. My local symbol store has grown quite obese with all the SP2 patches over the years so I’m looking forward to clean things up once that’s available.

Nothing too exciting in the RELNOTES.TXT for this release. Integrated managed debugging remains dysfunctional so the trusty 6.7.5.0 remains in place for that. Can’t even get SOS to break on an application’s Main method. The most exciting feature is enhancements to the “dt” command.

Yawn.

Deploying the Visual C++ libraries with an NSIS installer

April 12, 2008 by Koby Kahane

Beginning with Visual C++ 2005 and continuing into Visual C++ 2008 and the foreseeable future, Microsoft’s various runtime libraries (CRT, ATL, MFC, etc.) are no longer installed into the system32 directory on Windows XP and later, but are rather “side-by-side assemblies” that need to be installed into the side-by-side store, “WinSxS”, in order to be available to all applications.

I’ve discussed the SxS store and the API Microsoft has documented for managing it in a previous post. Nevertheless, at the request of the NSIS maintainer, kichik, I’ll provide some guidance on the issue of runtime deployment and concrete examples to authors of NSIS-based installations. Do keep in mind that I am not adept at authoring NSIS installers and questions beyond the realm of the matter at hand are best targeted at the NSIS forum.

Unlike in the Linux world, where the C runtime library is considered an operating system component and versions of it are never installed by applications (at worst, some proprietary application is linked against an antique version of libc and requires the system administrator to install a compatibility package provided by the distribution), the CRT situation on Windows is more complicated. In the days of yore, Windows NT provided the now long defunct CRTDLL.DLL. Later, the newer variant MSVCRT.DLL shipped with Visual C++, going into the 6.0 release. However, in addition to serving as the runtime of a specific Visual C++ version, MSVCRT.DLL doubles as the “OS CRT”, the version of the C++ runtime deployed with the OS as far back as NT 4.0 and going into Windows Vista. Components included with the operating system itself, such as Notepad and Calculator, are linked against this CRT dynamically. Do not let the identical moniker fool you, the CRTs included with the various NT releases diverge significantly, sporting, for example, a brand new exception handling runtime in Windows Vista, aligned with newer Visual C++ compilers.

The existence of several MSVCRT.DLL variants and the associated versioning issues are probably what led Microsoft to adopt a policy of strongly versioned CRTs beginning with the Visual C++ .NET (2002) release. MSVCR70.DLL was the runtime required by the output of that product, and later versions would require deploying MSVCR71.DLL, MSVCR80.DLL and most recently MSVCR90.DLL. In addition to the CRT itself, there are also the various peripheral libraries that some applications may depend on, such as ATL and MFC.

I’ve discussed in the past an approach utilizing the Windows Driver Kit build environment that allows combining a modern C++ compiler with targeting the Visual C++ 6.0 / OS CRT, MSVCRT.DLL. It is for the brave who don’t mind getting their hands dirty and whose desire to target the broadly deployed runtime exceeds the fear of the plethora of potential version compatibility issues such an application configuration can cause.

For the more conservative lot, the question remains, how do I get the new C++ runtimes to my end-user’s machine? The first approach is that of utilizing static linking. It should be avoided at nearly all cost as it results in both obese executables that are unable to share the runtime’s memory pages with other running processes and is completely unservicable by Microsoft when a security update or another bug fix to all users of the runtime libraries needs to be broadly deployed.

We therefore turn our attention to approaches based on dynamic linking. First of all, the reader should review the official guidance provided by the Visual C++ team on the matter, although he or she may not like what they read. To summarize, Microsoft officially supports the following deployment methods:

  • Use an MSI-based (Windows Installer) installation procedure and utilize their MSM merge modules to include whatever runtime components you require with your application. The MSMs are black box magic that will get those runtime libraries into the “winsxs” store without asking too many questions. If you don’t like those massively complicated MSI installers and the WiX XML schemas make your head spin, that’s too bad.
  • Use the obese VCRedist.exe for the target architecture, without the benefit of picking and choosing only those runtime components that are of interest for your specific application.
  • Deploy the runtime libraries as files in your application’s directory, or “private assemblies” in SxS nomenclature, wasting the end-user’s hard disk space with multiple copies. This is not as bad as it seems, since at least SxS redirection policies can make an updated, security patched version from the “winsxs” store be loaded in place of out of date version deployed privately with the application, unlike with classic non-SxS local DLLs or with static linking.

As the popularity of NSIS as an installation apparatus shows, not everyone are willing to be strong-armed into an MSI-based installation just yet. So how do CRT deploying installers address this acute issue? I was disappointed, but not surprised, to see that VLC, DivX and various other applications with NSIS-based installers, opt for the “private assembly” approach, simplifying life for the installation author but needlessly wasting end-user disk real estate.

The now documented SxS API provides an alternative approach, presumably supported by Microsoft for deploying SxS assemblies in general (such as your own) but not specifically by the Visual C++ folks for theirs. The motivation for this lack of support is unclear, since the end result is as servicable by them as is using Windows Installer merge modules. Nevertheless, it is something that those who follow this path should be aware of.

OK, so let’s get on with it. Unlike with system32, we can’t just waltz into winsxs and drop our assembly’s files there. The directory structure is complicated, differs between XP and Vista, and in fact the ACL on the directory in Vista won’t allow anyone but TrustedInstaller (i.e., MSI) to touch it. Therefore we are required to perform the installation through the SxS API, which provides a COM-based interface for manipulating the store.

For illustration purposes, I shall use the Visual C++ 2005 (8.0) Debug CRT. Note that this is not the CRT you want to deploy to your end users, and in any case is explicitly NOT redistributable by Microsoft’s license terms. I use it for illustrative convenience since my XP virtual machine doesn’t have this assembly. We’ll use an NSIS installer script to drive the wonderful though peculiar System plug-in and get it to invoke the SxS API. Note that elaborate error handling is omitted for brevity. So here we go:


Name “NSIS SxS Test”
OutFile “nsissxs.exe”
SetPluginUnload alwaysoff
ShowInstDetails show
XPStyle on
SetCompressor /SOLID lzma
InstallDir $PROGRAMFILES\NSISSxS


!define FUSION_REFCOUNT_UNINSTALL_SUBKEY_GUID {8cedc215-ac4b-488b-93c0-a50a49cb2fb8}

Section “Uninstall”
DeleteRegKey “HKLM” “Software\Microsoft\Windows\CurrentVersion\Uninstall\nsissxs”
Delete $INSTDIR\uninst.exe
Delete $INSTDIR\dummy.txt
RMDir $INSTDIR
DetailPrint “Removing DebugCRT assembly…”
System::Call “sxs::CreateAssemblyCache(*i .r0, i 0) i.r1″
StrCmp $1 0 0 fail
System::Call “*(i 32, i 0, i 2364391957, i 1217113163, i 178634899, i 3090139977, w ‘nsissxs’, w ”) i.s”
Pop $2
System::Call “$0->3(i 0, w ‘Microsoft.VC80.DebugCRT,version=$\”8.0.50727.762$\”,type=$\”win32$\”,processorArchitecture=$\”x86$\”,publicKeyToken=$\”1fc8b3b9a1e18e3b$\”‘, i r2, *i . r3) i.r1″
StrCmp $1 0 0 fail2
DetailPrint “Disposition returned is $3″
System::Call “$0->2()”
Goto end
fail:
DetailPrint “CreateAssemblyCache failed.”
DetailPrint $1
Goto end
fail2:
DetailPrint “UninstallAssembly failed.”
DetailPrint $1
Goto end
end:
SectionEnd

Section
SetOutPath $INSTDIR
File “dummy.txt”
WriteUninstaller $INSTDIR\uninst.exe
WriteRegStr “HKLM” “Software\Microsoft\Windows\CurrentVersion\Uninstall\nsissxs” “DisplayName” “NSIS SxS Test”
WriteRegStr “HKLM” “Software\Microsoft\Windows\CurrentVersion\Uninstall\nsissxs” “UninstallString” “$INSTDIR\uninst.exe”
InitPluginsDir
SetOutPath $PLUGINSDIR
File “msvcm80d.dll”
File “msvcp80d.dll”
File “msvcr80d.dll”
File “x86_Microsoft.VC80.DebugCRT_1fc8b3b9a1e18e3b_8.0.50727.762_x-ww_5490cd9f.cat”
File “x86_Microsoft.VC80.DebugCRT_1fc8b3b9a1e18e3b_8.0.50727.762_x-ww_5490cd9f.manifest”

DetailPrint “Installing DebugCRT assembly…”
System::Call “sxs::CreateAssemblyCache(*i .r0, i 0) i.r1″
StrCmp $1 0 0 fail
# Fill a FUSION_INSTALL_REFERENCE.
# fir.cbSize = sizeof(FUSION_INSTALL_REFERENCE) == 32
# fir.dwFlags = 0
# fir.guidScheme = FUSION_REFCOUNT_UNINSTALL_SUBKEY_GUID
# fir.szIdentifier = “nsissxs”
# fir.szNonCanonicalData = 0
System::Call “*(i 32, i 0, i 2364391957, i 1217113163, i 178634899, i 3090139977, w ‘nsissxs’, w ”) i.s”
Pop $2
# IAssemblyCache::InstallAssembly(0, manifestPath, fir)
System::Call “$0->7(i 0, w ‘$PLUGINSDIR\x86_Microsoft.VC80.DebugCRT_1fc8b3b9a1e18e3b_8.0.50727.762_x-ww_5490cd9f.manifest’, i r2) i.r1″
System::Free $2
StrCmp $1 0 0 fail2
System::Call “$0->2()”
Goto end
fail:
DetailPrint “CreateAssemblyCache failed.”
DetailPrint $1
Goto end
fail2:
DetailPrint “InstallAssembly failed.”
DetailPrint $1
Goto end
end:
SectionEnd

If you are not familiar with NSIS script syntax, now would be a good time to get acquainted. Let us review the contents of the 2nd section, which is the install section. The dummy file is a placeholder for the actual files your installer wants to deploy. Next, we set up an Uninstall entry in the registry as one usually would. Now on to the interesting part.

In order to deploy a SxS assembly, we must place its DLLs together in a temporary directory created by the installer. Note that if an assembly contains several DLLs, we cannot pick and choose only those that our application links with. The assembly is deployed, versioned and bound as a whole. We can figure out which files are part of the assembly in question by reviewing the assembly manifest, which we’ll find installed into the Manifests subdirectory of the WinSxS store on a Windows XP system. If we review the Debug CRT’s manifest, we can see <file> nodes under the <assembly> node, each referencing one of the files that must be deployed with the assembly. You can find the actual assembly files under the subdirectory with the assembly’s strong name in the WinSxS store.

In addition to the DLL files themselves, the assembly manifest and the assembly signing catalog are an integral part of the assembly. The catalog ensures the integrity of the assembly and is a welcome feature over traditional DLL deployment.

With the DLLs, assembly manifest and catalog in place, we are ready to invoke the SxS API for assembly installation. First, we call CreateAssemblyCache to retrieve the IAssemblyCache interface for managing the SxS store. Note that in the context of an NSIS installer, COM has already been initialized (for STA use) at this point, but if you are making a custom installer in another environment you may have to take care of that before reaching this point.

Assuming all goes well the next phase is setting up the FUSION_INSTALL_REFERENCE structure that will describe our assembly installation. Typically, you’ll want the reference to be associated with the registry Uninstall key for your application. Besides, other reference types do not seem to work too well and the documentation doesn’t err on the side of verbosity.

The not-so-seasoned NSIS scripter that I am, I couldn’t figure out a more legible way to specify the GUID argument to the InstallAssembly invocation so I broke down its components by hand. Counting vtable indices including the IUnknown and IAssemblyCache interfaces, InstallAssembly is at vtable slot 7. After the install reference structure is set up and the method invoked, we hope for the best.

Assuming a successful install transaction, we proceed to call the IUnknown method Release (vtable slot 2) to free the SxS cache manager and deem our install sequence completed.

We now turn our attention to the reverse sequence in the Uninstall section of the illustration installer. Being good citizens of the Windows ecosystem, we remove our reference to the shared assembly when the end-user removes our application from their system. WinSxS manages assembly reference counting and will figure out whether the assembly files should actually be removed from the disk.

We create an IAssemblyCache interface instance as before but this time call UninstallAssembly to remove our reference. This is the first method of the interface but is preceded by the IUnknown members and is thus at vtable slot 3. Following a successful invocation we can examine the returned Disposition value if it is of interest and proceed to free the instance.

Note that we remove an assembly by its full, strong name and not by path. You can figure out the assembly’s strong name from its manifest.

OK, installing the VC8 Debug CRT was easy enough. Note that other libraries (ATL, etc.) you’ll want to install may have dependencies on other assemblies, so make sure you get your install sequence in order.

Installing the Visual C++ 2005 runtime is nice and all, but somehow it just feels wrong installing obsolete software. I turned my attention to the Visual C++ 2008 libraries, encountering disappointing results.

I gave it a few shots but installing the Visual C++ 2008 Debug CRT always fails, InstallAssembly promptly returning an HRESULT containing ERROR_SXS_PROTECTION_CATALOG_NOT_VALID. A malformed catalog? In one of Microsoft’s very own assemblies? Say it ain’t so!

If you have the hots for deploying the newer runtime, you’ll have to figure out that one on your own, folks. I made sure the catalog for the Visual C++ 9.0 Debug CRT I picked up from the WinSxS store matches the same catalog file found in the MSI merge module (MSM) at C:\Program Files\Common Files\Merge Modules\Microsoft_VC90_DebugCRT_x86.msm by extracting the MSM’s files with the useful MSIX extractor. The catalog files matched and regardless, the SHA-1 hashes for the assembly files matched the catalog rejected by InstallAssembly. Mysterious.

Reviewing the Windows event log following this error didn’t help too much. The System log was now decorated with SideBySide Event ID 20, stating: “The manifest C:\WINDOWS\WinSxS\InstallTemp\160585\Manifests\x86_Microsoft.VC90.DebugCRT_1fc8b3b9a1e18e3b_9.0.21022.8_x-ww_597c3456.Manifest does not match its source catalog or the catalog is missing.” … No newsflash there.

I figured I’ll stick to VC8 support and leave the VC9 troubleshooting for later. If push comes to shove, one can always figure out how to write WiX installers. :)

Instead I opted to review whether this NSIS-based installation approach is compatible with Windows Vista. I was worried that with the restrictive ACLs on the WinSxS store in that OS, without being an MSI and running from the context of the mighty TrustedInstaller.exe process, the installation will surely fail.

I was therefore positively surprised when the test installer worked on Vista. This surprised me since I knew the elevated installer executable ran as Administrator, but there was no denying I was not supposed to be able to copy files to WinSxS:

C:\Users\User\Desktop>cacls C:\Windows\WinSxS
C:\Windows\winsxs NT SERVICE\TrustedInstaller:(OI)(CI)F
BUILTIN\Administrators:(OI)(CI)R
NT AUTHORITY\SYSTEM:(OI)(CI)R
BUILTIN\Users:(OI)(CI)R

There was no denying it. Users and Administrators alike have read-only access to the store, and only the TrustedInstaller service can actually modify it. I opted to run the installer once again, this time in Windbg, tracing the operation of the SxS API to figure out what was happening behind the scenes.


0:000> sxe ld:sxs
0:000> g
ModLoad: 75500000 7555f000 C:\Windows\system32\sxs.dll
eax=1000162a ebx=00000000 ecx=15bf8bb6 edx=00000007 esi=7ffdd000 edi=20000000
eip=76f99a94 esp=0278f620 ebp=0278f664 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
ntdll!KiFastSystemCallRet:
76f99a94 c3 ret
0:002> bp sxs!CreateAssemblyCache
0:002> g
Breakpoint 0 hit
eax=00000000 ebx=002ceda8 ecx=002cedc8 edx=002ce990 esi=002ceda8 edi=00000000
eip=7554a3aa esp=0278fd54 ebp=0278fd6c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
sxs!CreateAssemblyCache:
7554a3aa 8bff mov edi,edi
0:002> kb 2
*** WARNING: Unable to verify checksum for C:\Users\User\AppData\Local\Temp\nspBB71.tmp\System.dll
ChildEBP RetAddr Args to Child
0278fd50 100024b5 00291a98 00000000 1000162a sxs!CreateAssemblyCache
0278fd6c 1000168d 002ceda8 00000000 75bbc780 System+0×24b5

We know from MSDN that CreateAssemblyCache returns the IAssemblyCache pointer through the first, out parameter. We expect a well-behaved caller to pass in storage initialized to zero, and the storage to contain the newly instantiated interface after the function returns:

0:002> dps 00291a98 L1
00291a98 00000000
0:002> gu
eax=00000000 ebx=002ceda8 ecx=00000000 edx=00000008 esi=002ceda8 edi=00000000
eip=100024b5 esp=0278fd60 ebp=0278fd6c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
System+0×24b5:
100024b5 a340400010 mov dword ptr [System+0x4040 (10004040)],eax ds:0023:10004040={sxs!CreateAssemblyCache (7554a3aa)}
0:002> dps 00291a98 L1
00291a98 00291c88

The first pointer value in the storage reached when following a dereferenced interface pointer is a member function vtable. We verify this and follow the vtable to examine where the implementations of the various members reside:

0:002> dps 00291c88 L1
00291c88 755036f0 sxs!CAssemblyCache::`vftable’
0:002> dps 755036f0 L10
755036f0 75549ac4 sxs!CAssemblyCache::QueryInterface
755036f4 7550de6b sxs!CAssemblyCache::AddRef
755036f8 7554a355 sxs!CAssemblyCache::Release
755036fc 75549d35 sxs!CAssemblyCache::UninstallAssembly
75503700 75549b15 sxs!CAssemblyCache::QueryAssemblyInfo
75503704 7554a219 sxs!CAssemblyCache::CreateAssemblyCacheItem
75503708 755542f1 sxs!XMLParser::SetFlags
7550370c 75549e91 sxs!CAssemblyCache::InstallAssembly
75503710 7554a4d4 sxs!CAssemblyName::QueryInterface
75503714 7550de6b sxs!CAssemblyCache::AddRef
75503718 7554ad69 sxs!CAssemblyName::Release
7550371c 7554a525 sxs!CAssemblyName::SetProperty
75503720 7554a64d sxs!CAssemblyName::GetProperty
75503724 7554a4ba sxs!CAssemblyName::Finalize
75503728 7554aaac sxs!CAssemblyName::GetDisplayName
7550372c 7554a4ad sxs!CAssemblyName::Reserved

It is clear the implementation of the InstallAssembly method is sxs!CAssemblyCache::InstallAssembly. We set up a breakpoint, proceed there and perform a high-level trace:

0:002> bp sxs!CAssemblyCache::InstallAssembly
0:002> g
Breakpoint 1 hit
eax=00000000 ebx=002d2668 ecx=002d2688 edx=002cdc10 esi=002d2668 edi=00000000
eip=75549e91 esp=0278fd4c ebp=0278fd6c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
sxs!CAssemblyCache::InstallAssembly:
75549e91 8bff mov edi,edi
0:002> wt -l 2 -m sxs
Tracing sxs!CAssemblyCache::InstallAssembly to return address 100024b5
16 0 [ 0] sxs!CAssemblyCache::InstallAssembly
13 0 [ 1] sxs!CFrame::CFrame
20 13 [ 0] sxs!CAssemblyCache::InstallAssembly
3 0 [ 1] sxs!CFrame::BaseEnter
11 0 [ 2] sxs!FusionpRtlPushFrame
6 11 [ 1] sxs!CFrame::BaseEnter
54 30 [ 0] sxs!CAssemblyCache::InstallAssembly
6 0 [ 1] sxs!CFrame::ClearLastError
58 36 [ 0] sxs!CAssemblyCache::InstallAssembly
10 0 [ 1] sxs!SxspTranslateReferenceFrom
13 0 [ 2] sxs!CFrame::CFrame
14 13 [ 1] sxs!SxspTranslateReferenceFrom
6 0 [ 2] sxs!CFrame::BaseEnter
70 19 [ 1] sxs!SxspTranslateReferenceFrom
9 0 [ 2] sxs!CFnTracerWin32::~CFnTracerWin32
74 28 [ 1] sxs!SxspTranslateReferenceFrom
63 138 [ 0] sxs!CAssemblyCache::InstallAssembly
4 0 [ 1] sxs!CFrame::ClearLastError
66 142 [ 0] sxs!CAssemblyCache::InstallAssembly
16 0 [ 1] sxs!SxsInstallW
13 0 [ 2] sxs!CFrame::CFrame
20 13 [ 1] sxs!SxsInstallW
6 0 [ 2] sxs!CFrame::BaseEnter
25 19 [ 1] sxs!SxsInstallW
13 0 [ 2] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
28 32 [ 1] sxs!SxsInstallW
13 0 [ 2] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
30 45 [ 1] sxs!SxsInstallW
13 0 [ 2] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
68 58 [ 1] sxs!SxsInstallW
4 0 [ 2] sxs!CFrame::ClearLastError
713 62 [ 1] sxs!SxsInstallW
72 0 [ 2] sxs!SxspExpandRelativePathToFull
797 134 [ 1] sxs!SxsInstallW
4 0 [ 2] sxs!CFrame::ClearLastError
800 138 [ 1] sxs!SxsInstallW
ModLoad: 741e0000 741ea000 C:\Windows\system32\sxsstore.dll
eax=ffffffff ebx=00000000 ecx=002e6a60 edx=00000001 esi=7ffdd000 edi=20000000
eip=76f99a94 esp=0278ed48 ebp=0278ed8c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
ntdll!KiFastSystemCallRet:
76f99a94 c3 ret

OK, so it looks like this implementation lets sxs!SxsInstallW do the actual work. We rerun the installer and this time perform a trace from that point:

0:000> sxe ld:sxs
0:000> g
ModLoad: 75500000 7555f000 C:\Windows\system32\sxs.dll
eax=1000162a ebx=00000000 ecx=0f78c21d edx=00000007 esi=7ffdc000 edi=20000000
eip=76f99a94 esp=026ef620 ebp=026ef664 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
ntdll!KiFastSystemCallRet:
76f99a94 c3 ret
0:002> bp sxs!SxsInstallW
0:002> g
Breakpoint 0 hit
eax=026efce8 ebx=0032db90 ecx=026efd08 edx=0032dbac esi=00332a08 edi=026efd44
eip=755475ad esp=026efcd4 ebp=026efd48 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
sxs!SxsInstallW:
755475ad 8bff mov edi,edi
0:002> wt -l 2 -m sxs
Tracing sxs!SxsInstallW to return address 75549f41
16 0 [ 0] sxs!SxsInstallW
13 0 [ 1] sxs!CFrame::CFrame
20 13 [ 0] sxs!SxsInstallW
3 0 [ 1] sxs!CFrame::BaseEnter
11 0 [ 2] sxs!FusionpRtlPushFrame
6 11 [ 1] sxs!CFrame::BaseEnter
25 30 [ 0] sxs!SxsInstallW
10 0 [ 1] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
12 0 [ 2] sxs!CGenericBaseStringBuffer::InitializeInlineBuffer
13 12 [ 1] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
28 55 [ 0] sxs!SxsInstallW
10 0 [ 1] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
12 0 [ 2] sxs!CGenericBaseStringBuffer::InitializeInlineBuffer
13 12 [ 1] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
30 80 [ 0] sxs!SxsInstallW
10 0 [ 1] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
12 0 [ 2] sxs!CGenericBaseStringBuffer::InitializeInlineBuffer
13 12 [ 1] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
68 105 [ 0] sxs!SxsInstallW
4 0 [ 1] sxs!CFrame::ClearLastError
713 109 [ 0] sxs!SxsInstallW
15 0 [ 1] sxs!SxspExpandRelativePathToFull
13 0 [ 2] sxs!CFrame::CFrame
19 13 [ 1] sxs!SxspExpandRelativePathToFull
6 0 [ 2] sxs!CFrame::BaseEnter
22 19 [ 1] sxs!SxspExpandRelativePathToFull
29 0 [ 2] sxs!CGenericStringBufferAccessor::Attach
23 48 [ 1] sxs!SxspExpandRelativePathToFull
4 0 [ 2] sxs!CFrame::ClearLastError
29 52 [ 1] sxs!SxspExpandRelativePathToFull
13 0 [ 2] kernel32!GetFullPathNameW
36 65 [ 1] sxs!SxspExpandRelativePathToFull
31 0 [ 2] sxs!CGenericStringBufferAccessor::Detach
37 96 [ 1] sxs!SxspExpandRelativePathToFull
4 0 [ 2] sxs!CFrame::ClearLastError
41 100 [ 1] sxs!SxspExpandRelativePathToFull
68 0 [ 2] sxs!CGenericBaseStringBuffer::Win32ResizeBuffer
46 168 [ 1] sxs!SxspExpandRelativePathToFull
29 0 [ 2] sxs!CGenericStringBufferAccessor::Attach
47 197 [ 1] sxs!SxspExpandRelativePathToFull
4 0 [ 2] sxs!CFrame::ClearLastError
52 201 [ 1] sxs!SxspExpandRelativePathToFull
13 0 [ 2] kernel32!GetFullPathNameW
64 214 [ 1] sxs!SxspExpandRelativePathToFull
9 0 [ 2] sxs!CFnTracerWin32::~CFnTracerWin32
66 223 [ 1] sxs!SxspExpandRelativePathToFull
20 0 [ 2] sxs!CGenericStringBufferAccessor::~CGenericStringBufferAccessor
72 243 [ 1] sxs!SxspExpandRelativePathToFull
797 424 [ 0] sxs!SxsInstallW
4 0 [ 1] sxs!CFrame::ClearLastError
800 428 [ 0] sxs!SxsInstallW
28 0 [ 1] sxs!SxspGetRemoteStore
5 0 [ 2] sxs!SxspEnsureUserIsAdmin
34 5 [ 1] sxs!SxspGetRemoteStore
9 0 [ 2] kernel32!LoadLibraryW
40 14 [ 1] sxs!SxspGetRemoteStore
18 0 [ 2] ShimEng!StubGetProcAddress
50 32 [ 1] sxs!SxspGetRemoteStore
24 0 [ 2] ole32!CoCreateInstance
>> No match on ret
24 0 [ 2] ole32!CoCreateInstance
5 0 [ 2] RPCRT4!NdrpGetRpcHelper
>> No match on ret
5 0 [ 2] RPCRT4!NdrpGetRpcHelper
17 0 [ 2] RPCRT4!NdrpGetIIDFromBuffer
>> No match on ret
17 0 [ 2] RPCRT4!NdrpGetIIDFromBuffer
71 0 [ 2] RPCRT4!NdrpInterfacePointerUnmarshall
>> No match on ret
71 0 [ 2] RPCRT4!NdrpInterfacePointerUnmarshall
11 0 [ 2] RPCRT4!NdrpPointerUnmarshall
>> No match on ret
11 0 [ 2] RPCRT4!NdrpPointerUnmarshall
4 0 [ 2] RPCRT4!NdrPointerUnmarshall
>> No match on ret
4 0 [ 2] RPCRT4!NdrPointerUnmarshall
19 0 [ 2] RPCRT4!NdrpPointerUnmarshall
>> No match on ret
19 0 [ 2] RPCRT4!NdrpPointerUnmarshall
4 0 [ 2] RPCRT4!NdrPointerUnmarshall
>> No match on ret
4 0 [ 2] RPCRT4!NdrPointerUnmarshall
65 0 [ 2] RPCRT4!NdrpClientUnMarshal
>> No match on ret
65 0 [ 2] RPCRT4!NdrpClientUnMarshal
16 0 [ 2] RPCRT4!NdrClientCall2
>> No match on ret
16 0 [ 2] RPCRT4!NdrClientCall2
8 0 [ 2] RPCRT4!ObjectStublessClient
>> No match on ret
8 0 [ 2] RPCRT4!ObjectStublessClient
4 0 [ 2] RPCRT4!ObjectStubless
65 0 [ 2] ole32!CRpcResolver::CreateInstance
>> No match on ret
65 0 [ 2] ole32!CRpcResolver::CreateInstance
10 0 [ 2] ole32!CClientContextActivator::CreateInstance
>> No match on ret
10 0 [ 2] ole32!CClientContextActivator::CreateInstance
8 0 [ 2] ole32!ActivationPropertiesIn::DelegateCreateInstance
>> No match on ret
8 0 [ 2] ole32!ActivationPropertiesIn::DelegateCreateInstance
53 0 [ 2] ole32!ICoCreateInstanceEx
>> No match on ret
53 0 [ 2] ole32!ICoCreateInstanceEx
21 0 [ 2] ole32!CComActivator::DoCreateInstance
>> No match on ret
21 0 [ 2] ole32!CComActivator::DoCreateInstance
2 0 [ 2] ole32!CoCreateInstanceEx
>> No match on ret
2 0 [ 2] ole32!CoCreateInstanceEx
5 0 [ 2] ole32!CoCreateInstance
62 444 [ 1] sxs!SxspGetRemoteStore
34 0 [ 2] ole32!CStdIdentity::CInternalUnk::QueryInterface
>> No match on ret
34 0 [ 2] ole32!CStdIdentity::CInternalUnk::QueryInterface
14 0 [ 2] ole32!CreateIdentityHandler
>> No match on ret
14 0 [ 2] ole32!CreateIdentityHandler
21 0 [ 2] ole32!UnmarshalInternalObjRef
>> No match on ret
21 0 [ 2] ole32!UnmarshalInternalObjRef
27 0 [ 2] ole32!OXIDEntry::UnmarshalRemUnk
>> No match on ret
27 0 [ 2] ole32!OXIDEntry::UnmarshalRemUnk
18 0 [ 2] ole32!OXIDEntry::MakeRemUnk
>> No match on ret
18 0 [ 2] ole32!OXIDEntry::MakeRemUnk
7 0 [ 2] ole32!OXIDEntry::GetRemUnk
>> No match on ret
7 0 [ 2] ole32!OXIDEntry::GetRemUnk
2 0 [ 2] ole32!CStdMarshal::GetSecureRemUnk
>> No match on ret
2 0 [ 2] ole32!CStdMarshal::GetSecureRemUnk
23 0 [ 2] ole32!CStdMarshal::Begin_RemQIAndUnmarshal1
>> No match on ret
23 0 [ 2] ole32!CStdMarshal::Begin_RemQIAndUnmarshal1
5 0 [ 2] ole32!CStdMarshal::Begin_QueryRemoteInterfaces
>> No match on ret
5 0 [ 2] ole32!CStdMarshal::Begin_QueryRemoteInterfaces
ModLoad: 741e0000 741ea000 C:\Windows\system32\sxsstore.dll

Woah. That’s verbose. There’s a lot of noise in this trace, but the SxspGetRemoteStore function draws attention and it is obvious from all the OLE32 invocations later on that COM is at work here. Examining the sxs!SxspGetRemoteStore function reveals it instantiates the COM object identified by CLSID_SxsStore (left as an exercise for the reader).

Let’s have a look at the object’s registration information. First, extract the CLSID:

0:002> x sxs!CLSID_SxsStore
7554c454 sxs!CLSID_SxsStore =
0:002> dt nt!_GUID 7554c454
ntdll!_GUID
{3c6859ce-230b-48a4-be6c-932c0c202048}
+0×000 Data1 : 0×3c6859ce
+0×004 Data2 : 0×230b
+0×006 Data3 : 0×48a4
+0×008 Data4 : [8] “???”

Now, we’ll use the command-line to see what’s special about this object’s registration:

C:\Users\User\Desktop>reg query HKCR\CLSID\{3c6859ce-230b-48a4-be6c-932c0c202048
} /s


HKEY_CLASSES_ROOT\CLSID\{3c6859ce-230b-48a4-be6c-932c0c202048}
(Default) REG_SZ Sxs Store Class
AppID REG_SZ {752073A2-23F2-4396-85F0-8FDB879ED0ED}

HKEY_CLASSES_ROOT\CLSID\{3c6859ce-230b-48a4-be6c-932c0c202048}\LocalServer32
(Default) REG_EXPAND_SZ %systemroot%\servicing\TrustedInstaller.exe
ThreadingModel REG_SZ Both

OK, that explains it. The SxS API asks an out-of-process COM server running in the context of the TrustedInstaller service to do its bidding, explaining how things work despite the restrictive ACL on the store.

Hope you enjoyed that digression, but now back to the original business at hand. The installation process works just fine on Vista, but the plot thickens when we examine the uninstallation process.

Strangely and contrary to documentation, UninstallAssembly always returns success but with a disposition value of 0 on Vista, and the assembly files remain in place in the WinSxS store no matter what. The bottom line - if you use this approach to deploy the libraries to a Vista system, you may leave behind unused assembly files after your application is uninstalled, cluttering the user’s system. Take this to heart when considering whether this approach and the avoidance of an MSI installer is appropriate for your scenario.

Both the issue of Visual C++ 9.0 assembly deployment using the SxS API and the weird referencing behavior encounter during assembly uninstallation on Vista remain, as of yet, unresolved issues. If anyone is game for figuring those out, I’d be glad to hear about it.

Associating LPC clients and servers

April 12, 2008 by Koby Kahane

Local Procedure Call (LPC) is a fast, lightweight interprocess communication mechanism used extensively by Windows system components. Microsoft’s documented approach for using LPC is through the higher-level RPC (Remote Procedure Call) API and its “ncalrpc” transport, which wraps the lower-level LPC interfaces. However, many Windows components use the LPC functions from the Native API directly and not through ncalrpc.

Like other remoting architectures, LPC presents a diagnostic challenge. We locate the remoting client thread and expect its flow to contain whatever information we are looking for, only to find it sending a message to the other end, magically receiving a response and continuing its merry way. In order to get a complete view of the complex interactions in the system, the task at hand is locating the server thread and examining its flow until we have the answer we are looking for.

As a case study, we shall consider an unlucky developer that must debug a custom security support provider or authentication package (SSP or AP) that runs in the context of the Windows Local Security Authority (LSA) subsystem. The developer wishes to single-step through the creation of a logon session. This sequence starts at the Winlogon process when it calls the LsaLogonUser API to authenticate a user’s credentials. The following treatise discusses Windows XP and traditional LPC. Vista features asynchronous LPC (ALPC) and is beyond the scope of this post.

We attach the trusty Windbg to the Winlogon process and set up a breakpoint on the API. We resume execution and use the Welcome screen GUI to initiate the actual logon attempt. For convenience, this would best be done with a remote debugging setup to a virtual machine. Once execution is suspended at the API of interest, we perform a trace to get an initial idea of the flow:

0:019> bp secur32!LsaLogonUser
0:019> g
ModLoad: 74980000 74a93000 C:\WINDOWS\system32\msxml3.dll
Breakpoint 0 hit
eax=0006ec78 ebx=00eedfb8 ecx=00000000 edx=00eedfdc esi=759799c4 edi=00eedfe4
eip=77fe33e8 esp=0006ec08 ebp=0006ecb8 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
Secur32!LsaLogonUser:
77fe33e8 8bff mov edi,edi
0:000> wt
Tracing Secur32!LsaLogonUser to return address 77dec60d
43 0 [ 0] Secur32!LsaLogonUser
1 0 [ 1] ntdll!NtRequestWaitReplyPort
2 0 [ 1] ntdll!ZwRequestWaitReplyPort
2 0 [ 2] ntdll!KiFastSystemCall
1 0 [ 1] ntdll!ZwRequestWaitReplyPort
64 6 [ 0] Secur32!LsaLogonUser


70 instructions were executed in 69 events (0 from other threads)

Function Name Invocations MinInst MaxInst AvgInst
Secur32!LsaLogonUser 1 64 64 64
ntdll!KiFastSystemCall 1 2 2 2
ntdll!NtRequestWaitReplyPort 1 1 1 1
ntdll!ZwRequestWaitReplyPort 2 1 2 1

1 system call was executed

Calls System Call
1 ntdll!KiFastSystemCall

eax=c000006d ebx=00eedfb8 ecx=0006eda8 edx=00100004 esi=759799c4 edi=00eedfe4
eip=77dec60d esp=0006ec44 ebp=0006ecb8 iopl=0 nv up ei ng nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000282
ADVAPI32!L32pLogonUser+0×2cf:
77dec60d 8bf0 mov esi,eax

The debugging developer is disappointed the API simply invokes the NtRequestWaitReplyPort native API, which is used by the client of an LPC port to send a request message and wait synchronously for the server response. The other end is probably the LSA subsystem process, lsass.exe, intuition predicts. First, let’s verify this intuitive guess.

We’ll break again at an invocation of LsaLogonUser in the Winlogon process and this time proceed to the invocation of the LPC port and examine its arguments:


:000> g
ModLoad: 74980000 74a93000 C:\WINDOWS\system32\msxml3.dll
ModLoad: 74980000 74a93000 C:\WINDOWS\system32\msxml3.dll
Breakpoint 0 hit
eax=0006ec78 ebx=00eedfb8 ecx=00000000 edx=00eedfdc esi=759799c4 edi=00eedfe4
eip=77fe33e8 esp=0006ec08 ebp=0006ecb8 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
Secur32!LsaLogonUser:
77fe33e8 8bff mov edi,edi
0:000> bp ntdll!NtRequestWaitReplyPort
0:000> g
Breakpoint 1 hit
eax=0006eb5c ebx=00eedfb8 ecx=000e000d edx=00eedfdc esi=00000000 edi=0006ebac
eip=7c90e3e1 esp=0006eb44 ebp=0006ec04 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
ntdll!NtRequestWaitReplyPort:
7c90e3e1 b8c8000000 mov eax,0C8h
0:000> kb 2
ChildEBP RetAddr Args to Child
0006eb40 77fe347e 000006b4 0006eb5c 0006eb5c ntdll!NtRequestWaitReplyPort
0006ec04 77dec60d 000006b4 0006ec78 00000002 Secur32!LsaLogonUser+0xa0

We see that the prototype of NtRequestWaitReplyPort places the LPC port handle as the first argument of the API. We verify this assertion:
0:000> !handle 6b4
Handle 6b4
Type Port

Great. Now what? An LPC port is, in fact, one of the many kinds of NT kernel objects. If we wish to gain access to more information about it, we must access the kernel’s knowledge of the object. This is accomplished through the use of the kernel debugger, kd. Windbg can also act as a GUI wrapper for kernel debugging sessions. As we shall soon find out, the kernel debugger is quite useful even when you have nothing to do with drivers or other kernel-mode code.

Attaching a kernel debugger is covered extensively in the literature and is left as an exercise for the reader. Suffice to say you are much better off using a virtual machine and a named pipe over using a serial or FireWire connection to a physical system, unless some exotic circumstances force you to use an actual machine (i.e., you are debugging authentication of a fingerprint reader product you are developing, etc. and need the physical device to be present in the debugged system)

Handles, except system handles, are process-local. What we want to do is associate the Winlogon port handle with the kernel LPC port object. First, we’ll find the Winlogon process and set the kernel debugger’s context to that process:

0: kd> !process 0 0 winlogon.exe
PROCESS 82bcdda0 SessionId: 0 Cid: 026c Peb: 7ffdd000 ParentCid: 0220
DirBase: 0a3e0000 ObjectTable: e14f1a48 HandleCount: 534.
Image: winlogon.exe

0: kd> .process 82bcdda0
Implicit process is now 82bcdda0
WARNING: .cache forcedecodeuser is not enabled

Good, now lets have a look at the handle, this time from the kernel debugger’s point of view. Although the command has the same name, the kernel debugger command is distinct.

0: kd> !handle 6b4
processor number 0, process 82bcdda0
PROCESS 82bcdda0 SessionId: 0 Cid: 026c Peb: 7ffdd000 ParentCid: 0220
DirBase: 0a3e0000 ObjectTable: e14f1a48 HandleCount: 534.
Image: winlogon.exe

Handle table at e1ba8000 with 534 Entries in use
06b4: Object: e1582f68 GrantedAccess: 001f0001 Entry: e169cd68
Object: e1582f68 Type: (82bed5d0) Port
ObjectHeader: e1582f50 (old version)
HandleCount: 1 PointerCount: 1

We see the LPC port kernel object is at kernel-mode address 0xe1582f68. Each variety of kernel object might have a distinct debugger extension for further examination. In the LPC port case, we shall use the !lpc port command to examine the port:

0: kd> !lpc port e1582f68

Client communication port 0xe1582f68
Handles: 1 References: 1
The LpcDataInfoChainHead queue is empty
Connected port: 0xe15765a0 Server connection port: 0xe17426c8

Server communication port 0xe15765a0
Handles: 1 References: 1
The LpcDataInfoChainHead queue is empty

Server connection port e17426c8 Name: LsaAuthenticationPort
Handles: 1 References: 52
Server process : 8290e4b8 (lsass.exe)
Queue semaphore : 82a0a3e8
Semaphore state 0 (0×0)
The message queue is empty
The LpcDataInfoChainHead queue is empty

Our suspicion that lsass.exe is the server for this LPC port is confirmed. Furthermore, we now know that the LPC port is called LsaAuthenticationPort by this client/server pair. If LSASS has just one port server thread, we’re done. Let us examine it to see if we are that lucky. A word of caution to the uninitiated: debugging LSASS is a messy deal and you can easily deadlock your system by initiating operations that interact with LSA and expect it to be active. For example, do not use a symbol server based on an SMB share, since the network session setup will block while LSA is suspended. For our purposes, a non-invasive, non-suspending session, using the NTSD debugger’s “-pvr” command line switch, would be best for the initial investigation. Once started, we use the debugger to locate stack frames with LPC servers, by looking for the NtReplyWaitReceivePort native API on the stack:

0:000> !findstack ntdll!NtReplyWaitReceivePort
Thread 004, 1 frame(s) match
* 01 008dfe88 7575ba56 ntdll!NtReplyWaitReceivePort+0xc

Thread 006, 1 frame(s) match
* 01 00acfe18 77e765d3 ntdll!NtReplyWaitReceivePortEx+0xc

Thread 007, 1 frame(s) match
* 01 00b8fe40 75738f5c ntdll!NtReplyWaitReceivePort+0xc

Thread 010, 1 frame(s) match
* 01 00e5fe18 77e765d3 ntdll!NtReplyWaitReceivePortEx+0xc

Thread 016, 1 frame(s) match
* 01 0007fe18 77e765d3 ntdll!NtReplyWaitReceivePortEx+0xc

Thread 017, 1 frame(s) match
* 01 00b0fe18 77e765d3 ntdll!NtReplyWaitReceivePortEx+0xc

Thread 019, 1 frame(s) match
* 01 00c8fe18 77e765d3 ntdll!NtReplyWaitReceivePortEx+0xc

GetContextState failed, 0×8007001F
GetContextState failed, 0×8007001F
GetContextState failed, 0×8007001F
Unable to get program counter
GetContextState failed, 0×8007001F
Unable to get current machine context, Win32 error 0n31
GetContextState failed, 0×8007001F
GetContextState failed, 0×8007001F
GetContextState failed, 0×8007001F
GetContextState failed, 0×8007001F
Unable to get current machine context, Win32 error 0n31
ERROR: !findstack: extension exception 0×8007001f.
“Unable to get thread stack”
0:000>

We can ignore the spurious errors from !findstack on the bottom of the output, which are the result of our non-invasive attach. Unfortunately for us, the plot thickens as lsass.exe has a bunch of server threads and it is as of yet unclear which one will have the server-side flow of interest. How can figure out which of the seven potential servers is the right one?

We’ll get started by extracting the server port handle from each thread’s stack. As in the case of the client function, NtReplyWaitReceivePort takes the port handle as its first argument.

0:000> ~4 kb 2
ChildEBP RetAddr Args to Child
008dfe84 7c90e384 7575ba56 000000d0 00000000 ntdll!KiFastSystemCallRet
008dfe88 7575ba56 000000d0 00000000 00000000 ntdll!NtReplyWaitReceivePort+0xc
0:000> ~6 kb 2
ChildEBP RetAddr Args to Child
00acfe14 7c90e399 77e765d3 000002e8 00acff74 ntdll!KiFastSystemCallRet
00acfe18 77e765d3 000002e8 00acff74 00000000 ntdll!NtReplyWaitReceivePortEx+0xc
0:000> ~7 kb 2
ChildEBP RetAddr Args to Child
00b8fe3c 7c90e384 75738f5c 000002cc 00b8fe6c ntdll!KiFastSystemCallRet
00b8fe40 75738f5c 000002cc 00b8fe6c 00000000 ntdll!NtReplyWaitReceivePort+0xc
0:000> ~10 kb 2
ChildEBP RetAddr Args to Child
00e5fe14 7c90e399 77e765d3 000002b4 00e5ff74 ntdll!KiFastSystemCallRet
00e5fe18 77e765d3 000002b4 00e5ff74 00000000 ntdll!NtReplyWaitReceivePortEx+0xc
0:000> ~16 kb 2
GetContextState failed, 0×8007001F
Unable to get current machine context, Win32 error 0n31
^ Illegal thread error in ‘~16 kb 2′
0:000> ~17 kb 2
GetContextState failed, 0×8007001F
Unable to get current machine context, Win32 error 0n31
^ Illegal thread error in ‘~17 kb 2′
0:000> ~19 kb 2
ChildEBP RetAddr Args to Child
00c8fe14 7c90e399 77e765d3 000002a8 00c8ff74 ntdll!KiFastSystemCallRet
00c8fe18 77e765d3 000002a8 00c8ff74 00000000 ntdll!NtReplyWaitReceivePortEx+0xc

The non-invasive session has trouble accessing the context of threads 16 and 17. If we don’t find the server handle we are looking for elsewhere, we could retry with an invasive attach. For now, let’s note the server port handles appearing in the stack traces we do have.

Knowing which thread is associated with which port handle, we turn to the kernel debugger for associating the server port object with its handle in the lsass.exe process. Recall that we have the server port object’s address from the output of the !lpc port command on the client port object. We list lsass.exe’s port handles and look for the one associated with it:

0: kd> !process 0 0 lsass.exe
PROCESS 8290e4b8 SessionId: 0 Cid: 02b0 Peb: 7ffdf000 ParentCid: 026c
DirBase: 0a337000 ObjectTable: e1730338 HandleCount: 357.
Image: lsass.exe
0: kd> !handle 0 0 8290e4b8 Port
processor number 0, process 8290e4b8
Searching for handles of type Port
PROCESS 8290e4b8 SessionId: 0 Cid: 02b0 Peb: 7ffdf000 ParentCid: 026c
DirBase: 0a337000 ObjectTable: e1730338 HandleCount: 357.
Image: lsass.exe
Handle table at e173b000 with 357 Entries in use
0018: Object: e1736c58 GrantedAccess: 001f0001 (Protected)
00c4: Object: e172a578 GrantedAccess: 001f0001
00cc: Object: e173d030 GrantedAccess: 001f0001
00d0: Object: e173ff68 GrantedAccess: 001f0001
01c8: Object: e17ba3b8 GrantedAccess: 001f0001
02a8: Object: e1754bc8 GrantedAccess: 001f0001
02b4: Object: e17427a0 GrantedAccess: 001f0001
02b8: Object: e1b6c318 GrantedAccess: 001f0001
02c8: Object: e173d240 GrantedAccess: 001f0001
02cc: Object: e17426c8 GrantedAccess: 001f0001
02e8: Object: e173f410 GrantedAccess: 001f0001
030c: Object: e1742470 GrantedAccess: 001f0001
0314: Object: e1742548 GrantedAccess: 001f0001
033c: Object: e1744e58 GrantedAccess: 001f0001
03a4: Object: e17a4e00 GrantedAccess: 001f0001
03b0: Object: e17a4cc0 GrantedAccess: 001f0001
03c0: Object: e17a9640 GrantedAccess: 001f0001
03c8: Object: e17f29f8 GrantedAccess: 001f0001
03cc: Object: e17deb48 GrantedAccess: 001f0001
03d4: Object: e17f53d0 GrantedAccess: 001f0001
03dc: Object: e17f7dd8 GrantedAccess: 001f0001
03e4: Object: e17f7c10 GrantedAccess: 001f0001
03f4: Object: e190abf0 GrantedAccess: 001f0001
03f8: Object: e17fb690 GrantedAccess: 001f0001
0404: Object: e1b43310 GrantedAccess: 001f0001
0418: Object: e19149b8 GrantedAccess: 001f0001
0440: Object: e1942030 GrantedAccess: 001f0001
0454: Object: e19184a0 GrantedAccess: 001f0001
0458: Object: e1b3a770 GrantedAccess: 001f0001
0464: Object: e1758170 GrantedAccess: 001f0001
0480: Object: e1601ab8 GrantedAccess: 001f0001
049c: Object: e1cf59d8 GrantedAccess: 001f0001
04a4: Object: e13082a8 GrantedAccess: 001f0001
04f4: Object: e1366490 GrantedAccess: 001f0001
0508: Object: e1b466c0 GrantedAccess: 001f0001
0514: Object: e1c9f450 GrantedAccess: 001f0001
0564: Object: e1b5a640 GrantedAccess: 001f0001
0580: Object: e15765a0 GrantedAccess: 001f0001
05cc: Object: e1d814e0 GrantedAccess: 001f0001

Woah, that’s quite a list. We note the handle pointing to the server port object is 0×2cc. Cross-referencing with the handle stacks we extracted from the user-mode debugger, we hit the gold and figure out thread 7 is the thread we are looking for:


0:000> ~7 k
ChildEBP RetAddr
00b8fe3c 7c90e384 ntdll!KiFastSystemCallRet
00b8fe40 75738f5c ntdll!NtReplyWaitReceivePort+0xc
00b8ff74 75738d66 LSASRV!LpcServerThread+0xaf
00b8ffb4 7c80b6a3 LSASRV!LsapThreadBase+0×91
00b8ffec 00000000 kernel32!BaseThreadStart+0×37

Yup, that looks about right. We now know we should set our breakpoint at this thread, as it is expected to return from the blocking NtReplyWaitReceivePort system call upon an incoming LPC message on LsaAuthenticationPort. At that point, we can follow execution as the LSA server DLL assigns handling the call to a worker thread or whatever else it may be up to.

Once again, kd saves the day.

Microsoft publishes dozens of its network protocol specifications on MSDN

February 21, 2008 by Koby Kahane

Microsoft made a big announcement today about having a new policy of promoting interoperability with its major products, citing modern needs, etc. If you ask me, the need for interoperability today is not much greater than it was a few years ago and this policy shift is way overdue. Along with the announcement which made for an amusing assortment of corporate-speak, Microsoft made the operative move of immediately publishing dozens of network protocol specifications on the MSDN Library. Their index can be found here. Apparently, documentation for things other than protocols (i.e., APIs) is forthcoming.

Having spent a few minutes going over some of these specifications, I have several observations to make:

  • Many of these specifications have been updated multiple times during the past year or so. Unlike Microsoft’s forgotten Internet Draft for the DCOM protocol from the late 1990s, finally we see up-to-date specifications for a change. I hope with the wide availability at a high-profile location like the MSDN Library, these contemporary specs will keep getting the love they need and could be relied upon to reflect the current Microsoft implementations.
  • Nearly every network service included with Microsoft Windows appears to be documented.
  • The detailed specifications are a gold mine to anyone seeking an under-the-hood glimpse of the internals of Microsoft’s network services. I was particularly thrilled, as can be expected, to encounter up to date descriptions of the extensions Microsoft made to the DCE RPC protocol, the DCOM network protocol and even how COM+ (MSDTC) implements network transactions over the prior.
  • The specifications are coherent in the sense that each makes appropriate references to related protocols. e.g., the COM+ specification references the DCOM specification, which references the RPC extensions specification. Even third-party references are made, e.g. to the “Open” Group’s DCE RPC 1.1 specification. (Open in quotes since, ironically, while I could readily download a protocol specification PDF from Microsoft’s MSDN with no intrusion, the so-called “Open” Group required compulsory registration for the free download, which seems to have nothing in for me except the prospect of future spam…)
  • I did not tolerate enough of the corporate speak in the press release to understand the legal status of the document release, but I hope it such that will allow popular open source diagnostic tools such as Wireshark to provide detailed, complete and accurate diagnostic information about these protocols.
  • The specifications tend to read more “official” than “practical.” In other words, they are more like an ISO standard than an IETF RFC. There’s hardly introductory text describing the protocols in context but rather really long glossaries you have to skim over to “get to the good stuff.” While raw technical descriptions are important, one has to question Microsoft’s true commitment to the promotion of interoperability given this state of affairs. Perhaps with their now altered target audience, we shall see improvements in this department in the not so distant future?
  • Some network protocols (Exchange, SQL Server) are not yet available, but are scheduled to be released sooner rather than later. In particular, I consider the publishing of the Exchange protocols as crucial to the promotion of interoperability in the groupware realm.

So what’s your favorite Microsoft network protocol? :-)

Breaking when a function returns a specific value without depending on its call site

October 29, 2007 by Koby Kahane

Jigar Mehta writes today about how to set a breakpoint in the debugger on a function and check its return value using conditionals, in another post in his fine blog, that can teach even a seasoned Windbg user a thing or two (personally I didn’t know of the “-psn” switch that allows easy attachment to a service process until reading about there).

However, Jigar’s approach requires to determine the call site of the interesting function and breaking after its return for examining the return value. This may be tolerable if you are only interested in what the function returns when invoked from a specific location, but for more general scenarios can be insufficient.

An alternative approach is to use the “gu” debugger command (go up to after the call site) and examine the return value register and potentially other output parameters at that state. For example:

0:000> bp kernel32!CreateFileW "gu; j @eax == -1 '.echo CreateFileW failed; gc' '.printf \"CreateFileW returned %p\", @$retreg; .echo; gc'"
0:000> g
ModLoad: 5cb70000 5cb96000 C:\WINDOWS\system32\ShimEng.dll

ModLoad: 74d90000 74dfb000 C:\WINDOWS\system32\USP10.dll
CreateFileW returned 000007c8
ModLoad: 74720000 7476b000 C:\WINDOWS\system32\MSCTF.dll
CreateFileW returned 00000774
CreateFileW returned 00000774
ModLoad: 755c0000 755ee000 C:\WINDOWS\system32\msctfime.ime
ModLoad: 605d0000 605d9000 C:\WINDOWS\system32\mslbui.dll

I get bonus points for using the portable @$retreg over @eax. I lose some for not figuring out how the heck to get .printf to write a newline without a trailing .echo (both \n and \r\n don’t seem to do the trick here). Of course if what you have in mind is to actually break at the breakpoint, do not append “gc” (go from conditional breakpoint) at the end of the branch of interest.

The case of the Windows Desktop Search CPU hog

September 28, 2007 by Koby Kahane

My home computer is set up (using MBM 5) to sound off an alarm with the PC Speaker whenever CPU temperature exceeds 70 degrees Celsius. It is not that I overclock it, not at all, but rather because of the embarrassingly poor fan I got with the box (which I didn’t purchase personally) combined with the Pentium 4’s well-known poor thermal characteristics.

The alarm is quite audible and annoying. Fortunately, it is pretty unusual for the CPU to get that hot. I’ve noticed that it does this after more than a few minutes of running at near 100% CPU usage.

Earlier today, I could hear the annoying alarm from across the house. I sat down at the computer and quickly silenced it. Since this isn’t the first time I’ve heard the alarm, I already knew what I was looking for: a CPU hog. More often than not, the culprit is vmware-vmx.exe, VMware Workstation’s virtual machine process. If you run an MS-DOS VM without DOSIDLE or the like, the guest OS hogs the CPU and the blame is placed on vmware-vmx.exe. This can also happen with a modern guest OS like Windows in case the guest is misbehaving for some reason. However, this time, I didn’t remember leaving any VM running on the machine, and indeed, none was to be found.

I opened the Task Manager to examine the situation and after a few moments determined that cidaemon.exe was at 49% CPU usage. This actually means 100% on my system which has a hyper-threaded, single core Pentium 4.

My first reaction was that it’s probably just a random case of a CPU hog since I’ve never experienced it on my system before. Without asking too many questions, I killed the cidaemon.exe process and watched as CPU utilization plummeted to the healthy single-digits.

I got up from the computer and addressed other concerns (washed my car). It was about an hour later that I came back and heard the hectic CPU temperature alarm once again. I took another look at the process list and found out cidaemon.exe has been resurrected, and had returned to avenge its predecessor’s demise with another round of CPU hogging.

At this point I understood there is no recourse but to conduct a deeper investigation. I googled for cidaemon.exe being a CPU hog and pretty much only found the “solution” of disabling Windows Desktop Search. Yeah, well, not so fast. I fired up the debugger and attached to the CPU hogging cidaemon.exe process (there was another instance of cidaemon.exe which was being benign).

The first thing I did is get some basic information on where I found myself:

0:006> lmv m cidaemon
start end module name
01000000 01005000 cidaemon (pdb symbols) C:\WINDOWS\Symbols\exe\cidaemon.pdb
Loaded symbol image file: C:\WINDOWS\SYSTEM32\cidaemon.exe
Image path: C:\WINDOWS\SYSTEM32\cidaemon.exe
Image name: cidaemon.exe
Timestamp: Fri Aug 17 23:56:12 2001 (3B7D84EC)
CheckSum: 00007613
ImageSize: 00005000
File version: 5.1.2600.0
Product version: 5.1.2600.0
File flags: 0 (Mask 3F)
File OS: 40004 NT Win32
File type: 1.0 App
File date: 00000000.00000000
Translations: 0409.04b0
CompanyName: Microsoft Corporation
ProductName: Microsoft® Windows® Operating System
InternalName: cidaemon.exe
OriginalFilename: cidaemon.exe
ProductVersion: 5.1.2600.0
FileVersion: 5.1.2600.0 (xpclient.010817-114 8)
FileDescription: Indexing Service filter daemon
LegalCopyright: © Microsoft Corporation. All rights reserved.

OK, so cidaemon.exe is the indexing service’s filter daemon. An educated guess would be that the name is actually short for “content indexing daemon.” I knew that the indexing service is a part of the core Windows XP operating system (as is apparent from the file version of cidaemon.exe above) and not an additional service installed by the separate Windows Desktop Search package I have on my system.

The next thing I did was to see how many threads I am dealing with in this demonic daemon process:

0:003> ~
0 Id: afc.af4 Suspend: 1 Teb: 7ffdf000 Unfrozen
1 Id: afc.a5c Suspend: 1 Teb: 7ffde000 Unfrozen
2 Id: afc.858 Suspend: 1 Teb: 7ffdc000 Unfrozen
. 3 Id: afc.13b4 Suspend: 1 Teb: 7ffdb000 Unfrozen
4 Id: afc.1190 Suspend: 1 Teb: 7ffda000 Unfrozen
6 Id: afc.10d8 Suspend: 1 Teb: 7ffd8000 Unfrozen

OK, looks like I’ve got quite a few. Only the CPU hogging thread is of interest, so let’s check the CPU times, let the process resume for a while, and examine the times once again:

0:003> !runaway
User Mode Time
Thread Time
6:10d8 0 days 0:00:53.515
0:af4 0 days 0:00:00.140
4:1190 0 days 0:00:00.078
3:13b4 0 days 0:00:00.000
2:858 0 days 0:00:00.000
1:a5c 0 days 0:00:00.000
0:003> g
(afc.ca8): Break instruction exception - code 80000003 (first chance)
eax=7ffdd000 ebx=00000001 ecx=00000002 edx=00000003 esi=00000004 edi=00000005
eip=7c901230 esp=0078ffcc ebp=0078fff4 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00000246
ntdll!DbgBreakPoint:
7c901230 cc int 3
0:005> !runaway
User Mode Time
Thread Time
6:10d8 0 days 0:01:04.859
0:af4 0 days 0:00:00.140
4:1190 0 days 0:00:00.078
5:ca8 0 days 0:00:00.000
3:14b8 0 days 0:00:00.000
2:858 0 days 0:00:00.000
1:a5c 0 days 0:00:00.000

Well, no hard choice here. The other threads list the same CPU time and thread 6 is obviously the culprit. I examine the thread’s stack trace:

0:005> ~6 s
eax=00000103 ebx=00000aef ecx=00000103 edx=000043c3 esi=000043c0 edi=000000ef
eip=74da9a58 esp=00bdc2ac ebp=00bdc370 iopl=0 nv up ei ng nz ac po cy
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000293
USP10!ScriptTokenize+0×97:
74da9a58 663b4508 cmp ax,word ptr [ebp+8] ss:0023:00bdc378=0103
0:006> k
ChildEBP RetAddr
00bdc370 74da2d44 USP10!ScriptTokenize+0×97
00bdc398 74e85840 USP10!ScriptItemize+0×42
00bdc3d0 74e84a13 riched20!CUniscribe::ItemizeString+0×8a
00bdc644 74e5f269 riched20!CTxtBreaker::OnPostReplaceRange+0×200
00bdc664 74e6b436 riched20!CNotifyMgr::NotifyPostReplaceRange+0×25
00bdc6ec 74e65825 riched20!CRchTxtPtr::ReplaceRange+0×42c
00bdc750 74e65643 riched20!CTxtRange::ReplaceRange+0×127
00bdc7b8 74e66b3e riched20!CTxtRange::CheckLimitReplaceRange+0×179
00bdca04 74e5b32b riched20!CTxtRange::CleanseAndReplaceRange+0×833
00bdfa78 74e5b8b2 riched20!CLightDTEngine::ReadPlainText+0×2c7
00bdfd30 74e7ce5d riched20!CLightDTEngine::LoadFromEs+0×287
00bdfeb8 74e7e87b riched20!CTxtEdit::TxSendMessage+0×11c9
00bdff1c 00852b90 riched20!CTxtEdit::Open+0×244
WARNING: Stack unwind information not available. Following frames may be wrong.
00bdff44 00853e4e rtffilt!DllCanUnloadNow+0xaff
00bdff74 00853eb8 rtffilt!DllRegisterServer+0×1af
00bdffac 00853f11 rtffilt!DllRegisterServer+0×219
00bdffb4 7c80b683 rtffilt!DllRegisterServer+0×272
00bdffec 00000000 kernel32!BaseThreadStart+0×37

I was quite surprised and disappointed by this stack trace. As is apparent, no symbolic information is available for rtffilt.dll. Since my Windbg is configured to use the Microsoft web symbol store, the PDB should have been downloaded automatically. I pondered, perhaps a third party product has installed this filter into the indexing service? I examined who is to blame for this sorry state of affairs:

0:006> lmv m rtffilt
start end module name
00850000 0085a000 rtffilt (export symbols) C:\WINDOWS\system32\rtffilt.dll
Loaded symbol image file: C:\WINDOWS\system32\rtffilt.dll
Image path: C:\WINDOWS\system32\rtffilt.dll
Image name: rtffilt.dll
Timestamp: Tue Feb 06 01:36:06 2007 (45C7BF66)
CheckSum: 00014877
ImageSize: 0000A000
File version: 2006.0.6000.16431
Product version: 2006.0.6000.16431
File flags: 8 (Mask 3F) Private
File OS: 40004 NT Win32
File type: 2.0 Dll
File date: 00000000.00000000
Translations: 0409.04b0
CompanyName: Microsoft Corporation
ProductName: Microsoft® Windows® Operating System
InternalName: rtffilt.lib
OriginalFilename: rtffilt.lib
ProductVersion: 2006.0.6000.16431
FileVersion: 2006.0.6000.16431 (vista_gdr(wmbla).070205-1422)
FileDescription: RTF Filter
LegalCopyright: © Microsoft Corporation. All rights reserved.

Oy vey. The RTF filter is actually a Microsoft DLL. Since this is an XP machine and the DLL version is “Vista-ish”, I quickly concluded that the DLL belongs to Windows Desktop Search 3.01, which aligns the desktop search engine on XP with Vista’s built-in search engine. The only other thing with “Vista-ish” versions on my system is Windows Script Host 5.7, which similarly, aligns WSH functionality with the Vista version. I figured the RTF filter is unrelated to that product.

It appears that whenever Microsoft releases an add-on for Windows like Windows Desktop Search, releasing debugging symbols to the symbol store is not a foregone conclusion, even if the original version of the component has its symbols already published by virtue of being a part of the Windows Vista operating system. It is obvious that the absence of the symbols is not a misguided attempt at protecting intellectual property, since they are available anyway for their Vista counterparts. For some reason, not all groups inside Microsoft have uploading public symbols to the symbol store as an integral part of their release process. When mishaps like this happen, their customers are left out in the cold to diagnose the issue on their own. Sure, in this case it was on my home computer, but it could have easily been an enterprise production debugging scenario. Missing symbols are unacceptable, Microsoft. Someone should get their ass fired.

Although taken aback, I did not give up. Since I did have symbols for the upper part of the stack trace, I could still have some impression of what was going on. I figured I’d let the filter run for a while and examine the stack again:

0:006> g
(afc.fbc): Break instruction exception - code 80000003 (first chance)
eax=7ffdd000 ebx=00000001 ecx=00000002 edx=00000003 esi=00000004 edi=00000005
eip=7c901230 esp=0078ffcc ebp=0078fff4 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00000246
ntdll!DbgBreakPoint:
7c901230 cc int 3
0:005> ~6 k
ChildEBP RetAddr
00bdc3e8 74e848d6 riched20!CBreakArray::GetBreak+0×2a
00bdc644 74e5f269 riched20!CTxtBreaker::OnPostReplaceRange+0xc3
00bdc664 74e6b436 riched20!CNotifyMgr::NotifyPostReplaceRange+0×25
00bdc6ec 74e65825 riched20!CRchTxtPtr::ReplaceRange+0×42c
00bdc750 74e65643 riched20!CTxtRange::ReplaceRange+0×127
00bdc7b8 74e66b3e riched20!CTxtRange::CheckLimitReplaceRange+0×179
00bdca04 74e5b32b riched20!CTxtRange::CleanseAndReplaceRange+0×833
00bdfa78 74e5b8b2 riched20!CLightDTEngine::ReadPlainText+0×2c7
00bdfd30 74e7ce5d riched20!CLightDTEngine::LoadFromEs+0×287
00bdfeb8 74e7e87b riched20!CTxtEdit::TxSendMessage+0×11c9
00bdff1c 00852b90 riched20!CTxtEdit::Open+0×244
WARNING: Stack unwind information not available. Following frames may be wrong.
00bdff44 00853e4e rtffilt!DllCanUnloadNow+0xaff
00bdff74 00853eb8 rtffilt!DllRegisterServer+0×1af
00bdffac 00853f11 rtffilt!DllRegisterServer+0×219
00bdffb4 7c80b683 rtffilt!DllRegisterServer+0×272
00bdffec 00000000 kernel32!BaseThreadStart+0×37

So it looks like we went from stack bottom doing some Uniscribe work up to actual work on the RTF by the implementation in the Rich Edit library. We do not appear to be stuck in an infinite loop in Uniscribe, at least. Letting the daemon run a few more times shows that some movement between Rich Edit 2.0 functions is occurring, but it is unclear what’s making it work so hard.

A corrupt RTF file is an obvious suspect as the cause of the problem. Let’s see what the filter is working on. I was hoping to use the “!handle” extension to find out about the file handle used by the cidaemon process, but disappointingly, I could not get the extension to show the file name for a given file handle. I opted for Sysinternals’ Handle tool, which I remembered had this functionality. I reviewed cidaemon.exe’s open handles:

C:\Program Files\SysInternals>handle -p cidaemon.exe
Handle v3.2
Copyright (C) 1997-2006 Mark Russinovich
Sysinternals - www.sysinternals.com
——————————————————————————
cidaemon.exe pid: 5016 NT AUTHORITY\SYSTEM
C: File (RW-) C:\WINDOWS\system32
40: File (RW-) C:\WINDOWS\WinSxS\x86_Microsoft.Windows.Common-Controls_659
5b64144ccf1df_6.0.2600.2982_x-ww_ac3f9c03
68: Section \BaseNamedObjects\c::inetpub:catalog.wci__cisharedmem
F8: Section \BaseNamedObjects\__CiPerfMonMemory
——————————————————————————
cidaemon.exe pid: 2812 NT AUTHORITY\SYSTEM
C: File (RW-) C:\WINDOWS\system32
40: File (RW-) C:\WINDOWS\WinSxS\x86_Microsoft.Windows.Common-Controls_659
5b64144ccf1df_6.0.2600.2982_x-ww_ac3f9c03
68: Section \BaseNamedObjects\c::system volume information:catalog.wci_
_cisharedmem
F8: Section \BaseNamedObjects\__CiPerfMonMemory
14C: File (—) C:\Program Files\Nero\Nero 7\Core\CDI\CDI_IMAG.RTF
1D4: Section \BaseNamedObjects\ShimSharedMemory
20C: File (RW-) C:\WINDOWS\WinSxS\x86_Microsoft.Windows.GdiPlus_6595b64144c
cf1df_1.0.2600.2180_x-ww_522f9f82
264: File (RW-) C:\WINDOWS\WinSxS\x86_Microsoft.Windows.Common-Controls_659
5b64144ccf1df_6.0.2600.2982_x-ww_ac3f9c03
2A0: File (RW-) C:\Program Files\Nero\Nero 7\Core\CDI\CDI_IMAG.RTF

The offending cidaemon.exe has two open handles to an RTF file and in fact they are both for the same file. This file, CDI_IMAG.RTF, is a part of the Nero Burning ROM CD/DVD software installed on my system. This must be the file causing the RTF filter’s grief.

Attempting to confirm this is the case (although hardly anything would be able to persuade me otherwise at this point), I examine the thread’s stack yet again, this time also examining arguments:

0:006> kb
ChildEBP RetAddr Args to Child
00bdc3a8 74e795bb 00012414 00000100 00000000 riched20!CTxtPtr::FindOrSkipWhiteSpaces+0xb5
00bdc3dc 74e84889 00000001 00000101 00000100 riched20!CTxtPtr::FindWhiteSpaceBound+0xcc
00bdc644 74e5f269 00012414 00000000 00000001 riched20!CTxtBreaker::OnPostReplaceRange+0×76
00bdc664 74e6b436 00bdfe50 00012414 00000000 riched20!CNotifyMgr::NotifyPostReplaceRange+0×25
00bdc6ec 74e65825 00000001 00000001 00bdd2e8 riched20!CRchTxtPtr::ReplaceRange+0×42c
00bdc750 74e65643 00000001 00bdd2e8 00000000 riched20!CTxtRange::ReplaceRange+0×127
00bdc7b8 74e66b3e 00000001 00bdd2e8 000000ee riched20!CTxtRange::CheckLimitReplaceRange+0×179
00bdca04 74e5b32b 000003c6 00bdca78 00000000 riched20!CTxtRange::CleanseAndReplaceRange+0×833
00bdfa78 74e5b8b2 00bdfe4c 00bdfee4 00000000 riched20!CLightDTEngine::ReadPlainText+0×2c7
00bdfd30 74e7ce5d 00bdfe4c 00001011 00bdfee4 riched20!CLightDTEngine::LoadFromEs+0×287
00bdfeb8 74e7e87b 00000449 00001011 00bdfee4 riched20!CTxtEdit::TxSendMessage+0×11c9
00bdff1c 00852b90 000ba5b0 00bdff34 00000130 riched20!CTxtEdit::Open+0×244
WARNING: Stack unwind information not available. Following frames may be wrong.
00bdff44 00853e4e 00bdff64 b7ee486a 7ffdd000 rtffilt!DllCanUnloadNow+0xaff
00bdff74 00853eb8 b7ee48b2 7ffdd000 7ffdf000 rtffilt!DllRegisterServer+0×1af
00bdffac 00853f11 00bdffec 7c80b683 0007e384 rtffilt!DllRegisterServer+0×219
00bdffb4 7c80b683 0007e384 7ffdd000 7ffdf000 rtffilt!DllRegisterServer+0×272
00bdffec 00000000 00853f04 0007e384 00000000 kernel32!BaseThreadStart+0×37

Neither file handle 0×14c nor file handle 0×2a0 is immediately visible in this stack trace. Presumably the Rich Edit 2.0 library is dealing with the RTF stream in memory. Once again I am bitten by the absence of the RTF filter’s symbol and FPO information. If the stack trace had not been black-holed in the middle, perhaps associating the ongoing processing with an handle would have been easier. I realize in this case only one RTF file is open in the offending process, but I was still interested in how one could associate this operation with a specific file if the filter had been written to process many RTF files in bulk, simultaneously.

While we can’t readily see the handle’s association with the current Rich Edit processing, we can at least find old, nested function calls on the stack that used it. We can search the stack for the 0×2a0 handle, for example (little endian byte order):

0:006> s @esp L1000 a0 02
00bdc994 a0 02 00 00 00 00 00 00-00 00 00 00 00 00 00 00 …………….

Let’s have a look at the stack in the vicinity of the 0×2a0 value we found to determine whether this is a false positive or actual use of the handle:
0:006> dps 0×00bdc994-20
00bdc974 74e64eab riched20!CTxtRange::Update_iFormat+0×15c
00bdc978 00000001
00bdc97c 00000000
00bdc980 00000000
00bdc984 00bdfe4c
00bdc988 00000000
00bdc98c 7c90e288 ntdll!NtReadFile+0xc
00bdc990 7c801875 kernel32!ReadFile+0×16c
00bdc994 000002a0
00bdc998 00000000
00bdc99c 00000000
00bdc9a0 00000000
00bdc9a4 00bdc9cc
00bdc9a8 00bdca78
00bdc9ac 00000ffe
00bdc9b0 00000000
00bdc9b4 7c801898 kernel32!ReadFile+0×1ca
00bdc9b8 00bdfe4c
00bdc9bc 00bdfee4
00bdc9c0 00000000
00bdc9c4 00000001
00bdc9c8 00bdfee4
00bdc9cc 00000000
00bdc9d0 00000ffe
00bdc9d4 00000000
00bdc9d8 00000000
00bdc9dc 00bdc9b8
00bdc9e0 00000000
00bdc9e4 00bdff68
00bdc9e8 7c839aa8 kernel32!_except_handler3
00bdc9ec 7c809ba8 kernel32!`string’+0xc
00bdc9f0 ffffffff

We can see the remains of a call to the ReadFile API with a file handle to the suspected RTF file on the offending thread’s stack. Had this process had more threads processing RTF files, we’d be able to make the connection between the CPU hogging thread and the problematic RTF file this way. If the process was using a thread pool and each thread was being recycled to process an RTF file after another, results could have still been inconclusive, however.

For those without Nero 7 on their systems who are wondering what’s the deal with that RTF file, it is, in fact, not a rich text format file at all. Ahead made the mistake of calling a non-RTF file with the RTF file extension, persuading the Indexing Service’s RTF filter to attempt to index it. Given all the talk from Microsoft these days on their Security Development Lifecycle, Threat Modeling and all the fuzzing they claim to be doing, it is curious that relatively modern code like the search engine’s RTF filter would happily go into a CPU hogging frenzy on an RTF file that doesn’t even bother with an RTF opening tag. Some basic sanity validation is amiss here.

I looked it up out of curiosity and apparently CDI_IMAG.RTF has something to do with the now very defunct CD-i technology, which White Book classic VCDs are required to be compatible with. I was surprised there’s actually something older than those Video CDs…

For the time being, I renamed the file so its extension won’t be RTF, to prevent the RTF filter from embarrassing itself and killing my box while doing so. Hopefully Nero won’t mind as long as I stick to 21st century technology.

Another happy ending.

A JScript interactive interpreter shell for the Windows Script Host

September 14, 2007 by Koby Kahane

A few weeks ago a friend of mine who was starting to code WSH scripts asked me if there was an interactive shell for it. I told him I didn’t know of one, but it got me thinking. Python users are well aware of its interactive shell, and indeed, take it for granted. The ability to execute statements immediately, one line at a time, is pretty fundamental. Yet, Windows Script Host offers no such built-in functionality.

As a new WSH script writer, I would often find myself in a cycle of Edit, Save, Run, quite similar to the Edit, Save, Compile cycle of native code, with small snippets or even one-liners. As my scripts exceeded a certain threshold of complexity, I would find myself using the script debugger. Unfortunately, since it provides a read-only view of the debugged script, when I wanted to make my changes and test them, I would have to switch to the editor window, make my changes and start the session all over again.

I recall one of my first programming experiences, around the age of 9. I was toying with my brother’s old Atari 800XL, initially in BASIC. The machine had a BASIC interpreter built-in to its ROM and had a measly 64K of memory. When it was turned on, a friendly “READY” banner written in white over a blue background greeted you to the BASIC interpreter. The ability to interpret statements for rapid modeling was considered so fundamental it was this, not a disk operating system, that was the core of the machine.

Fast forward back to the present. My friend’s question had me searching for a solution. I did not find an interpreter targeting WSH, but I did find a variety of JavaScript shells for the web browser, like this one. These are good candidates for brushing up on the HTML DOM, but are less useful to those using WSH. For instance, attempting to model automation controllers quickly brings you into the realm of warnings and denials from the browser’s security apparatus. In the case of the specific shell in question, its approach of having the user use Shift-Enter for multi-line entry was inconvenient, since you had to keep doing so until your code block was complete.

JScript lends itself well for implementing a self-hosted shell through the “eval” keyword. As I was examining the input mechanisms available to a command-line WSH script, I saw that the WScript.StdIn object was a TextStream, only supporting newline-terminated input. This means I could not implement the same Shift-Enter based approach for multi-line input used by the browser hosted shell mentioned above.

During my search, I also found two JavaScript shells that are not browser-based but do not target the Windows Scripting Host. One was a part of Spidermonkey, which is Mozilla’s classic JavaScript implementation (which is set to be retired and replaced by the JIT-based Tamarin, the open source version of Adobe Flash’s ActionScript VM, in future versions of Firefox). The other was a part of Rhino, an implementation of JavaScript in Java.

I examined their source to determine what was their approach to multi-line input. It appeared that both the Spidermonkey and the Rhino shells used the underlying script language implementation’s functionality for determining whether a given string is a “compilable entity.” They would keep on reading lines until that condition was met.

Unfortunately, it did not seem like I could adopt a similar approach. Calling “eval” repeatedly until successful is problematic. Even if I were to implement the shell in native code using the Active Scripting hosting interfaces instead, it did not appear as though IActiveScript or the related interfaces provided a similar “compile testing” method.

Defeated, I opted for a simple approach where a blank line initiates multi-line input and two consecutive blank lines terminate it.

Pardon me for the coarse, unpolished illustration code:

function hex(n) {
    if (n >= 0) {
        return n.toString(16);
    } else {
        n += 0x100000000;
        return n.toString(16);
    }
}
var scriptText;
var previousLine;
var line;
var result;
while(true) {
    WScript.StdOut.Write("jscript> ");
    if (WScript.StdIn.AtEndOfStream) {
        WScript.Echo("Bye.");
        break;
    }
    line = WScript.StdIn.ReadLine();
    scriptText = line + "n";
    if (line === "") {
        WScript.Echo(
            "Enter two consecutive blank lines to terminate multi-line input.");
        do {
            if (WScript.StdIn.AtEndOfStream) {
                break;
            }
            previousLine = line;
            line = WScript.StdIn.ReadLine();
            line += "n";
            scriptText += line;
        } while(previousLine != "n" || line != "n");
    }
    try {
        result = eval(scriptText);
    } catch (error) {
        WScript.Echo("0x" + hex(error.number) + " " + error.name + ": " +
            error.message);
    }
    if (result) {
        try {
            WScript.Echo(result);
        } catch (error) {
            WScript.Echo("<<<unprintable>>>");
        }
    }
    result = null;
}

This is simple enough and is quite useful for the majority of cases. It does have its disadvantages, however. Notably, the surrounding code of the shell is leaked into the namespace accessible by the interpreted snippets. For example, typing “hex” exposes the error code conversion function. However, for my needs, I found this quite satisfactory.

If anyone can offer an improved implementation, I’d be happy to see it in the comments.

Save this code to a file, like shell.js, and use “cscript shell.js” to start it. Multi-line input is performed as described above. Ctrl-Z can be used to quit.

A nice stunt you can pull with this is wrap the shell in a .WSF referencing your favorite type libraries. For example, consider this shell.wsf:

<job>
    <reference object="Scripting.FileSystemObject" />
    <script language="JScript" src="shell.js" />
</job>

If you start a shell with “cscript shell.wsf”, the shell instance will have access to type library constants like “ForReading”, “ForAppending” and so forth.

Although I’m not much of a VBScript fan, I considered doing something similar for it, since it could be quite handy for testing those pesky automation objects that take SAFEARRAYs and are thus not that JScript friendly. However, VBScript’s distinction between expressions and statements (and its Eval function vs. the Execute & ExecuteGlobal keywords) make such a thing a bit more complicated. It is also not clear whether the interpreter should opt for executing statements using Execute or ExecuteGlobal, and in what cases. If anyone is up for implementing this, I’d love to see it.

Have fun.

A lightweight approach for exposing C++ objects to a hosted Active Scripting engine

September 13, 2007 by Koby Kahane

Microsoft’s Active Scripting architecture allows application developers to host the same implementations of the JScript and VBScript scripting languages used by Internet Explorer for scripts in HTML pages, Active Server Pages (the old, pre-.NET implementation) for server-side dynamic content or the Windows Scripting Host for independent scripts. Additionally, third party scripting engines can and have been developed, for Python, Perl and other interpreted languages.

Hosting a scripting engine involves implementing the IActiveScriptSite interface, providing a method to pass script code to the IActiveScript and IActiveScriptParse interfaces and is extensively documented in the literature. Therefore, I shall not discuss the mechanics of hosting itself and will elaborate only on the topic of exposing objects from the host to the engine.

Enabling scripting in your application only adds value over the external Windows Script Host if you expose unique, internal application functionality to the hosted scripts. If your application already exposes its functionality as COM automation objects to automation controllers that can be used out-of-process, there isn’t much point in hosting. However, if your application is document-oriented, for example, providing scripts with access to the document context can be very useful to your users.

A scripting host can make its object model available to hosted scripts by providing the engine with an IDispatch interface for each object it wants to make available. This interface is the foundation of OLE automation and is used by the scripting languages for late binding.

Since the IDispatch interface is basically a rather raw reflection mechanism, implementing it from scratch for a moderately complex object is tedious and error-prone.

If your application already implements COM objects regardless of scripting, it probably already makes use of a framework for doing so, be it ATL, MFC or the CLR. In that case, you have already paid the framework tax and implementing another interface is no challenge. Specifically, ATL offers the convenient IDispatchImpl class for implementing dual interfaces while the CLR makes it ridiculously simple to implement dispatch interfaces (by default, a .NET class is also a COM dispatch object).

However, a dependency on the CLR might not be a welcome requirement. Similarly, complicating a substantial existing code base with the tedium of COM class registration is an adventure that may not be suitable for the faint of heart. If you do not wish to expose automation objects to external clients like WSH, you have no need or desire to modify the registry and maintain that information across installations, uninstallations, upgrades and the like.

However, both ATL and MFC do not go to any reasonable lengths to facilitate the implementation of internal, unregistered COM objects. The IDispatchImpl class requires that you provide it with type information for your dual interface, but ATL’s only ITypeInfo wrapper, CComTypeInfoHolder, is oriented towards retrieving that from a type library residing in a file, either an independent .TLB or an embedded resource in your .EXE or .DLL file. This means that for exposing an object, you need to describe it in IDL, have your build process generate a .TLB for it with MIDL and possibly embed it as a resource using RC. At run time, you need to take care of the logistics of interface and type library registration. All of this for what you only want as internal functionality.

Apart from being tedious, that approach is also characterized by being rigid and static. Manipulating your exposed objects by making runtime decisions that could change the type information does not go well with them being static embedded resource entities.

I considered what would it to take to come up with binary type information from a source that isn’t a file or a resource. At first glance, the LoadTypeLib API is definitely file-oriented. However, a light bulb turned on in my head when I noticed that if the file name given does not exist, the string is treated as a moniker. I was hoping I could generate binary type information in .TLB format from IDL, store it in a flexible manner and provide LoadTypeLib with a moniker to the type information. I then paused as I realized there was an unanswered question - “a moniker to what?”. As is not uncommon in Microsoft’s documentation, elaboration on this point was scarce. I later found this newsgroup post on the matter. The original poster had the same question as mine and the reply pointed me in the right direction.

Although the responder was incorrect in assuming the pointer moniker implementation actually implemented IMoniker::GetDisplayName, a deficiency for which I can find no excuse, the OBJREF moniker provides a suitable alternative. The OBJREF moniker is a superset of the pointer moniker that supports out-of-process references, although no such functionality is required by me for this purpose, just getting a display name to feed LoadTypeLib.

I promptly implemented a skeleton IUnknown that would simply print what interface was requested on every call to QueryInterface and then return E_NOINTERFACE. I created an OBJREF moniker for this IUnknown implementation and supplied LoadTypeLib with the moniker’s display name. I figured this way, I would figure out what LoadTypeLib is expecting the supplied object to implement as an alternative to being given a file name.

I was disappointed when I saw what happened next - LoadTypeLib was asking my object for an ITypeLib implementation, and nothing else. This basically means that LoadTypeLib’s moniker support is completely useless - it returns an ITypeLib for an ITypeLib you already have.

My next attempt to tap into the existing binary type information parser involved writing a test program that called LoadTypeLib on a .TLB file for the purpose of finding if it loaded the information to memory and then promptly used intermediate functionality on the in-memory data that was also accessible to me. I examined the type library loader’s high level flow using Windbg:
0:000> bp oleaut32!LoadTypeLib
0:000> g
Breakpoint 2 hit
eax=0012ff00 ebx=7ffda000 ecx=81818d85 edx=10313d00 esi=0012fdc8 edi=0012ff5c
eip=771279e5 esp=0012fdbc ebp=0012ff68 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
OLEAUT32!LoadTypeLib:
771279e5 8bff mov edi,edi
0:000> wt -m oleaut32 -l 2 -ns
Tracing OLEAUT32!LoadTypeLib to return address 004117d0
7 0 [ 0] OLEAUT32!LoadTypeLib
43 0 [ 1] OLEAUT32!LoadTypeLibEx
16 0 [ 2] OLEAUT32!InitLoadInfo
44 16 [ 1] OLEAUT32!LoadTypeLibEx
65 0 [ 2] OLEAUT32!InitAppData
51 81 [ 1] OLEAUT32!LoadTypeLibEx
9 0 [ 2] OLEAUT32!LHashValOfNameSys
58 90 [ 1] OLEAUT32!LoadTypeLibEx
25 0 [ 2] OLEAUT32!OLE_TYPEMGR::LookupTypeLib
66 115 [ 1] OLEAUT32!LoadTypeLibEx
46 0 [ 2] OLEAUT32!FindTypeLib
72 161 [ 1] OLEAUT32!LoadTypeLibEx
25 0 [ 2] OLEAUT32!OLE_TYPEMGR::LookupTypeLib
89 186 [ 1] OLEAUT32!LoadTypeLibEx
185 0 [ 2] OLEAUT32!GetOffsetOfResource
101 371 [ 1] OLEAUT32!LoadTypeLibEx
79 0 [ 2] OLEAUT32!CreateFileLockBytesOnHFILE
117 450 [ 1] OLEAUT32!LoadTypeLibEx
23 0 [ 2] OLEAUT32!LoadTypeLib2LockBytes
126 473 [ 1] OLEAUT32!LoadTypeLibEx
17 0 [ 2] OLEAUT32!FileLockBytesMemory::Release
133 490 [ 1] OLEAUT32!LoadTypeLibEx
156 0 [ 2] OLEAUT32!OLE_TYPEMGR::TypeLibLoaded
145 646 [ 1] OLEAUT32!LoadTypeLibEx
15 0 [ 2] OLEAUT32!UninitLoadInfo
155 661 [ 1] OLEAUT32!LoadTypeLibEx
5 0 [ 2] OLEAUT32!__security_check_cookie
157 666 [ 1] OLEAUT32!LoadTypeLibEx
9 823 [ 0] OLEAUT32!LoadTypeLib

It was clear from the trace that LoadTypeLib created an ILockBytes over the .TLB file and promptly provided it to LoadTypeLib2LockBytes. Unfortunately, neither this internal function nor any other leading to its functionality is exported from the OLE automation library. The binary type information parser is not accessible externally for in-memory data. What was missing is that LoadTypeLib did not attempt to QueryInterface for ILockBytes when given a moniker, if ITypeLib is not implemented by the object directly. This approach, therefore, had to be scrapped.

I was hoping I could use MIDL to generate binary type information for me and the notion of implementing ITypeLib completely on my own for in-memory representation seemed like a daunting task. If this is the trade-off, surely reverting to ATL and dealing with the evils of registration would be the better approach?

Not so fast. It turns out there is another approach for getting the type information you need for exposing your C++ object, without generating a full-fledged type library or implementing your own type information provider. The marvelous CreateDispTypeInfo API. You provide it with a INTERFACEDATA structure describing your object and get the type information you need. Combined with CreateStdDispatch, it becomes easy to expose simple objects to automation.

Reviewing the sample included in the MSDN documentation of CreateDispTypeInfo is indicative of the sorry state of affairs in Microsoft’s documentation group, seeing as it is quite incomplete and makes use of macros like METHOD0, METHOD1 and PROPERTY, which are nowhere to be found and must have existed in whatever project the sample code has been copy-pasted from. Detailed discussion of the function’s usage is scarce, but existent, on the Web, primarily in newsgroups. Allow me to illustrate with an example. Consider the following hypothetical C++ class one wishes to expose to scripting:

class MyObject
{
public:
virtual void __stdcall f(int i);
virtual BOOL __stdcall g(float f);
};

As is evident, this class is pretty plain and certainly has nothing to do with COM. Just the sort of class your existing application with no use of COM might have. To expose it, we need to fill some descriptor structures so type information can be generated for it. We add a few static members:
class MyObject
{
public:
virtual void __stdcall f(int i);
virtual BOOL __stdcall g(float f);
static PARAMDATA f_paramData;
static PARAMDATA g_paramData;
static METHODDATA methodData[];
static INTERFACEDATA interfaceData;
};

Let’s fill those babies up:

PARAMDATA MyObject::f_paramData = {
OLESTR(”i”), VT_I4
};
PARAMDATA MyObject::g_paramData = {
OLESTR(”f”), VT_R4
};
METHODDATA MyObject::methodData[] = {
{ OLESTR(”f”), &MyObject::f_paramData, 1, 0, CC_STDCALL, 1, DISPATCH_METHOD, VT_EMPTY },
{ OLESTR(”g”), &MyObject::g_paramData, 2, 1, CC_STDCALL, 1, DISPATCH_METHOD, VT_BOOL }
};
INTERFACEDATA MyObject::interfaceData = {
MyObject::methodData,
sizeof(MyObject::methodData) / sizeof(METHODDATA)
};

For each method of our object, we describe the method’s parameters, giving them name and type in a PARAMDATA structure. We then fill a method table for the object with complete information, including the parameter data, return value type, calling convention and such. The INTERFACEDATA wraps the whole thing in a nice little package to feed CreateDispTypeInfo with.

We now proceed to create an automation wrapper for our pure object:

CComPtr<ITypeInfo> pMyobjTypeInfo;
hr = CreateDispTypeInfo(
&MyObject::interfaceData,
LOCALE_SYSTEM_DEFAULT,
&pMyobjTypeInfo);
CComPtr<IUnknown> pMyobj;
hr = CreateStdDispatch(NULL, &myobj, pMyobjTypeInfo, &pMyobj);

At this point, pMyobj is a full fledged COM object implementing IDispatch and wrapping the MyObject class instance myobj, which had no knowledge of COM originally and now bundles tables describing its methods.

The scripting site’s implementation of IActiveScriptSite::GetItemInfo should now return pMyObj, the object’s IUnknown and potential IDispatch, and pMyobjTypeInfo, its ITypeInfo, when requested to do so by the hosted scripting engine. We register the object we wish to expose with the engine:

hr = pActiveScriptEngine->AddNamedItem(
L”myobject”,
SCRIPTITEM_ISSOURCE | SCRIPTITEM_ISVISIBLE | SCRIPTITEM_ISPERSISTENT);

If our GetItemInfo does its job when asked for “myobject”, assuming we host the JScript engine, we can now do things like
myobject.f();
var b = myobject.g(0.4);

in script code running in our host.

I find this approach to automation object exposition attractive because it is non-intrusive. If desired, the tables describing the exposed class need not be members of the actual class, but can be stored separately. Notice that you do not even have to generate a CLSID for the exposed class. It is also possible to expose only a certain subset of class methods to the scripting environment.

However, maintaining the type information tables can become a clear scalability issue with more complicated classes. For these cases, rolling an automatic code generation solution may be desired, since MIDL’s functionality in this department cannot be reused. The class and its methods could be described in an XML file, and a tool iterating over its DOM or even an XSLT transformation could generate a C++ header file from the description, complete with the INTERFACEDATA information. This would ensure the method tables and the actual method signatures remain synchronized over the extended life-time of the class.

Finally, a pointer to some tips and a few words of caution to those interested in this solution,