Bridging the gap between native functions and Active Scripting with a COM-based FFI wrapper

A few weeks ago I was following the excitement as WebKit, Safari’s browser engine, incrementally passed more and more of the Acid3 standards test. Wondering whether the Gecko (Mozilla Firefox’s rendering engine) folks were also busy with that, I followed both the Planet WebKit and Planet Mozilla feeds for a few weeks.

Sometime in April I stumbled upon this post in Planet Mozilla. It discussed recent improvements to JSctypes. It was the first time I had heard of this project. JSctypes is an XPCOM component for Mozilla that allows calling native (or “foreign”) functions from privileged JavaScript code. Both the interface and name are inspired by the Python ctypes module, included with the standard distribution since version 2.5.

If you haven’t heard of ctypes, take a minute to get acquainted. It’s a great library that allows you to call native C functions dynamically from Python code. Its interface really feels at home in a dynamic language. Most of the time, you can just call functions without specifying the number and types of the arguments they receive. DLL modules are accessed as attributes of the loader object matching their calling convention (e.g., ctypes.windll.kernel32 or ctypes.cdll.msvcrt), and script functions can be passed as callbacks to the native APIs being invoked.
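To give a feel for that interface, here is a minimal sketch that calls two C runtime functions from Python. The library lookup is my own assumption, made so the snippet runs on more than one platform: msvcrt on Windows, and the symbols already loaded into the process elsewhere.

```python
import ctypes
import sys

# Assumption: pick a C runtime that exists on the current platform.
if sys.platform == "win32":
    libc = ctypes.cdll.msvcrt   # cdecl loader, as in ctypes.cdll.msvcrt
else:
    libc = ctypes.CDLL(None)    # symbols already loaded into this process

# Integers can be passed without declaring a prototype first.
absolute = libc.abs(-42)

# Declaring argument and return types is optional, but it makes the
# conversion explicit; bytes objects are passed as char* pointers.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t
length = libc.strlen(b"hello")

print(absolute, length)
```

The first call relies on ctypes guessing the argument type from the Python value, which is the "just call it" experience described above. The second shows the explicit form for when the defaults are not enough.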

JSctypes brings Python’s ctypes concept to Mozilla’s JavaScript implementation. Mozilla has a COM-like architecture at the base of its object model, called XPCOM. Usually, calling native functionality from JavaScript is achieved by exposing an XPCOM component to script. However, such an approach has a clear disadvantage: every conceivable piece of native functionality needs to be wrapped on a case-by-case basis by a compiled XPCOM component. Now, with JSctypes, Mozilla’s JavaScript code, when privileged (obviously a native call interface is not appropriate in the context of untrusted web content), can call most native functions with relative ease and without a compiled component, aside from JSctypes itself.

A native function interface for a dynamic language needs to deal with the relatively complex task of setting up the call stack frame for an arbitrary native API, according to argument counts, types and alignment requirements deduced dynamically at script execution time. As the interface layer seeks to support a broader and broader variety of argument types (basic data types, then structures, arrays, then callback functions, etc.) the task becomes increasingly complicated and difficult.

I reviewed both JSctypes and Python’s ctypes source code in their respective source code repositories and learned that they both share a common implementation of the lowest component in such a native interface layer. It is called libffi, the Foreign Function Interface library and seems to originate from the gcc project. Since libffi is designed to be compiled with a UNIX-style toolchain (has AT&T syntax assembly files, for instance) and Python needs to compile with Visual C++, the author of ctypes, Thomas Heller, ported an old revision of the library to Visual C++.

Usage of libffi is pretty simple. You initialize an ffi_cif (call information?) structure with the ABI type, return value type, argument count and argument types of the native function to be invoked by using the ffi_prep_cif function. Later, and repeatedly as needed, ffi_call is used to call the actual function with a specific set of argument values, passed in as an array and to retrieve the value returned from the native function.
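The "describe once, call many times" split can be sketched without writing any C. The snippet below is only an analogy: it uses Python's ctypes (which sits on top of libffi) rather than the libffi API itself, with a function prototype standing in for the ffi_cif description. The library lookup is again an assumption made for portability.

```python
import ctypes
import sys

# Assumption: a C runtime available on the current platform.
libc = ctypes.cdll.msvcrt if sys.platform == "win32" else ctypes.CDLL(None)

# Describe the signature once: int abs(int).
# This plays the role that ffi_prep_cif plays in libffi.
prototype = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)

# Bind the description to an exported function, looked up by name.
c_abs = prototype(("abs", libc))

# Call repeatedly with different argument values, as one would with ffi_call.
results = [c_abs(value) for value in (-3, 0, 7)]
print(results)
```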

I thought JSctypes was really cool, and it then occurred to me that it should not be prohibitively difficult to implement a similar adaptation layer for Microsoft’s JScript and possibly other Active Scripting languages.

In my mind’s eye, I envisioned an in-process COM server accessible to Active Scripting clients (implementing IDispatch and associated with a ProgID), providing a call interface to arbitrary native functions.

I created an ATL COM DLL and gave the coclass the ProgID “FunctionWrapper.FunctionWrapper.1”. I knew you could call JScript functions with fewer or more arguments than they expect in their definition, and figured that pulling off the same in a native method exposed to the script would be ideal. After a short investigation I learned of the IDL vararg attribute, which accomplishes just what I had in mind. At this point, the exposed interface looks like this:

[
    object,
    uuid(EBA4A11F-969B-4413-9D4E-FB5CB21039FC),
    dual,
    nonextensible,
    helpstring("IFunctionWrapper Interface"),
    pointer_default(unique)
]
interface IFunctionWrapper : IDispatch {
    [id(1), helpstring("method CallFunction"), vararg] HRESULT CallFunction([in] SAFEARRAY(VARIANT) args, [out, retval] VARIANT* retVal);
};

The CallFunction method of the FunctionWrapper object is callable by JScript clients with any number of arguments, of whatever types they choose. As a simplistic start, I had the first argument specify a string identifying the native function, in the WinDbg-inspired syntax of “module!export”, e.g. “user32!MessageBoxW”. The rest of the arguments would be passed to the native function.

I proceeded to implement CFunctionWrapper::CallFunction. The steps taken by the method would be:

  1. Ensure at least the first argument (function to invoke) was given.
  2. Ensure the first argument specifies a module and an export, load the module and retrieve the address of the export.
  3. Thunk the VARIANT arguments received by the method to libffi-style argument and types arrays.
  4. Invoke ffi_prep_cif to prepare the call and call the native function with ffi_call.
  5. Thunk the return value of the function into a VARIANT usable by script.
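The steps above can be sketched in a few lines of Python, with ctypes standing in for libffi and plain Python values standing in for VARIANTs. This is not the C++ implementation from this post: parse_target, marshal and call_function are hypothetical names, and the type mapping is deliberately as crude as my first version (32-bit integers and wide strings only, with a hard-coded 32-bit unsigned return type).

```python
import ctypes

def parse_target(target):
    """Steps 1 and 2: split 'module!export' into its two parts."""
    module, separator, export = target.partition("!")
    if not separator or not module or not export:
        raise ValueError("expected 'module!export', got %r" % (target,))
    return module, export

def marshal(value):
    """Step 3: map a script value to a native value (hypothetical mapping)."""
    if isinstance(value, int):
        return ctypes.c_uint32(value & 0xFFFFFFFF)
    if isinstance(value, str):
        return ctypes.c_wchar_p(value)   # wide strings, like VT_BSTR
    raise TypeError("unsupported argument type: %r" % (type(value),))

def call_function(load_library, target, *args):
    """Steps 2 to 5: resolve the export, convert the arguments, call it."""
    module, export = parse_target(target)
    function = getattr(load_library(module), export)
    function.restype = ctypes.c_uint32          # hard-coded return type
    native_args = [marshal(arg) for arg in args]
    return function(*native_args)

# On 32-bit Windows one could try (not executed here):
# call_function(ctypes.WinDLL, "user32!MessageBoxW", 0, "text", "caption", 1)
```

Passing the loader in as a parameter is just a convenience of the sketch; it keeps the calling convention choice (ctypes.WinDLL for stdcall, ctypes.CDLL for cdecl) out of the function itself.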

Much of the work here is concise, but step 3 consists of relatively mundane boilerplate, translating between two varieties of dynamically typed data: Microsoft’s VARIANT and libffi’s ffi_type. I’ll illustrate with a short snippet:

// Both arrays are allocated once, before the loop, sized by argument count.
ffi_type** argumentTypes = ...;
void** ffiArgs = ...;

// Argument 0 names the function; the native arguments start at index 1.
for (ULONG i = 1; i < arguments.GetCount(); i++)
{
    VARIANT& arg = arguments[i];
    switch (V_VT(&arg)) {
    case VT_UI1:
        argumentTypes[i - 1] = &ffi_type_uint8;
        ffiArgs[i - 1] = &(V_UI1(&arg));
        break;
    ...
    case VT_UI4:
        argumentTypes[i - 1] = &ffi_type_uint32;
        ffiArgs[i - 1] = &(V_UI4(&arg));
        break;
    }
}

Similar work is needed for other integer and floating-point types, strings and pointers.

Initially, I hard-coded a return value type of unsigned 32-bit integer and the stdcall calling convention to avoid providing an interface for selecting those parameters. I registered the DLL and tested the following script with WSH:

var functionWrapper = new ActiveXObject("FunctionWrapper.FunctionWrapper");
var retVal = functionWrapper.CallFunction("user32!MessageBoxW", 0, "text", "caption", 1);
WScript.Echo(retVal);

The final argument, 1, is the value of the MB_OKCANCEL flag for MessageBox; 1 is also the value of IDOK, which MessageBox returns when the user clicks OK. I used the W variety of the API since I implemented hard-coded UTF-16 marshalling for VT_BSTR type variants, which is the form strings come in from JScript.

I was quite content when the test script not only failed to crash the WSH process, but also presented a message box and passed the API’s return value back to JScript.

At this point I considered what it would take to extend this solution beyond the basic value types. Arrays first came to mind. Such support, I imagined, would consist of copying an incoming SAFEARRAY argument into a native array and supplying the native array pointer to the native function. If “out” array argument support is desired, copying back into the SAFEARRAY would be required post-invocation, right after ffi_call.
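The copy-in, call, copy-back sequence might look roughly like this. It is a sketch only: a Python list stands in for the SAFEARRAY, ctypes stands in for libffi, and double_in_place is a hypothetical stand-in for a native function that writes to its array argument.

```python
import ctypes

def call_with_array(native_function, values):
    """Copy a script-side list into a native array, call, then copy back."""
    # Copy in: build a contiguous native array of 32-bit unsigned integers.
    native_array = (ctypes.c_uint32 * len(values))(*values)

    # The native function receives the array and its element count.
    native_function(native_array, len(values))

    # Copy back, so "out" data written by the callee reaches the script.
    values[:] = list(native_array)
    return values

def double_in_place(array, count):
    """Hypothetical stand-in for a native API that modifies its array."""
    for index in range(count):
        array[index] *= 2

numbers = [1, 2, 3]
call_with_array(double_in_place, numbers)
print(numbers)
```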

Next in line were structs. These would be less straightforward. The problem with filling a JScript “object” (read: hash table) with a struct’s fields is that a hash table does not preserve the order of the fields in the struct’s data layout. Using a JScript array instead would provide ordering, although it wouldn’t be very nice looking.
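Python's ctypes faces the same ordering problem and solves it by having the struct declared as an ordered list of (name, type) pairs. The sketch below borrows that idea: the script would pass its values as an ordered list of pairs too. Rect and struct_from_pairs are illustrative names of my own, not part of any existing implementation.

```python
import ctypes
import struct

class Rect(ctypes.Structure):
    # The list order defines the memory layout of the struct.
    _fields_ = [
        ("left", ctypes.c_int32),
        ("top", ctypes.c_int32),
        ("right", ctypes.c_int32),
        ("bottom", ctypes.c_int32),
    ]

def struct_from_pairs(struct_type, pairs):
    """Build a struct from an ordered list of (name, value) pairs."""
    expected = [name for name, _ in struct_type._fields_]
    given = [name for name, _ in pairs]
    if given != expected:
        raise ValueError("fields must be given in layout order: %r" % expected)
    return struct_type(*[value for _, value in pairs])

rect = struct_from_pairs(
    Rect, [("left", 1), ("top", 2), ("right", 3), ("bottom", 4)])
raw = bytes(rect)   # the bytes a native function would see
print(ctypes.sizeof(Rect), Rect.right.offset)
```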

The final type of argument I considered, and arguably the most important, is callbacks. Many APIs take function pointers as arguments. Consider EnumWindows, which invokes EnumWindowsProc on every window found. A native call interface should make it possible to implement the callback as a JScript function and pass it as seamlessly as possible during the native invocation.

Fortunately, libffi provides built-in support for callbacks, calling them “closures” in its terminology. An ffi_cif structure is initialized to describe the prototype of the callback function, in native eyes, as if it were going to be called with ffi_call. ffi_prep_closure takes such a prototype description, a function pointer and a closure “trampoline buffer”, as I call it. The trampoline buffer, expected to be allocated in writable, executable memory (native code would later jump to its address), takes care of calling the provided function pointer. The twist is that the function pointer, instead of being called with a dynamic prototype, always receives its arguments in the form of libffi argument arrays.

The native callback function wrapped by the closure trampoline buffer would presumably fill a SAFEARRAY of variants with the arguments and invoke a script function. A wrapper callback coclass could be provided to the script and allow for more elaborate stuff like out parameters and the like. An instance of the callback object would wrap a JScript function object and invoke its apply method using the IDispatch interface as calls come in through the closure. It is unclear what a generic solution that doesn’t rely on functions being objects and having the apply method would look like, so at this point this wrapper callback concept is only suitable for JScript.
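For a taste of what script-implemented callbacks look like when the plumbing exists, here is how Python's ctypes exposes the same libffi closure mechanism. EnumWindows is Windows-only, so this sketch uses the C runtime's qsort as a portable stand-in for an API that takes a function pointer; the library lookup is an assumption for portability.

```python
import ctypes
import sys

# Assumption: a C runtime available on the current platform.
libc = ctypes.cdll.msvcrt if sys.platform == "win32" else ctypes.CDLL(None)

# Prototype of the callback as native code sees it:
# int compare(const int *left, const int *right)
COMPARE = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.POINTER(ctypes.c_int), ctypes.POINTER(ctypes.c_int))

def compare(left, right):
    """Script-side callback; receives pointers to two array elements."""
    return (left[0] > right[0]) - (left[0] < right[0])

libc.qsort.argtypes = [
    ctypes.POINTER(ctypes.c_int), ctypes.c_size_t, ctypes.c_size_t, COMPARE]
libc.qsort.restype = None

values = (ctypes.c_int * 5)(5, 1, 4, 2, 3)

# Wrapping the Python function creates the native trampoline. Keep a
# reference to it for as long as native code may call through it.
callback = COMPARE(compare)
libc.qsort(values, len(values), ctypes.sizeof(ctypes.c_int), callback)

sorted_values = list(values)
print(sorted_values)
```

The wrapper object here corresponds to the callback coclass idea above: it owns the trampoline and forwards each native call to the script function.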

Right now I have only got as far as implementing the basic value types, and even that with code of such poor quality that I am not uploading it for the time being. The devil is in the details, and supporting descriptions of complex argument types would require quite a bit of work. Hopefully someday I, or perhaps an enthusiastic reader, will get around to coding and publishing a full-fledged implementation of a native call interface. Embedding such an interface in an Active Scripting host, in scenarios where the hosted scripts enjoy full trust, could provide endless extensibility possibilities for the script author.

Hey, cooler than P/Invoke…


Deploying the Visual C++ libraries with an NSIS installer

Beginning with Visual C++ 2005 and continuing into Visual C++ 2008 and the foreseeable future, Microsoft’s various runtime libraries (CRT, ATL, MFC, etc.) are no longer installed into the system32 directory on Windows XP and later, but are rather “side-by-side assemblies” that need to be installed into the side-by-side store, “WinSxS”, in order to be available to all applications.

I’ve discussed the SxS store and the API Microsoft has documented for managing it in a previous post. Nevertheless, at the request of the NSIS maintainer, kichik, I’ll provide some guidance on the issue of runtime deployment and concrete examples to authors of NSIS-based installations. Do keep in mind that I am not adept at authoring NSIS installers and questions beyond the realm of the matter at hand are best targeted at the NSIS forum.

Unlike in the Linux world, where the C runtime library is considered an operating system component and versions of it are never installed by applications (at worst, some proprietary application is linked against an antique version of libc and requires the system administrator to install a compatibility package provided by the distribution), the CRT situation on Windows is more complicated. In the days of yore, Windows NT provided the now long defunct CRTDLL.DLL. Later, the newer variant MSVCRT.DLL shipped with Visual C++, up to and including the 6.0 release. However, in addition to serving as the runtime of a specific Visual C++ version, MSVCRT.DLL doubles as the “OS CRT”, the version of the C runtime deployed with the OS as far back as NT 4.0 and going into Windows Vista. Components included with the operating system itself, such as Notepad and Calculator, are linked against this CRT dynamically. Do not let the identical moniker fool you: the CRTs included with the various NT releases diverge significantly, sporting, for example, a brand new exception handling runtime in Windows Vista, aligned with newer Visual C++ compilers.

The existence of several MSVCRT.DLL variants and the associated versioning issues are probably what led Microsoft to adopt a policy of strongly versioned CRTs beginning with the Visual C++ .NET (2002) release. MSVCR70.DLL was the runtime required by the output of that product, and later versions would require deploying MSVCR71.DLL, MSVCR80.DLL and most recently MSVCR90.DLL. In addition to the CRT itself, there are also the various peripheral libraries that some applications may depend on, such as ATL and MFC.

I’ve discussed in the past an approach utilizing the Windows Driver Kit build environment that allows combining a modern C++ compiler with targeting the Visual C++ 6.0 / OS CRT, MSVCRT.DLL. It is for the brave who don’t mind getting their hands dirty and whose desire to target the broadly deployed runtime exceeds the fear of the plethora of potential version compatibility issues such an application configuration can cause.

For the more conservative lot, the question remains: how do I get the new C++ runtimes to my end-user’s machine? The first approach is static linking. It should be avoided at nearly all cost, as it results in obese executables that are unable to share the runtime’s memory pages with other running processes, and it is completely unserviceable by Microsoft when a security update or another bug fix needs to be broadly deployed to all users of the runtime libraries.

We therefore turn our attention to approaches based on dynamic linking. First of all, the reader should review the official guidance provided by the Visual C++ team on the matter, although he or she may not like what they read. To summarize, Microsoft officially supports the following deployment methods:

  • Use an MSI-based (Windows Installer) installation procedure and utilize their MSM merge modules to include whatever runtime components you require with your application. The MSMs are black box magic that will get those runtime libraries into the “winsxs” store without asking too many questions. If you don’t like those massively complicated MSI installers and the WiX XML schemas make your head spin, that’s too bad.
  • Use the obese VCRedist.exe for the target architecture, without the benefit of picking and choosing only those runtime components that are of interest for your specific application.
  • Deploy the runtime libraries as files in your application’s directory, or “private assemblies” in SxS nomenclature, wasting the end-user’s hard disk space with multiple copies. This is not as bad as it seems, since at least SxS redirection policies can make an updated, security patched version from the “winsxs” store be loaded in place of out of date version deployed privately with the application, unlike with classic non-SxS local DLLs or with static linking.

As the popularity of NSIS as an installation apparatus shows, not everyone is willing to be strong-armed into an MSI-based installation just yet. So how do CRT-deploying installers address this acute issue? I was disappointed, but not surprised, to see that VLC, DivX and various other applications with NSIS-based installers opt for the “private assembly” approach, simplifying life for the installation author but needlessly wasting end-user disk real estate.

The now documented SxS API provides an alternative approach, presumably supported by Microsoft for deploying SxS assemblies in general (such as your own) but not specifically by the Visual C++ folks for theirs. The motivation for this lack of support is unclear, since the end result is as serviceable by them as is using Windows Installer merge modules. Nevertheless, it is something that those who follow this path should be aware of.

OK, so let’s get on with it. Unlike with system32, we can’t just waltz into winsxs and drop our assembly’s files there. The directory structure is complicated, differs between XP and Vista, and in fact the ACL on the directory in Vista won’t allow anyone but TrustedInstaller (i.e., MSI) to touch it. Therefore we are required to perform the installation through the SxS API, which provides a COM-based interface for manipulating the store.

For illustration purposes, I shall use the Visual C++ 2005 (8.0) Debug CRT. Note that this is not the CRT you want to deploy to your end users, and in any case is explicitly NOT redistributable by Microsoft’s license terms. I use it for illustrative convenience since my XP virtual machine doesn’t have this assembly. We’ll use an NSIS installer script to drive the wonderful though peculiar System plug-in and get it to invoke the SxS API. Note that elaborate error handling is omitted for brevity. So here we go:


Name "NSIS SxS Test"
OutFile "nsissxs.exe"
SetPluginUnload alwaysoff
ShowInstDetails show
XPStyle on
SetCompressor /SOLID lzma
InstallDir $PROGRAMFILES\NSISSxS


!define FUSION_REFCOUNT_UNINSTALL_SUBKEY_GUID {8cedc215-ac4b-488b-93c0-a50a49cb2fb8}

Section "Uninstall"
DeleteRegKey "HKLM" "Software\Microsoft\Windows\CurrentVersion\Uninstall\nsissxs"
Delete $INSTDIR\uninst.exe
Delete $INSTDIR\dummy.txt
RMDir $INSTDIR
DetailPrint "Removing DebugCRT assembly..."
System::Call "sxs::CreateAssemblyCache(*i .r0, i 0) i.r1"
StrCmp $1 0 0 fail
System::Call "*(i 32, i 0, i 2364391957, i 1217113163, i 178634899, i 3090139977, w 'nsissxs', w '') i.s"
Pop $2
System::Call "$0->3(i 0, w 'Microsoft.VC80.DebugCRT,version=$\"8.0.50727.762$\",type=$\"win32$\",processorArchitecture=$\"x86$\",publicKeyToken=$\"1fc8b3b9a1e18e3b$\"', i r2, *i . r3) i.r1"
System::Free $2
StrCmp $1 0 0 fail2
DetailPrint "Disposition returned is $3"
System::Call "$0->2()"
Goto end
fail:
DetailPrint "CreateAssemblyCache failed."
DetailPrint $1
Goto end
fail2:
DetailPrint "UninstallAssembly failed."
DetailPrint $1
Goto end
end:
SectionEnd

Section
SetOutPath $INSTDIR
File "dummy.txt"
WriteUninstaller $INSTDIR\uninst.exe
WriteRegStr "HKLM" "Software\Microsoft\Windows\CurrentVersion\Uninstall\nsissxs" "DisplayName" "NSIS SxS Test"
WriteRegStr "HKLM" "Software\Microsoft\Windows\CurrentVersion\Uninstall\nsissxs" "UninstallString" "$INSTDIR\uninst.exe"
InitPluginsDir
SetOutPath $PLUGINSDIR
File "msvcm80d.dll"
File "msvcp80d.dll"
File "msvcr80d.dll"
File "x86_Microsoft.VC80.DebugCRT_1fc8b3b9a1e18e3b_8.0.50727.762_x-ww_5490cd9f.cat"
File "x86_Microsoft.VC80.DebugCRT_1fc8b3b9a1e18e3b_8.0.50727.762_x-ww_5490cd9f.manifest"

DetailPrint "Installing DebugCRT assembly..."
System::Call "sxs::CreateAssemblyCache(*i .r0, i 0) i.r1"
StrCmp $1 0 0 fail
# Fill a FUSION_INSTALL_REFERENCE.
# fir.cbSize = sizeof(FUSION_INSTALL_REFERENCE) == 32
# fir.dwFlags = 0
# fir.guidScheme = FUSION_REFCOUNT_UNINSTALL_SUBKEY_GUID
# fir.szIdentifier = "nsissxs"
# fir.szNonCanonicalData = 0
System::Call "*(i 32, i 0, i 2364391957, i 1217113163, i 178634899, i 3090139977, w 'nsissxs', w '') i.s"
Pop $2
# IAssemblyCache::InstallAssembly(0, manifestPath, fir)
System::Call "$0->7(i 0, w '$PLUGINSDIR\x86_Microsoft.VC80.DebugCRT_1fc8b3b9a1e18e3b_8.0.50727.762_x-ww_5490cd9f.manifest', i r2) i.r1"
System::Free $2
StrCmp $1 0 0 fail2
System::Call "$0->2()"
Goto end
fail:
DetailPrint "CreateAssemblyCache failed."
DetailPrint $1
Goto end
fail2:
DetailPrint "InstallAssembly failed."
DetailPrint $1
Goto end
end:
SectionEnd

If you are not familiar with NSIS script syntax, now would be a good time to get acquainted. Let us review the contents of the 2nd section, which is the install section. The dummy file is a placeholder for the actual files your installer wants to deploy. Next, we set up an Uninstall entry in the registry as one usually would. Now on to the interesting part.

In order to deploy a SxS assembly, we must place its DLLs together in a temporary directory created by the installer. Note that if an assembly contains several DLLs, we cannot pick and choose only those that our application links with. The assembly is deployed, versioned and bound as a whole. We can figure out which files are part of the assembly in question by reviewing the assembly manifest, which we’ll find installed into the Manifests subdirectory of the WinSxS store on a Windows XP system. If we review the Debug CRT’s manifest, we can see <file> nodes under the <assembly> node, each referencing one of the files that must be deployed with the assembly. You can find the actual assembly files under the subdirectory with the assembly’s strong name in the WinSxS store.

In addition to the DLL files themselves, the assembly manifest and the assembly signing catalog are an integral part of the assembly. The catalog ensures the integrity of the assembly and is a welcome feature over traditional DLL deployment.

With the DLLs, assembly manifest and catalog in place, we are ready to invoke the SxS API for assembly installation. First, we call CreateAssemblyCache to retrieve the IAssemblyCache interface for managing the SxS store. Note that in the context of an NSIS installer, COM has already been initialized (for STA use) at this point, but if you are making a custom installer in another environment you may have to take care of that before reaching this point.

Assuming all goes well the next phase is setting up the FUSION_INSTALL_REFERENCE structure that will describe our assembly installation. Typically, you’ll want the reference to be associated with the registry Uninstall key for your application. Besides, other reference types do not seem to work too well and the documentation doesn’t err on the side of verbosity.

The not-so-seasoned NSIS scripter that I am, I couldn’t figure out a more legible way to specify the GUID argument to the InstallAssembly invocation so I broke down its components by hand. Counting vtable indices including the IUnknown and IAssemblyCache interfaces, InstallAssembly is at vtable slot 7. After the install reference structure is set up and the method invoked, we hope for the best.
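For anyone who wants to double-check the hand-broken GUID, or derive the four integers for a different GUID, here is a small Python sketch. It assumes the x86 layout used throughout this post (little-endian, 4-byte pointers); guid_as_dwords is just a helper name I made up.

```python
import struct
import uuid

FUSION_REFCOUNT_UNINSTALL_SUBKEY_GUID = uuid.UUID(
    "{8cedc215-ac4b-488b-93c0-a50a49cb2fb8}")

def guid_as_dwords(guid):
    """Return the GUID's 16 in-memory bytes as four little-endian DWORDs."""
    # bytes_le is the Windows in-memory layout of a GUID structure.
    return struct.unpack("<4I", guid.bytes_le)

dwords = guid_as_dwords(FUSION_REFCOUNT_UNINSTALL_SUBKEY_GUID)

# x86 layout of FUSION_INSTALL_REFERENCE: cbSize, dwFlags, guidScheme,
# then two string pointers (4 bytes each on x86).
X86_LAYOUT = "<II16sII"
structure_size = struct.calcsize(X86_LAYOUT)

print(dwords, structure_size)
```

The four values are the ones to paste after the two leading `i` fields in the System::Call structure definition, and the computed size is what goes into cbSize.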

Assuming a successful install transaction, we proceed to call the IUnknown method Release (vtable slot 2) to free the SxS cache manager and deem our install sequence completed.

We now turn our attention to the reverse sequence in the Uninstall section of the illustration installer. Being good citizens of the Windows ecosystem, we remove our reference to the shared assembly when the end-user removes our application from their system. WinSxS manages assembly reference counting and will figure out whether the assembly files should actually be removed from the disk.

We create an IAssemblyCache interface instance as before but this time call UninstallAssembly to remove our reference. This is the first method of the interface but is preceded by the IUnknown members and is thus at vtable slot 3. Following a successful invocation we can examine the returned Disposition value if it is of interest and proceed to free the instance.
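The slot numbers passed to the System plug-in can be derived by listing the methods in declaration order, IUnknown first. The sketch below does that counting. The method list reflects my reading of the documented interface; in particular, treat the name of the fourth IAssemblyCache method (given here as Reserved) as an assumption.

```python
# IUnknown methods always occupy the first three vtable slots.
IUNKNOWN_METHODS = ["QueryInterface", "AddRef", "Release"]

# IAssemblyCache methods in declaration order (assumed from the docs).
IASSEMBLYCACHE_METHODS = [
    "UninstallAssembly",
    "QueryAssemblyInfo",
    "CreateAssemblyCacheItem",
    "Reserved",
    "InstallAssembly",
]

VTABLE = IUNKNOWN_METHODS + IASSEMBLYCACHE_METHODS
POINTER_SIZE_X86 = 4  # bytes per vtable entry on 32-bit Windows

def vtable_slot(method):
    """Zero-based vtable index, as used in System::Call "$0->N(...)"."""
    return VTABLE.index(method)

def vtable_offset(method):
    """Byte offset of the method's entry from the start of the vtable."""
    return vtable_slot(method) * POINTER_SIZE_X86

for name in ("Release", "UninstallAssembly", "InstallAssembly"):
    print(name, vtable_slot(name), hex(vtable_offset(name)))
```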

Note that we remove an assembly by its full, strong name and not by path. You can figure out the assembly’s strong name from its manifest.
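Both manifest lookups mentioned in this walkthrough, the list of files and the strong name, are easy to script. The sketch below parses an abbreviated, hand-written sample manifest (a real one carries hashes and more attributes) and rebuilds the name string in the format used by the UninstallAssembly call in the script above. The function names are mine.

```python
import xml.etree.ElementTree as ET

ASM_V1 = "{urn:schemas-microsoft-com:asm.v1}"

# Abbreviated sample for illustration, not a verbatim copy of the real file.
SAMPLE_MANIFEST = """\
<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
  <assemblyIdentity type="win32" name="Microsoft.VC80.DebugCRT"
                    version="8.0.50727.762" processorArchitecture="x86"
                    publicKeyToken="1fc8b3b9a1e18e3b"/>
  <file name="msvcr80d.dll"/>
  <file name="msvcp80d.dll"/>
  <file name="msvcm80d.dll"/>
</assembly>
"""

def assembly_files(root):
    """Names of the files that must be deployed with the assembly."""
    return [node.get("name") for node in root.findall(ASM_V1 + "file")]

def strong_name(root):
    """Build the name string in the format passed to UninstallAssembly."""
    identity = root.find(ASM_V1 + "assemblyIdentity")
    parts = [identity.get("name")]
    for key in ("version", "type", "processorArchitecture", "publicKeyToken"):
        parts.append('%s="%s"' % (key, identity.get(key)))
    return ",".join(parts)

root = ET.fromstring(SAMPLE_MANIFEST)
files = assembly_files(root)
name = strong_name(root)
print(files)
print(name)
```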

OK, installing the VC8 Debug CRT was easy enough. Note that other libraries (ATL, etc.) you’ll want to install may have dependencies on other assemblies, so make sure you get your install sequence in order.

Installing the Visual C++ 2005 runtime is nice and all, but somehow it just feels wrong installing obsolete software. I turned my attention to the Visual C++ 2008 libraries, encountering disappointing results.

I gave it a few shots but installing the Visual C++ 2008 Debug CRT always fails, InstallAssembly promptly returning an HRESULT containing ERROR_SXS_PROTECTION_CATALOG_NOT_VALID. A malformed catalog? In one of Microsoft’s very own assemblies? Say it ain’t so!

If you have the hots for deploying the newer runtime, you’ll have to figure out that one on your own, folks. I made sure the catalog for the Visual C++ 9.0 Debug CRT I picked up from the WinSxS store matches the same catalog file found in the MSI merge module (MSM) at C:\Program Files\Common Files\Merge Modules\Microsoft_VC90_DebugCRT_x86.msm by extracting the MSM’s files with the useful MSIX extractor. The catalog files matched and regardless, the SHA-1 hashes for the assembly files matched the catalog rejected by InstallAssembly. Mysterious.

Reviewing the Windows event log following this error didn’t help too much. The System log was now decorated with SideBySide Event ID 20, stating: “The manifest C:\WINDOWS\WinSxS\InstallTemp\160585\Manifests\x86_Microsoft.VC90.DebugCRT_1fc8b3b9a1e18e3b_9.0.21022.8_x-ww_597c3456.Manifest does not match its source catalog or the catalog is missing.” … No newsflash there.

I figured I’d stick to VC8 support and leave the VC9 troubleshooting for later. If push comes to shove, one can always figure out how to write WiX installers. :)

Instead I opted to review whether this NSIS-based installation approach is compatible with Windows Vista. I was worried that with the restrictive ACLs on the WinSxS store in that OS, without being an MSI and running from the context of the mighty TrustedInstaller.exe process, the installation would surely fail.

I was therefore pleasantly surprised when the test installer worked on Vista. I knew the elevated installer executable ran as Administrator, yet an Administrator was not supposed to be able to copy files to WinSxS:

C:\Users\User\Desktop>cacls C:\Windows\WinSxS
C:\Windows\winsxs NT SERVICE\TrustedInstaller:(OI)(CI)F
                  BUILTIN\Administrators:(OI)(CI)R
                  NT AUTHORITY\SYSTEM:(OI)(CI)R
                  BUILTIN\Users:(OI)(CI)R

There was no denying it. Users and Administrators alike have read-only access to the store, and only the TrustedInstaller service can actually modify it. I opted to run the installer once again, this time in Windbg, tracing the operation of the SxS API to figure out what was happening behind the scenes.


0:000> sxe ld:sxs
0:000> g
ModLoad: 75500000 7555f000 C:\Windows\system32\sxs.dll
eax=1000162a ebx=00000000 ecx=15bf8bb6 edx=00000007 esi=7ffdd000 edi=20000000
eip=76f99a94 esp=0278f620 ebp=0278f664 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
ntdll!KiFastSystemCallRet:
76f99a94 c3 ret
0:002> bp sxs!CreateAssemblyCache
0:002> g
Breakpoint 0 hit
eax=00000000 ebx=002ceda8 ecx=002cedc8 edx=002ce990 esi=002ceda8 edi=00000000
eip=7554a3aa esp=0278fd54 ebp=0278fd6c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
sxs!CreateAssemblyCache:
7554a3aa 8bff mov edi,edi
0:002> kb 2
*** WARNING: Unable to verify checksum for C:\Users\User\AppData\Local\Temp\nspBB71.tmp\System.dll
ChildEBP RetAddr Args to Child
0278fd50 100024b5 00291a98 00000000 1000162a sxs!CreateAssemblyCache
0278fd6c 1000168d 002ceda8 00000000 75bbc780 System+0x24b5

We know from MSDN that CreateAssemblyCache returns the IAssemblyCache pointer through the first, out parameter. We expect a well-behaved caller to pass in storage initialized to zero, and the storage to contain the newly instantiated interface after the function returns:

0:002> dps 00291a98 L1
00291a98 00000000
0:002> gu
eax=00000000 ebx=002ceda8 ecx=00000000 edx=00000008 esi=002ceda8 edi=00000000
eip=100024b5 esp=0278fd60 ebp=0278fd6c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
System+0x24b5:
100024b5 a340400010 mov dword ptr [System+0x4040 (10004040)],eax ds:0023:10004040={sxs!CreateAssemblyCache (7554a3aa)}
0:002> dps 00291a98 L1
00291a98 00291c88

An interface pointer points at the object’s storage, and the first pointer-sized value in that storage is the address of the member function vtable. We verify this and follow the vtable to examine where the implementations of the various members reside:

0:002> dps 00291c88 L1
00291c88 755036f0 sxs!CAssemblyCache::`vftable'
0:002> dps 755036f0 L10
755036f0 75549ac4 sxs!CAssemblyCache::QueryInterface
755036f4 7550de6b sxs!CAssemblyCache::AddRef
755036f8 7554a355 sxs!CAssemblyCache::Release
755036fc 75549d35 sxs!CAssemblyCache::UninstallAssembly
75503700 75549b15 sxs!CAssemblyCache::QueryAssemblyInfo
75503704 7554a219 sxs!CAssemblyCache::CreateAssemblyCacheItem
75503708 755542f1 sxs!XMLParser::SetFlags
7550370c 75549e91 sxs!CAssemblyCache::InstallAssembly
75503710 7554a4d4 sxs!CAssemblyName::QueryInterface
75503714 7550de6b sxs!CAssemblyCache::AddRef
75503718 7554ad69 sxs!CAssemblyName::Release
7550371c 7554a525 sxs!CAssemblyName::SetProperty
75503720 7554a64d sxs!CAssemblyName::GetProperty
75503724 7554a4ba sxs!CAssemblyName::Finalize
75503728 7554aaac sxs!CAssemblyName::GetDisplayName
7550372c 7554a4ad sxs!CAssemblyName::Reserved

It is clear the implementation of the InstallAssembly method is sxs!CAssemblyCache::InstallAssembly. We set up a breakpoint, proceed there and perform a high-level trace:

0:002> bp sxs!CAssemblyCache::InstallAssembly
0:002> g
Breakpoint 1 hit
eax=00000000 ebx=002d2668 ecx=002d2688 edx=002cdc10 esi=002d2668 edi=00000000
eip=75549e91 esp=0278fd4c ebp=0278fd6c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
sxs!CAssemblyCache::InstallAssembly:
75549e91 8bff mov edi,edi
0:002> wt -l 2 -m sxs
Tracing sxs!CAssemblyCache::InstallAssembly to return address 100024b5
16 0 [ 0] sxs!CAssemblyCache::InstallAssembly
13 0 [ 1] sxs!CFrame::CFrame
20 13 [ 0] sxs!CAssemblyCache::InstallAssembly
3 0 [ 1] sxs!CFrame::BaseEnter
11 0 [ 2] sxs!FusionpRtlPushFrame
6 11 [ 1] sxs!CFrame::BaseEnter
54 30 [ 0] sxs!CAssemblyCache::InstallAssembly
6 0 [ 1] sxs!CFrame::ClearLastError
58 36 [ 0] sxs!CAssemblyCache::InstallAssembly
10 0 [ 1] sxs!SxspTranslateReferenceFrom
13 0 [ 2] sxs!CFrame::CFrame
14 13 [ 1] sxs!SxspTranslateReferenceFrom
6 0 [ 2] sxs!CFrame::BaseEnter
70 19 [ 1] sxs!SxspTranslateReferenceFrom
9 0 [ 2] sxs!CFnTracerWin32::~CFnTracerWin32
74 28 [ 1] sxs!SxspTranslateReferenceFrom
63 138 [ 0] sxs!CAssemblyCache::InstallAssembly
4 0 [ 1] sxs!CFrame::ClearLastError
66 142 [ 0] sxs!CAssemblyCache::InstallAssembly
16 0 [ 1] sxs!SxsInstallW
13 0 [ 2] sxs!CFrame::CFrame
20 13 [ 1] sxs!SxsInstallW
6 0 [ 2] sxs!CFrame::BaseEnter
25 19 [ 1] sxs!SxsInstallW
13 0 [ 2] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
28 32 [ 1] sxs!SxsInstallW
13 0 [ 2] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
30 45 [ 1] sxs!SxsInstallW
13 0 [ 2] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
68 58 [ 1] sxs!SxsInstallW
4 0 [ 2] sxs!CFrame::ClearLastError
713 62 [ 1] sxs!SxsInstallW
72 0 [ 2] sxs!SxspExpandRelativePathToFull
797 134 [ 1] sxs!SxsInstallW
4 0 [ 2] sxs!CFrame::ClearLastError
800 138 [ 1] sxs!SxsInstallW
ModLoad: 741e0000 741ea000 C:\Windows\system32\sxsstore.dll
eax=ffffffff ebx=00000000 ecx=002e6a60 edx=00000001 esi=7ffdd000 edi=20000000
eip=76f99a94 esp=0278ed48 ebp=0278ed8c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
ntdll!KiFastSystemCallRet:
76f99a94 c3 ret

OK, so it looks like this implementation lets sxs!SxsInstallW do the actual work. We rerun the installer and this time perform a trace from that point:

0:000> sxe ld:sxs
0:000> g
ModLoad: 75500000 7555f000 C:\Windows\system32\sxs.dll
eax=1000162a ebx=00000000 ecx=0f78c21d edx=00000007 esi=7ffdc000 edi=20000000
eip=76f99a94 esp=026ef620 ebp=026ef664 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
ntdll!KiFastSystemCallRet:
76f99a94 c3 ret
0:002> bp sxs!SxsInstallW
0:002> g
Breakpoint 0 hit
eax=026efce8 ebx=0032db90 ecx=026efd08 edx=0032dbac esi=00332a08 edi=026efd44
eip=755475ad esp=026efcd4 ebp=026efd48 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
sxs!SxsInstallW:
755475ad 8bff mov edi,edi
0:002> wt -l 2 -m sxs
Tracing sxs!SxsInstallW to return address 75549f41
16 0 [ 0] sxs!SxsInstallW
13 0 [ 1] sxs!CFrame::CFrame
20 13 [ 0] sxs!SxsInstallW
3 0 [ 1] sxs!CFrame::BaseEnter
11 0 [ 2] sxs!FusionpRtlPushFrame
6 11 [ 1] sxs!CFrame::BaseEnter
25 30 [ 0] sxs!SxsInstallW
10 0 [ 1] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
12 0 [ 2] sxs!CGenericBaseStringBuffer::InitializeInlineBuffer
13 12 [ 1] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
28 55 [ 0] sxs!SxsInstallW
10 0 [ 1] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
12 0 [ 2] sxs!CGenericBaseStringBuffer::InitializeInlineBuffer
13 12 [ 1] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
30 80 [ 0] sxs!SxsInstallW
10 0 [ 1] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
12 0 [ 2] sxs!CGenericBaseStringBuffer::InitializeInlineBuffer
13 12 [ 1] sxs!CGenericStringBuffer<64,CUnicodeCharTraits>::CGenericStringBuffer<64,CUnicodeCharTraits>
68 105 [ 0] sxs!SxsInstallW
4 0 [ 1] sxs!CFrame::ClearLastError
713 109 [ 0] sxs!SxsInstallW
15 0 [ 1] sxs!SxspExpandRelativePathToFull
13 0 [ 2] sxs!CFrame::CFrame
19 13 [ 1] sxs!SxspExpandRelativePathToFull
6 0 [ 2] sxs!CFrame::BaseEnter
22 19 [ 1] sxs!SxspExpandRelativePathToFull
29 0 [ 2] sxs!CGenericStringBufferAccessor::Attach
23 48 [ 1] sxs!SxspExpandRelativePathToFull
4 0 [ 2] sxs!CFrame::ClearLastError
29 52 [ 1] sxs!SxspExpandRelativePathToFull
13 0 [ 2] kernel32!GetFullPathNameW
36 65 [ 1] sxs!SxspExpandRelativePathToFull
31 0 [ 2] sxs!CGenericStringBufferAccessor::Detach
37 96 [ 1] sxs!SxspExpandRelativePathToFull
4 0 [ 2] sxs!CFrame::ClearLastError
41 100 [ 1] sxs!SxspExpandRelativePathToFull
68 0 [ 2] sxs!CGenericBaseStringBuffer::Win32ResizeBuffer
46 168 [ 1] sxs!SxspExpandRelativePathToFull
29 0 [ 2] sxs!CGenericStringBufferAccessor::Attach
47 197 [ 1] sxs!SxspExpandRelativePathToFull
4 0 [ 2] sxs!CFrame::ClearLastError
52 201 [ 1] sxs!SxspExpandRelativePathToFull
13 0 [ 2] kernel32!GetFullPathNameW
64 214 [ 1] sxs!SxspExpandRelativePathToFull
9 0 [ 2] sxs!CFnTracerWin32::~CFnTracerWin32
66 223 [ 1] sxs!SxspExpandRelativePathToFull
20 0 [ 2] sxs!CGenericStringBufferAccessor::~CGenericStringBufferAccessor
72 243 [ 1] sxs!SxspExpandRelativePathToFull
797 424 [ 0] sxs!SxsInstallW
4 0 [ 1] sxs!CFrame::ClearLastError
800 428 [ 0] sxs!SxsInstallW
28 0 [ 1] sxs!SxspGetRemoteStore
5 0 [ 2] sxs!SxspEnsureUserIsAdmin
34 5 [ 1] sxs!SxspGetRemoteStore
9 0 [ 2] kernel32!LoadLibraryW
40 14 [ 1] sxs!SxspGetRemoteStore
18 0 [ 2] ShimEng!StubGetProcAddress
50 32 [ 1] sxs!SxspGetRemoteStore
24 0 [ 2] ole32!CoCreateInstance
>> No match on ret
24 0 [ 2] ole32!CoCreateInstance
5 0 [ 2] RPCRT4!NdrpGetRpcHelper
>> No match on ret
5 0 [ 2] RPCRT4!NdrpGetRpcHelper
17 0 [ 2] RPCRT4!NdrpGetIIDFromBuffer
>> No match on ret
17 0 [ 2] RPCRT4!NdrpGetIIDFromBuffer
71 0 [ 2] RPCRT4!NdrpInterfacePointerUnmarshall
>> No match on ret
71 0 [ 2] RPCRT4!NdrpInterfacePointerUnmarshall
11 0 [ 2] RPCRT4!NdrpPointerUnmarshall
>> No match on ret
11 0 [ 2] RPCRT4!NdrpPointerUnmarshall
4 0 [ 2] RPCRT4!NdrPointerUnmarshall
>> No match on ret
4 0 [ 2] RPCRT4!NdrPointerUnmarshall
19 0 [ 2] RPCRT4!NdrpPointerUnmarshall
>> No match on ret
19 0 [ 2] RPCRT4!NdrpPointerUnmarshall
4 0 [ 2] RPCRT4!NdrPointerUnmarshall
>> No match on ret
4 0 [ 2] RPCRT4!NdrPointerUnmarshall
65 0 [ 2] RPCRT4!NdrpClientUnMarshal
>> No match on ret
65 0 [ 2] RPCRT4!NdrpClientUnMarshal
16 0 [ 2] RPCRT4!NdrClientCall2
>> No match on ret
16 0 [ 2] RPCRT4!NdrClientCall2
8 0 [ 2] RPCRT4!ObjectStublessClient
>> No match on ret
8 0 [ 2] RPCRT4!ObjectStublessClient
4 0 [ 2] RPCRT4!ObjectStubless
65 0 [ 2] ole32!CRpcResolver::CreateInstance
>> No match on ret
65 0 [ 2] ole32!CRpcResolver::CreateInstance
10 0 [ 2] ole32!CClientContextActivator::CreateInstance
>> No match on ret
10 0 [ 2] ole32!CClientContextActivator::CreateInstance
8 0 [ 2] ole32!ActivationPropertiesIn::DelegateCreateInstance
>> No match on ret
8 0 [ 2] ole32!ActivationPropertiesIn::DelegateCreateInstance
53 0 [ 2] ole32!ICoCreateInstanceEx
>> No match on ret
53 0 [ 2] ole32!ICoCreateInstanceEx
21 0 [ 2] ole32!CComActivator::DoCreateInstance
>> No match on ret
21 0 [ 2] ole32!CComActivator::DoCreateInstance
2 0 [ 2] ole32!CoCreateInstanceEx
>> No match on ret
2 0 [ 2] ole32!CoCreateInstanceEx
5 0 [ 2] ole32!CoCreateInstance
62 444 [ 1] sxs!SxspGetRemoteStore
34 0 [ 2] ole32!CStdIdentity::CInternalUnk::QueryInterface
>> No match on ret
34 0 [ 2] ole32!CStdIdentity::CInternalUnk::QueryInterface
14 0 [ 2] ole32!CreateIdentityHandler
>> No match on ret
14 0 [ 2] ole32!CreateIdentityHandler
21 0 [ 2] ole32!UnmarshalInternalObjRef
>> No match on ret
21 0 [ 2] ole32!UnmarshalInternalObjRef
27 0 [ 2] ole32!OXIDEntry::UnmarshalRemUnk
>> No match on ret
27 0 [ 2] ole32!OXIDEntry::UnmarshalRemUnk
18 0 [ 2] ole32!OXIDEntry::MakeRemUnk
>> No match on ret
18 0 [ 2] ole32!OXIDEntry::MakeRemUnk
7 0 [ 2] ole32!OXIDEntry::GetRemUnk
>> No match on ret
7 0 [ 2] ole32!OXIDEntry::GetRemUnk
2 0 [ 2] ole32!CStdMarshal::GetSecureRemUnk
>> No match on ret
2 0 [ 2] ole32!CStdMarshal::GetSecureRemUnk
23 0 [ 2] ole32!CStdMarshal::Begin_RemQIAndUnmarshal1
>> No match on ret
23 0 [ 2] ole32!CStdMarshal::Begin_RemQIAndUnmarshal1
5 0 [ 2] ole32!CStdMarshal::Begin_QueryRemoteInterfaces
>> No match on ret
5 0 [ 2] ole32!CStdMarshal::Begin_QueryRemoteInterfaces
ModLoad: 741e0000 741ea000 C:\Windows\system32\sxsstore.dll

Whoa. That’s verbose. There’s a lot of noise in this trace, but the SxspGetRemoteStore function stands out, and it is obvious from all the OLE32 invocations later on that COM is at work here. Examining the sxs!SxspGetRemoteStore function reveals that it instantiates the COM object identified by CLSID_SxsStore (left as an exercise for the reader).

Let’s have a look at the object’s registration information. First, extract the CLSID:

0:002> x sxs!CLSID_SxsStore
7554c454 sxs!CLSID_SxsStore =
0:002> dt nt!_GUID 7554c454
ntdll!_GUID
{3c6859ce-230b-48a4-be6c-932c0c202048}
+0x000 Data1 : 0x3c6859ce
+0x004 Data2 : 0x230b
+0x006 Data3 : 0x48a4
+0x008 Data4 : [8] "???"
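
The debugger prints the eight Data4 bytes as characters, which is why they show up as "???" above. As a small aside, here is a portable sketch of how the four GUID fields map onto the familiar registry string form. The Guid struct and FormatGuid function are hypothetical stand-ins of mine that merely mirror the layout of the Windows GUID structure, and the Data4 bytes are taken from the braced string in the dt output:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in mirroring the field layout of the Windows GUID struct.
struct Guid {
    std::uint32_t data1;
    std::uint16_t data2;
    std::uint16_t data3;
    std::uint8_t  data4[8];
};

// Render a GUID in registry form:
// {Data1-Data2-Data3-Data4[0..1]-Data4[2..7]}, all in lowercase hex.
std::string FormatGuid(const Guid& g) {
    const unsigned b[8] = {g.data4[0], g.data4[1], g.data4[2], g.data4[3],
                           g.data4[4], g.data4[5], g.data4[6], g.data4[7]};
    char buf[39];  // 38 characters plus the terminating NUL
    const int n = std::snprintf(
        buf, sizeof buf,
        "{%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x}",
        static_cast<unsigned>(g.data1), static_cast<unsigned>(g.data2),
        static_cast<unsigned>(g.data3),
        b[0], b[1], b[2], b[3], b[4], b[5], b[6], b[7]);
    assert(n == 38);
    (void)n;
    return std::string(buf);
}

// CLSID_SxsStore as shown by the debugger above.
const Guid kClsidSxsStore = {
    0x3c6859ce, 0x230b, 0x48a4,
    {0xbe, 0x6c, 0x93, 0x2c, 0x0c, 0x20, 0x20, 0x48}
};
```

Calling FormatGuid(kClsidSxsStore) yields the same braced string that is used as the registry key name in the query that follows.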

Now, we’ll use the command-line to see what’s special about this object’s registration:

C:\Users\User\Desktop>reg query HKCR\CLSID\{3c6859ce-230b-48a4-be6c-932c0c202048} /s


HKEY_CLASSES_ROOT\CLSID\{3c6859ce-230b-48a4-be6c-932c0c202048}
(Default) REG_SZ Sxs Store Class
AppID REG_SZ {752073A2-23F2-4396-85F0-8FDB879ED0ED}

HKEY_CLASSES_ROOT\CLSID\{3c6859ce-230b-48a4-be6c-932c0c202048}\LocalServer32
(Default) REG_EXPAND_SZ %systemroot%\servicing\TrustedInstaller.exe
ThreadingModel REG_SZ Both

OK, that explains it. The SxS API asks an out-of-process COM server running in the context of the TrustedInstaller service to do its bidding, explaining how things work despite the restrictive ACL on the store.

Hope you enjoyed that digression, but now back to the original business at hand. The installation process works just fine on Vista, but the plot thickens when we examine the uninstallation process.

Strangely and contrary to documentation, UninstallAssembly always returns success but with a disposition value of 0 on Vista, and the assembly files remain in place in the WinSxS store no matter what. The bottom line – if you use this approach to deploy the libraries to a Vista system, you may leave behind unused assembly files after your application is uninstalled, cluttering the user’s system. Take this to heart when considering whether this approach and the avoidance of an MSI installer is appropriate for your scenario.

Both the issue of Visual C++ 9.0 assembly deployment using the SxS API and the weird referencing behavior encountered during assembly uninstallation on Vista remain unresolved. If anyone is game for figuring those out, I’d be glad to hear about it.

Microsoft publishes dozens of its network protocol specifications on MSDN

Microsoft made a big announcement today about having a new policy of promoting interoperability with its major products, citing modern needs, etc. If you ask me, the need for interoperability today is not much greater than it was a few years ago and this policy shift is way overdue. Along with the announcement which made for an amusing assortment of corporate-speak, Microsoft made the operative move of immediately publishing dozens of network protocol specifications on the MSDN Library. Their index can be found here. Apparently, documentation for things other than protocols (i.e., APIs) is forthcoming.

Having spent a few minutes going over some of these specifications, I have several observations to make:

  • Many of these specifications have been updated multiple times during the past year or so. Unlike Microsoft’s forgotten Internet Draft for the DCOM protocol from the late 1990s, finally we see up-to-date specifications for a change. I hope with the wide availability at a high-profile location like the MSDN Library, these contemporary specs will keep getting the love they need and could be relied upon to reflect the current Microsoft implementations.
  • Nearly every network service included with Microsoft Windows appears to be documented.
  • The detailed specifications are a gold mine to anyone seeking an under-the-hood glimpse of the internals of Microsoft’s network services. I was particularly thrilled, as can be expected, to encounter up-to-date descriptions of the extensions Microsoft made to the DCE RPC protocol, the DCOM network protocol and even how COM+ (MSDTC) implements network transactions on top of those.
  • The specifications are coherent in the sense that each makes appropriate references to related protocols; e.g., the COM+ specification references the DCOM specification, which references the RPC extensions specification. Even third-party references are made, e.g. to the “Open” Group’s DCE RPC 1.1 specification. (Open in quotes since, ironically, while I could readily download a protocol specification PDF from Microsoft’s MSDN with no intrusion, the so-called “Open” Group required compulsory registration for the free download, which seems to have nothing in it for me except the prospect of future spam…)
  • I did not tolerate enough of the corporate speak in the press release to understand the legal status of the document release, but I hope it is such that it will allow popular open source diagnostic tools such as Wireshark to provide detailed, complete and accurate diagnostic information about these protocols.
  • The specifications tend to read more “official” than “practical.” In other words, they are more like an ISO standard than an IETF RFC. There’s hardly any introductory text describing the protocols in context, but rather really long glossaries you have to skim over to “get to the good stuff.” While raw technical descriptions are important, one has to question Microsoft’s true commitment to the promotion of interoperability given this state of affairs. Perhaps with their now altered target audience, we shall see improvements in this department in the not so distant future?
  • Some network protocols (Exchange, SQL Server) are not yet available, but are scheduled to be released sooner rather than later. In particular, I consider the publishing of the Exchange protocols as crucial to the promotion of interoperability in the groupware realm.

So what’s your favorite Microsoft network protocol? :-)

A lightweight approach for exposing C++ objects to a hosted Active Scripting engine

Microsoft’s Active Scripting architecture allows application developers to host the same implementations of the JScript and VBScript scripting languages used by Internet Explorer for scripts in HTML pages, Active Server Pages (the old, pre-.NET implementation) for server-side dynamic content or the Windows Script Host for independent scripts. Additionally, third-party scripting engines can and have been developed for Python, Perl and other interpreted languages.

Hosting a scripting engine involves implementing the IActiveScriptSite interface and providing a way to pass script code to the IActiveScript and IActiveScriptParse interfaces. The process is extensively documented in the literature, so I shall not discuss the mechanics of hosting itself and will elaborate only on the topic of exposing objects from the host to the engine.

Enabling scripting in your application only adds value over the external Windows Script Host if you expose unique, internal application functionality to the hosted scripts. If your application already exposes its functionality as COM automation objects to automation controllers that can be used out-of-process, there isn’t much point in hosting. However, if your application is document-oriented, for example, providing scripts with access to the document context can be very useful to your users.

A scripting host can make its object model available to hosted scripts by providing the engine with an IDispatch interface for each object it wants to make available. This interface is the foundation of OLE automation and is used by the scripting languages for late binding.

Since the IDispatch interface is basically a rather raw reflection mechanism, implementing it from scratch for a moderately complex object is tedious and error-prone.
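
To make the "raw reflection" point concrete, here is a toy, non-COM model of the two jobs every IDispatch implementation has to do: translate a member name into a numeric ID (the role of GetIDsOfNames) and then invoke by ID with loosely typed arguments (the role of Invoke). Everything in it, including the ToyDispatch class and the use of double as a stand-in for VARIANT, is an illustrative assumption of mine and not the real interface, which deals in DISPID, DISPPARAMS and HRESULT values:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Loosely typed argument, a crude stand-in for VARIANT (hypothetical).
using Arg = double;

class ToyDispatch {
public:
    using Method = std::function<Arg(const std::vector<Arg>&)>;

    // Register a method under a name and return its numeric ID (from 1).
    int Add(const std::string& name, std::size_t argCount, Method fn) {
        assert(static_cast<bool>(fn));
        const int id = static_cast<int>(methods_.size()) + 1;
        ids_[name] = id;
        methods_.push_back({argCount, std::move(fn)});
        return id;
    }

    // Name -> ID lookup, the role GetIDsOfNames plays. Returns -1 if unknown.
    int GetIdOfName(const std::string& name) const {
        const auto it = ids_.find(name);
        return it == ids_.end() ? -1 : it->second;
    }

    // Invoke by ID, the role Invoke plays. Returns false on an unknown ID
    // or a wrong argument count (a real implementation returns an HRESULT).
    bool Invoke(int id, const std::vector<Arg>& args, Arg* result) const {
        if (id < 1 || id > static_cast<int>(methods_.size())) return false;
        const Entry& e = methods_[static_cast<std::size_t>(id) - 1];
        if (args.size() != e.argCount) return false;
        const Arg r = e.fn(args);
        if (result) *result = r;
        return true;
    }

private:
    struct Entry { std::size_t argCount; Method fn; };
    std::map<std::string, int> ids_;
    std::vector<Entry> methods_;
};

// Example object exposing one method, g, which reports (as 1.0 or 0.0)
// whether its single argument exceeds 0.5.
ToyDispatch MakeExample() {
    ToyDispatch d;
    d.Add("g", 1, [](const std::vector<Arg>& a) { return a[0] > 0.5 ? 1.0 : 0.0; });
    return d;
}
```

Even this stripped-down version needs bookkeeping for names, IDs and argument counts. The real thing adds type coercion, named arguments, property accessors and locale handling on top, which is where the tedium comes from.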

If your application already implements COM objects regardless of scripting, it probably already makes use of a framework for doing so, be it ATL, MFC or the CLR. In that case, you have already paid the framework tax and implementing another interface is no challenge. Specifically, ATL offers the convenient IDispatchImpl class for implementing dual interfaces while the CLR makes it ridiculously simple to implement dispatch interfaces (by default, a .NET class is also a COM dispatch object).

However, a dependency on the CLR might not be a welcome requirement. Similarly, complicating a substantial existing code base with the tedium of COM class registration is an adventure that may not be suitable for the faint of heart. If you do not wish to expose automation objects to external clients like WSH, you have no need or desire to modify the registry and maintain that information across installations, uninstallations, upgrades and the like.

Unfortunately, neither ATL nor MFC goes to any reasonable lengths to facilitate the implementation of internal, unregistered COM objects. The IDispatchImpl class requires that you provide it with type information for your dual interface, but ATL’s only ITypeInfo wrapper, CComTypeInfoHolder, is oriented towards retrieving that from a type library residing in a file, either an independent .TLB or an embedded resource in your .EXE or .DLL file. This means that for exposing an object, you need to describe it in IDL, have your build process generate a .TLB for it with MIDL and possibly embed it as a resource using RC. At run time, you need to take care of the logistics of interface and type library registration. All of this for what you only want as internal functionality.

Apart from being tedious, that approach is also characterized by being rigid and static. Manipulating your exposed objects by making runtime decisions that could change the type information does not go well with them being static embedded resource entities.

I considered what it would take to come up with binary type information from a source that isn’t a file or a resource. At first glance, the LoadTypeLib API is definitely file-oriented. However, a light bulb turned on in my head when I noticed that if the file name given does not exist, the string is treated as a moniker. I was hoping I could generate binary type information in .TLB format from IDL, store it in a flexible manner and provide LoadTypeLib with a moniker to the type information. I then paused as I realized there was an unanswered question – “a moniker to what?”. As is not uncommon in Microsoft’s documentation, elaboration on this point was scarce. I later found this newsgroup post on the matter. The original poster had the same question as mine and the reply pointed me in the right direction.

Although the responder was incorrect in assuming the pointer moniker implementation actually implemented IMoniker::GetDisplayName, a deficiency for which I can find no excuse, the OBJREF moniker provides a suitable alternative. The OBJREF moniker is a superset of the pointer moniker that supports out-of-process references, although no such functionality is required by me for this purpose, just getting a display name to feed LoadTypeLib.

I promptly implemented a skeleton IUnknown that would simply print what interface was requested on every call to QueryInterface and then return E_NOINTERFACE. I created an OBJREF moniker for this IUnknown implementation and supplied LoadTypeLib with the moniker’s display name. This way, I figured, I would find out what LoadTypeLib expects the supplied object to implement as an alternative to being given a file name.

I was disappointed when I saw what happened next – LoadTypeLib was asking my object for an ITypeLib implementation, and nothing else. This basically means that LoadTypeLib’s moniker support is completely useless – it returns an ITypeLib for an ITypeLib you already have.

My next attempt to tap into the existing binary type information parser involved writing a test program that called LoadTypeLib on a .TLB file, to find out whether it loaded the information into memory and then used some intermediate functionality on the in-memory data that was also accessible to me. I examined the type library loader’s high level flow using WinDbg:
0:000> bp oleaut32!LoadTypeLib
0:000> g
Breakpoint 2 hit
eax=0012ff00 ebx=7ffda000 ecx=81818d85 edx=10313d00 esi=0012fdc8 edi=0012ff5c
eip=771279e5 esp=0012fdbc ebp=0012ff68 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
OLEAUT32!LoadTypeLib:
771279e5 8bff mov edi,edi
0:000> wt -m oleaut32 -l 2 -ns
Tracing OLEAUT32!LoadTypeLib to return address 004117d0
7 0 [ 0] OLEAUT32!LoadTypeLib
43 0 [ 1] OLEAUT32!LoadTypeLibEx
16 0 [ 2] OLEAUT32!InitLoadInfo
44 16 [ 1] OLEAUT32!LoadTypeLibEx
65 0 [ 2] OLEAUT32!InitAppData
51 81 [ 1] OLEAUT32!LoadTypeLibEx
9 0 [ 2] OLEAUT32!LHashValOfNameSys
58 90 [ 1] OLEAUT32!LoadTypeLibEx
25 0 [ 2] OLEAUT32!OLE_TYPEMGR::LookupTypeLib
66 115 [ 1] OLEAUT32!LoadTypeLibEx
46 0 [ 2] OLEAUT32!FindTypeLib
72 161 [ 1] OLEAUT32!LoadTypeLibEx
25 0 [ 2] OLEAUT32!OLE_TYPEMGR::LookupTypeLib
89 186 [ 1] OLEAUT32!LoadTypeLibEx
185 0 [ 2] OLEAUT32!GetOffsetOfResource
101 371 [ 1] OLEAUT32!LoadTypeLibEx
79 0 [ 2] OLEAUT32!CreateFileLockBytesOnHFILE
117 450 [ 1] OLEAUT32!LoadTypeLibEx
23 0 [ 2] OLEAUT32!LoadTypeLib2LockBytes
126 473 [ 1] OLEAUT32!LoadTypeLibEx
17 0 [ 2] OLEAUT32!FileLockBytesMemory::Release
133 490 [ 1] OLEAUT32!LoadTypeLibEx
156 0 [ 2] OLEAUT32!OLE_TYPEMGR::TypeLibLoaded
145 646 [ 1] OLEAUT32!LoadTypeLibEx
15 0 [ 2] OLEAUT32!UninitLoadInfo
155 661 [ 1] OLEAUT32!LoadTypeLibEx
5 0 [ 2] OLEAUT32!__security_check_cookie
157 666 [ 1] OLEAUT32!LoadTypeLibEx
9 823 [ 0] OLEAUT32!LoadTypeLib

It was clear from the trace that LoadTypeLib created an ILockBytes over the .TLB file and promptly provided it to LoadTypeLib2LockBytes. Unfortunately, neither this internal function nor any other leading to its functionality is exported from the OLE automation library. The binary type information parser is not accessible externally for in-memory data. What was missing is that LoadTypeLib did not attempt to QueryInterface for ILockBytes when given a moniker, if ITypeLib is not implemented by the object directly. This approach, therefore, had to be scrapped.

I was hoping I could use MIDL to generate binary type information for me and the notion of implementing ITypeLib completely on my own for in-memory representation seemed like a daunting task. If this is the trade-off, surely reverting to ATL and dealing with the evils of registration would be the better approach?

Not so fast. It turns out there is another approach for getting the type information you need for exposing your C++ object, without generating a full-fledged type library or implementing your own type information provider: the marvelous CreateDispTypeInfo API. You provide it with an INTERFACEDATA structure describing your object and get the type information you need. Combined with CreateStdDispatch, it becomes easy to expose simple objects to automation.

Reviewing the sample included in the MSDN documentation of CreateDispTypeInfo is indicative of the sorry state of affairs in Microsoft’s documentation group, seeing as it is quite incomplete and makes use of macros like METHOD0, METHOD1 and PROPERTY, which are nowhere to be found and must have existed in whatever project the sample code has been copy-pasted from. Detailed discussion of the function’s usage is scarce, but existent, on the Web, primarily in newsgroups. Allow me to illustrate with an example. Consider the following hypothetical C++ class one wishes to expose to scripting:

class MyObject
{
public:
    virtual void __stdcall f(int i);
    virtual BOOL __stdcall g(float f);
};

As is evident, this class is pretty plain and certainly has nothing to do with COM. Just the sort of class your existing application with no use of COM might have. To expose it, we need to fill some descriptor structures so type information can be generated for it. We add a few static members:
class MyObject
{
public:
    virtual void __stdcall f(int i);
    virtual BOOL __stdcall g(float f);

    // Descriptor tables used to generate type information.
    static PARAMDATA f_paramData;
    static PARAMDATA g_paramData;
    static METHODDATA methodData[];
    static INTERFACEDATA interfaceData;
};

Let’s fill those babies up:

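// One PARAMDATA per parameter: its name and automation type.
PARAMDATA MyObject::f_paramData = {
    OLESTR("i"), VT_I4
};
PARAMDATA MyObject::g_paramData = {
    OLESTR("f"), VT_R4
};
// METHODDATA fields: name, parameters, DISPID, vtable index,
// calling convention, argument count, invocation flags, return type.
METHODDATA MyObject::methodData[] = {
    { OLESTR("f"), &MyObject::f_paramData, 1, 0, CC_STDCALL, 1, DISPATCH_METHOD, VT_EMPTY },
    { OLESTR("g"), &MyObject::g_paramData, 2, 1, CC_STDCALL, 1, DISPATCH_METHOD, VT_BOOL }
};
// INTERFACEDATA: the method table and its element count.
INTERFACEDATA MyObject::interfaceData = {
    MyObject::methodData,
    sizeof(MyObject::methodData) / sizeof(METHODDATA)
};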

For each method of our object, we describe the method’s parameters, giving them name and type in a PARAMDATA structure. We then fill a method table for the object with complete information, including the parameter data, return value type, calling convention and such. The INTERFACEDATA wraps the whole thing in a nice little package to feed CreateDispTypeInfo with.

We now proceed to create an automation wrapper for our pure object:

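MyObject myobj;  // the plain C++ instance being exposed (assumed to outlive the wrapper)

// Build type information from the static descriptor tables.
CComPtr<ITypeInfo> pMyobjTypeInfo;
HRESULT hr = CreateDispTypeInfo(
    &MyObject::interfaceData,
    LOCALE_SYSTEM_DEFAULT,
    &pMyobjTypeInfo);
// (check hr before continuing)

// Wrap the instance in a standard IDispatch implementation.
CComPtr<IUnknown> pMyobj;
hr = CreateStdDispatch(NULL, &myobj, pMyobjTypeInfo, &pMyobj);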

At this point, pMyobj is a full-fledged COM object implementing IDispatch and wrapping the MyObject class instance myobj, which had no knowledge of COM originally and now bundles tables describing its methods.

The scripting site’s implementation of IActiveScriptSite::GetItemInfo should now return pMyobj, the object’s IUnknown and potential IDispatch, and pMyobjTypeInfo, its ITypeInfo, when requested to do so by the hosted scripting engine. We register the object we wish to expose with the engine:

hr = pActiveScriptEngine->AddNamedItem(
L"myobject",
SCRIPTITEM_ISSOURCE | SCRIPTITEM_ISVISIBLE | SCRIPTITEM_ISPERSISTENT);

If our GetItemInfo does its job when asked for “myobject”, assuming we host the JScript engine, we can now do things like
myobject.f(42);
var b = myobject.g(0.4);

in script code running in our host.

I find this approach to automation object exposition attractive because it is non-intrusive. If desired, the tables describing the exposed class need not be members of the actual class, but can be stored separately. Notice that you do not even have to generate a CLSID for the exposed class. It is also possible to expose only a certain subset of class methods to the scripting environment.

However, maintaining the type information tables can become a clear scalability issue with more complicated classes. For these cases, rolling an automatic code generation solution may be desired, since MIDL’s functionality in this department cannot be reused. The class and its methods could be described in an XML file, and a tool iterating over its DOM or even an XSLT transformation could generate a C++ header file from the description, complete with the INTERFACEDATA information. This would ensure the method tables and the actual method signatures remain synchronized over the extended life-time of the class.
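
As a rough sketch of the generation step, here is a portable function that emits the source text of a METHODDATA table from an in-memory description. The MethodDesc struct and GenerateMethodTable function are hypothetical names of mine, the description vector stands in for whatever the XML file would be parsed into, and the emitted text assumes parameter tables named after each method with a _paramData suffix, as in the hand-written example above:

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical description of one exposed method, e.g. parsed from XML.
struct MethodDesc {
    std::string name;        // method name as seen by script
    int paramCount;          // number of parameters
    std::string returnType;  // VARTYPE constant name, e.g. "VT_BOOL"
};

// Emit the C++ source text of a METHODDATA table for the given class.
// DISPIDs are numbered from 1 and vtable indices from 0 in list order, so
// the descriptions must appear in the same order as the virtual methods.
std::string GenerateMethodTable(const std::string& className,
                                const std::vector<MethodDesc>& methods) {
    std::ostringstream out;
    out << "METHODDATA " << className << "::methodData[] = {\n";
    for (std::size_t i = 0; i < methods.size(); ++i) {
        const MethodDesc& m = methods[i];
        assert(m.paramCount >= 0);
        out << "    { OLESTR(\"" << m.name << "\"), ";
        if (m.paramCount > 0)
            out << "&" << className << "::" << m.name << "_paramData, ";
        else
            out << "NULL, ";  // no parameter table for a no-argument method
        out << (i + 1) << ", " << i << ", CC_STDCALL, " << m.paramCount
            << ", DISPATCH_METHOD, " << m.returnType << " }"
            << (i + 1 < methods.size() ? ",\n" : "\n");
    }
    out << "};\n";
    return out.str();
}
```

Fed with descriptions of f and g, this reproduces the table written by hand earlier. A fuller tool would emit the PARAMDATA arrays and the method declarations from the same description, so the two cannot drift apart.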

Finally, for those interested in this solution, this newsgroup post offers some tips and a few words of caution. Let me add to it that CreateDispTypeInfo only seems to work correctly with the __stdcall calling convention, even if you specify CC_CDECL in your type information. Using CC_STDCALL and making sure your classes use __stdcall made everything work. Before that, symptoms included method arguments receiving seemingly random values when called by the scripting engine, due to stack imbalance.

Hey, I said the approach is lightweight, not the post ;-)

WinHttpRequest performance woes

As I mentioned the other day, the WinHttpRequest automation object provides access from scripting to the WinHTTP client library. As I was looking into it for retrieving relatively large files for a testing scenario (e.g., the Sun JDK’s 80MB installation file) I couldn’t help but notice a clear flaw in its interface. Access to the response body for a request is provided by the ResponseText property, which simply provides a string containing the response. There’s also a similar ResponseBody property which is an array but otherwise identical. Unfortunately, the ResponseStream property, which returns an IStream reference, is not usable from scripting, which only supports dispinterfaces.

This is a problem because receiving large responses from the server requires maintaining their entire body in memory. Clearly, using ResponseText to download a 4 GB DVD ISO isn’t a great idea. So I figured as long as one keeps to reasonable transfer sizes, not massively ballooning the process working set, one should be fine.

It was only a short while later when a friend let me know about a problem he had in his own script making server requests. He was attempting to retrieve a file a few dozen megabytes in size from a web server, but instead of the retrieval completing promptly as expected, the Windows Script Host process became a CPU hog and continuously crept up in memory usage.

As we were trying to reproduce the problem in as simple an environment as possible, we saw that simply downloading a large file seemed to work promptly and as expected. Granted, memory usage was significant due to the property issue mentioned above, but the system had plenty of RAM and this was not an issue.

After some investigation the difference in the problematic scenario was identified. The transfer of the file was provided by a server generated dynamic page, where server code would dynamically write the contents of the requested file into the response stream.

Attaching the debugger to the running WSH process and examining the stack provides a big clue as to what is going on:
0:006> ~0 k 30
ChildEBP RetAddr
0013f004 7c81248c ntdll!RtlReAllocateHeap+0xc36
0013f05c 77506778 kernel32!GlobalReAlloc+0x17a
0013f078 77506802 ole32!CMemStm::SetSize+0x37
0013f09c 4d527444 ole32!CMemStm::Write+0x76
0013f0d8 4d527ca8 WINHTTP!CHttpRequest::ReadResponse+0x12b
0013f108 4d5248d9 WINHTTP!CHttpRequest::Send+0x1b9
0013f170 6fe9fcf0 WINHTTP!CHttpRequest::Invoke+0x359
0013f1ac 6fe9fc5d jscript!IDispatchInvoke2+0xb5
0013f1e8 6fe9fd78 jscript!IDispatchInvoke+0x59
0013f25c 6fea6c3c jscript!InvokeDispatch+0x90
0013f2a0 6fe9fab8 jscript!VAR::InvokeByName+0x1c2
0013f2e0 6fe9efea jscript!VAR::InvokeDispName+0x43
0013f304 6fea6ff4 jscript!VAR::InvokeByDispID+0xfd
0013f3bc 6fea165d jscript!CScriptRuntime::Run+0x16bd
0013f3d4 6fea1793 jscript!ScrFncObj::Call+0x8d
0013f444 6fe8da72 jscript!CSession::Execute+0xa7
0013f494 6fe8beba jscript!COleScript::ExecutePendingScripts+0x147
0013f4b0 0100220a jscript!COleScript::SetScriptState+0xf1
0013f4bc 0100217d cscript!CScriptingEngine::Run+0xb
0013f4d0 01001f34 cscript!CHost::RunStandardScript+0x85
0013f708 010027fc cscript!CHost::Execute+0x1f0
0013fcac 010024de cscript!CHost::Main+0x385
0013ff50 010025e6 cscript!main+0x6d
0013ffc0 7c816fd7 cscript!_mainCRTStartup+0xc4
0013fff0 00000000 kernel32!BaseProcessStart+0x23

As the WinHttpRequest object is reading the response from the server side script, it is naively writing it to an internal OLE memory stream (presumably the product of CreateStreamOnHGlobal). As more and more data is received, the memory stream is resized to accommodate it. We can track the stream’s growth pattern with a suitable breakpoint on the SetSize method. It is documented as receiving a ULARGE_INTEGER, an unsigned 64-bit integer, specifying the stream’s desired size. Therefore, we expect the argument to be 8 bytes above the position of the stack pointer when we enter the function, on x86. A 64-bit integer is also known as a quadword. The debugger command is hence:
bp ole32!CMemStm::SetSize ".printf \"SetSize called with size %d\", qwo(@esp + 8); .echo; gc"

If we promptly resume execution and let the script run for a while we see something similar to:
SetSize called with size 10650474
SetSize called with size 10658666
SetSize called with size 10666858
SetSize called with size 10675050
SetSize called with size 10683242

And so on… some quick arithmetic indicates that successive SetSize calls are 8,192 bytes apart.

Resizing the stream by 8KB at a time for a file dozens of megabytes in size is clearly not a winning strategy. As can be seen in the initial stack trace, this results in a heap reallocation frenzy. As contiguous room for block resize on the heap is exhausted, the stream resize must be satisfied by allocating a whole new block and moving all the previous stream data there. Each such move copies everything received so far, so the total work grows roughly quadratically with the size of the response and the whole process keeps slowing down as the stream grows. Since the block in question is a big one, it resides in a “virtual block” on the heap (a block of address space provided by VirtualAlloc rather than a smaller one from the heap’s free lists) and is presumably moved whenever there is not enough contiguous address space at the virtual block’s present position to extend it.
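
To put rough numbers on this, here is a small portable model of the copying cost. It makes the pessimistic assumption that every resize moves the existing contents, which a real heap will not always do, and the function names are mine. It compares fixed-size growth steps with the common alternative of doubling the capacity:

```cpp
#include <cassert>
#include <cstdint>

// Worst-case bytes copied while a buffer grows to finalSize in fixed steps,
// assuming every resize moves the contents accumulated so far.
std::uint64_t BytesCopiedFixedStep(std::uint64_t finalSize, std::uint64_t step) {
    assert(step > 0);
    std::uint64_t copied = 0;
    for (std::uint64_t size = 0; size < finalSize; size += step)
        copied += size;  // move the current contents, then extend by one step
    return copied;
}

// Same model, but doubling the capacity whenever it is exhausted.
std::uint64_t BytesCopiedDoubling(std::uint64_t finalSize, std::uint64_t initial) {
    assert(initial > 0);
    std::uint64_t copied = 0;
    for (std::uint64_t cap = initial; cap < finalSize; cap *= 2)
        copied += cap;  // move a full buffer into one twice as large
    return copied;
}
```

Under this model, growing to 64 MB in 8,192-byte steps copies 274,844,352,512 bytes in total (roughly 256 GB), while doubling from an initial 8,192 bytes copies 67,100,672 bytes (roughly 64 MB). The real figure for the fixed-step case depends on how often the heap can extend the block in place, but the gap between quadratic and linear total cost is the point.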

Let us examine the heap to confirm this. We can suspend execution at ole32!CMemStm::SetSize and proceed to kernel32!GlobalReAlloc. The first parameter to the latter is the handle to the heap allocation:
Breakpoint 0 hit
eax=00000000 ebx=00000000 ecx=006f5fa4 edx=774eee94 esi=00184790 edi=00002000
eip=77506741 esp=0013f07c ebp=0013f09c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
ole32!CMemStm::SetSize:
77506741 8bff mov edi,edi
0:000> bp kernel32!GlobalReAlloc
0:000> g
Breakpoint 1 hit
eax=00184d98 ebx=00000000 ecx=006f5fa4 edx=00184a14 esi=00184790 edi=006f5fa4
eip=7c8123b9 esp=0013f060 ebp=0013f078 iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000206
kernel32!GlobalReAlloc:
7c8123b9 6a24 push 24h
0:000> kb 1
ChildEBP RetAddr Args to Child
0013f05c 77506778 008e000c 006f5fa4 00002002 kernel32!GlobalReAlloc

If we proceed to examine the process default heap (at 0x00150000 in my case) and its virtual allocation list (we can ignore allocations from the linked lists below, they are for much smaller allocations) we can see the interesting block:
0:000> !heap -a 00150000
Index Address Name Debugging options enabled
1: 00150000
Segment at 00150000 to 00250000 (00035000 bytes committed)
Segment at 016D0000 to 017D0000 (00001000 bytes committed)
Flags: 00000002
ForceFlags: 00000000
Granularity: 8 bytes
Segment Reserve: 00200000
Segment Commit: 00002000
DeCommit Block Thres:00000200
DeCommit Total Thres:00002000
Total Free Size: 000001ee
Max. Allocation Size:7ffdefff
Lock Variable at: 00150608
Next TagIndex: 0000
Maximum TagIndex: 0000
Tag Entries: 00000000
PsuedoTag Entries: 00000000
Virtual Alloc List: 00150050
017d0000: 006f4000 [6f3fa4] - busy (2b) (Handle 008e000c), user flags (1)
...

If we let execution proceed for a while and examine the heap again, we see something like:
0:000> !heap -a 00150000
...
Virtual Alloc List: 00150050
02080000: 008ac000 [8abfa4] - busy (2b) (Handle 008e000c), user flags (1)
...
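
The hexadecimal values in the output above can be checked with a few lines of Python. This is only a sketch of the arithmetic. It assumes the bracketed number in the Virtual Alloc List is the block's current user size and that the first heap listing was taken at the GlobalReAlloc breakpoint.

```python
# Values copied from the debugger output above (all hexadecimal).
requested_size = 0x006F5FA4  # second argument to GlobalReAlloc
first_size = 0x6F3FA4        # bracketed size in the first heap listing
later_size = 0x8ABFA4        # bracketed size in the second heap listing

# How much bigger is the requested size than the current block?
growth = requested_size - first_size

# How many 8 KB increments separate the two heap snapshots?
steps_between_snapshots = (later_size - first_size) // 0x2000

print(hex(growth), growth)
print(steps_between_snapshots)
```

If those assumptions hold, the requested size is exactly one 8 KB increment above the current size, and the two snapshots are a whole number of such increments apart.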

Notice the virtual allocation for the global heap handle has moved elsewhere in the interim. Originally at virtual address 0x017d0000, it is now at 0x02080000. Further examination shows that this happens repeatedly. In fact, examining the position of the block between calls of GlobalReAlloc shows how it alternates between a small set of virtual addresses, back and forth, as its size is modified. For a reason unclear without diving into deeper heap and virtual allocation internals, the virtual address block seems to change its location even when it can be extended to the new size at its present position, unlike what I speculated above.

Examining the server response headers for the static and dynamic cases provides insight into the difference in performance. In the static case, the server knows the size of the response in advance and can provide the client with a Content-Length header. The client can use the header to allocate a block, huge as it may be, in one fell swoop. Despite the slight memory pressure, this happens practically instantaneously.

In the dynamic case, the server cannot guess how much output the page is going to feel like providing. No Content-Length header is emitted, and the client’s naive stream memory allocation strategy comes to light and kills performance.

While WinHttpRequest’s naive reliance on COM’s memory stream allocation strategy is primarily at fault for this issue, the heap manager’s own deficiencies should not be overlooked. I believe its behavior for large block reallocations (virtual blocks) could be somewhat improved to mitigate inefficient clients.

This issue can be mitigated at the server side by ensuring a Content-Length header is always provided in responses. This can be tricky to calculate and I haven’t looked into whether lying (i.e., providing a Content-Length longer than the actual response just to get the client to preallocate memory) works correctly.
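
One honest way to produce the header is to buffer the dynamic output before sending anything. The following is a minimal, framework-neutral sketch in Python. The names build_response and generate_page are hypothetical, and a real server would use whatever response buffering its platform offers.

```python
def build_response(chunks):
    """Buffer dynamically generated chunks so the length is known up front."""
    body = b"".join(chunks)
    headers = [
        ("Content-Type", "application/octet-stream"),
        ("Content-Length", str(len(body))),
    ]
    return headers, body


def generate_page():
    # Hypothetical stand-in for a dynamic page that produces output piecemeal
    for i in range(3):
        yield b"row %d\n" % i


if __name__ == "__main__":
    headers, body = build_response(generate_page())
    print(headers)
```

The trade-off is that the server now holds the whole response in memory and cannot start sending until the page has finished generating it.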

While I was looking into this issue, I was hoping to take advantage of Visual Studio 2005 Team Edition’s integrated profiler. I assumed it would be ideal for detecting and clearly illustrating an obvious CPU hog such as this case. However, after a short examination, I must express my disappointment.

Visual Studio 2005’s profiler, like most other profilers, has two modes of operation: Sampling and Instrumentation. Sampling basically means examining thread stacks at a regular interval and noting the current execution state. Instrumentation means binary patching of functions with code that records that execution flow has reached them. Instrumentation is therefore a means of “zooming in” on functions of interest, since it is more precise but carries greater overhead.
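
As a loose analogy for the instrumentation idea, and not a description of how the Visual Studio profiler works, here is a small Python sketch that wraps a function with bookkeeping code. The names instrumented, stats and resize are all hypothetical. A native profiler patches binaries rather than wrapping callables, but the principle of recording each entry into a function of interest is the same.

```python
import functools
import time
from collections import defaultdict

# Per-function call counts and accumulated wall-clock time
stats = defaultdict(lambda: {"calls": 0, "seconds": 0.0})


def instrumented(func):
    """Wrap func so that every call is counted and timed."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            entry = stats[func.__name__]
            entry["calls"] += 1
            entry["seconds"] += time.perf_counter() - start
    return wrapper


@instrumented
def resize(buffer, extra):
    # Hypothetical function of interest: grows a buffer by copying it
    return buffer + bytes(extra)


if __name__ == "__main__":
    data = b""
    for _ in range(100):
        data = resize(data, 8192)
    print(dict(stats))
```

Every call pays for the extra bookkeeping, which is the overhead that makes instrumentation better suited to a handful of suspect functions than to a whole process.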

I initially tried the Sampling mode. Examining its output in the impressive GUI, Performance Explorer, I noticed the sample didn’t clearly capture what was going on. I figured by instrumenting the functions I was suspicious of (the heap reallocator, etc.) a clear graphical illustration would be possible.

It was then that I discovered that the profiler’s Instrumentation mode requires full debugging information. Since I was profiling native code that is not my own, running under WSH, all I had were public PDBs. Instrumentation refused to work in this scenario, making it useless for cases like this.

This is too bad because other profilers seem content with instrumenting just exports or functions as they are indicated in public PDBs. I hope to see this as a feature in future versions of the Visual Studio profiler, since I like the slick Performance Explorer.

For those of you implementing memory stream classes and the like, keep in mind that sometimes an imperialistic memory allocation strategy is a good thing.
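
To make that concrete, here is a minimal sketch of such a strategy in Python. GrowableBuffer is a hypothetical class of my own, not any real stream implementation. It accepts an optional size hint, such as a Content-Length value, and otherwise doubles its capacity whenever it runs out of room.

```python
class GrowableBuffer:
    """Memory stream sketch that grows geometrically instead of by a fixed step."""

    MIN_CAPACITY = 8192

    def __init__(self, expected_size=0):
        # Preallocate when the final size is known in advance
        self._data = bytearray(expected_size)
        self._length = 0
        self.reallocations = 0

    def reserve(self, capacity):
        """Ensure room for at least `capacity` bytes, at least doubling if growing."""
        current = len(self._data)
        if capacity > current:
            new_capacity = max(capacity, 2 * current, self.MIN_CAPACITY)
            self._data.extend(bytes(new_capacity - current))
            self.reallocations += 1

    def write(self, chunk):
        end = self._length + len(chunk)
        self.reserve(end)
        self._data[self._length:end] = chunk
        self._length = end

    def getvalue(self):
        return bytes(self._data[:self._length])


if __name__ == "__main__":
    buf = GrowableBuffer()
    for _ in range(1024):
        buf.write(b"x" * 8192)
    print(len(buf.getvalue()), buf.reallocations)
```

The price is some wasted capacity at the end of the buffer, which is usually a fair trade for cutting the number of reallocations from thousands to about a dozen.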

Think before you act, allocate before you write.

No SxS love from the Windows Script Host?

I was automating a scenario with a WSH script the other day that required interaction with a web server. So naturally I figured I’d make use of the WinHttpRequest automation object which wraps the WinHTTP API.

Those familiar with WSH may share my great distaste for the fact that when it functions as an automation controller, the developer is expected to hard-code enumeration constants and the like (as “vars” in JScript or “Consts” in VBScript). I first encountered this ridiculous limitation when a friend showed me how he translated C# code that automated Microsoft Word to VBScript, and had to look up the various constants by hand, with Visual Basic 6’s Object Browser, which functions as a convenient type library viewer. By default, the script engine only uses the automation object’s IDispatch interface, leaving the chore of constant resolution to the caller.

So I was relieved when I found out that .wsf files, which are XML files that wrap scripts executed by WSH, support referencing a type library for the purpose of making available the constants used with a controlled automation object. I was a little disappointed to find out about their not so ideal performance characteristics, but that was not problematic in my case.

I figured I’d introduce a reference of the following form to the script:
<reference object="WinHttp.WinHttpRequest.5.1"/>

Not all was well, however. After introducing the change above, I noticed my script had stopped working on one of the systems. Invocation of the Windows Script Host failed, with WSH claiming to be unable to resolve the reference to the specified ProgID.

After looking into it I figured out the problem with that specific system was that it was running Windows Server 2003 rather than Windows XP. It seemed strange the newer Windows Server 2003 would have a regression like that. I continued investigating.

The first clue was that winhttp.dll, the DLL implementing the WinHTTP API, was MIA from Windows Server 2003’s system32 directory. Surely the API was not missing from the OS; MSDN clearly documents its presence. It was indeed there, albeit in a modified form: a native side by side assembly.

OK, so winhttp.dll is there, in an oddly named subdirectory somewhere in the winsxs store instead of system32. Still, I recalled from my previous interaction with SxS that side by side assemblies could expose COM objects to their clients. Examination of the manifest file for WinHTTP in Windows Server 2003’s winsxs store revealed that it was indeed doing so.

Microsoft documents that users of the flat WinHTTP C API under Windows Server 2003 should add winhttp.dll as a dependent assembly to the activation context of the client application, but this approach seemed inappropriate to me in the context of the WinHttpRequest automation object, since clients activate it by ProgID or GUID and do not load winhttp.dll directly. Making them aware of this relationship would be a serious breach of COM’s encapsulation.

I proceeded to write a test application in C++. It initialized COM and called CLSIDFromProgID to translate “WinHttp.WinHttpRequest.5.1” to a CLSID. If the translation succeeded, it would call CoCreateInstance on the returned CLSID and, if that worked out, QueryInterface for IDispatch and for IWinHttpRequest (defined in the Windows SDK’s httprequest.idl).
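
My test application was written in C++, but the first step, the ProgID lookup, can be sketched in Python with ctypes. This is a rough sketch that runs only on Windows, not the original code. The helper names guid_to_str and clsid_from_progid are hypothetical, and the CoCreateInstance and QueryInterface steps are left out.

```python
import ctypes
import sys


class GUID(ctypes.Structure):
    """Binary layout of a Windows GUID."""
    _fields_ = [
        ("Data1", ctypes.c_uint32),
        ("Data2", ctypes.c_uint16),
        ("Data3", ctypes.c_uint16),
        ("Data4", ctypes.c_ubyte * 8),
    ]


def guid_to_str(guid):
    """Format a GUID in the usual registry notation."""
    tail = bytes(guid.Data4)
    return "{%08X-%04X-%04X-%s-%s}" % (
        guid.Data1, guid.Data2, guid.Data3,
        tail[:2].hex().upper(), tail[2:].hex().upper(),
    )


def clsid_from_progid(progid):
    """Ask the COM runtime to translate a ProgID into a CLSID string."""
    if sys.platform != "win32":
        raise OSError("COM is only available on Windows")
    ole32 = ctypes.oledll.ole32  # oledll raises OSError on a failed HRESULT
    ole32.CoInitialize(None)
    try:
        clsid = GUID()
        ole32.CLSIDFromProgID(ctypes.c_wchar_p(progid), ctypes.byref(clsid))
        return guid_to_str(clsid)
    finally:
        # CoUninitialize returns nothing, so call it without HRESULT checking
        ctypes.windll.ole32.CoUninitialize()


if __name__ == "__main__" and sys.platform == "win32":
    print(clsid_from_progid("WinHttp.WinHttpRequest.5.1"))
```

A failed lookup surfaces as an OSError carrying the HRESULT, which is a convenient way to see whether the COM runtime can resolve the ProgID on a given system.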

To my great surprise, I must admit, the test application worked. The first surprising thing was that CLSIDFromProgID returned successfully, even though I specified a ProgID exposed by a SxS assembly. The ProgID was clearly absent from the HKEY_CLASSES_ROOT registry key in Windows Server 2003, in contrast to its presence there in Windows XP. Only if ole32.dll, the COM runtime, had specific knowledge of SxS and the ability to perform a lookup in the winsxs store would such a request be serviced successfully, I figured. However, no mention of this functionality could be found directly in CLSIDFromProgID’s documentation.

I was even more surprised that the CLSID returned by CLSIDFromProgID as the result of the lookup was NOT the CLSID of winhttp.dll! I couldn’t find the returned CLSID in the registry. However, when I promptly invoked CoCreateInstance, not only did the activation request succeed, I actually saw a Module Load event for winhttp.dll from the winsxs store in the debugger. I assume that the returned CLSID is part of some COM SxS integration magic.

OK, so my poor man’s automation controller implemented in C++ could obviously activate the WinHttpRequest object even in Windows Server 2003 with no knowledge of its new SxS semantics. It seemed odd that my script would fail to do the same, since I assumed similar mechanics were behind its resolution process for locating the type library.

The next thing I did was to try and run my script on Windows Vista. I figured the change to WinHTTP making it a SxS assembly introduced in Windows Server 2003 was incorporated into Microsoft’s latest OS, as well. Continuing the previous chain of surprises, the script suddenly worked.

The first difference between Windows Server 2003 and Windows Vista that I observed was that Windows Script Host was updated to version 5.7 in the new OS. My first theory was that the new WSH had corrected whatever implementation issue prevented WSH 5.6 from locating SxS type libraries.

I looked it up and found out that only days earlier Microsoft had actually made a release of the new Windows Script Host 5.7 to down-level platforms. Untypically for Microsoft nowadays, they even made a release for Windows 2000. So now I had a chance to test my theory. I installed WSH 5.7 on the Windows Server 2003 system and reran the script. In yet another surprise, it didn’t work, the type library reference giving the same error as before. It seems my instincts are really off about all of this.

So there must be a different reason for the different behavior of Windows Server 2003 and Windows Vista. After examining the Vista system, it appeared the whole thing was a lot simpler than I had originally thought. Windows Vista was a strange hybrid of the Windows XP and Windows Server 2003 behaviors, with winhttp.dll being present both as a SxS assembly in its winsxs store and as a regular DLL in system32. Indeed, examination of HKEY_CLASSES_ROOT in the Vista registry resulted in the discovery of plain old ProgID registration for the non-SxS winhttp.dll. This is most likely the reason that the type library lookup succeeds in the Windows Vista system.

With these details at hand, I was finally able to find a discussion of this issue in a newsgroup. In that newsgroup thread, Microsoft’s Biao Wang acknowledges WSH’s lack of support for SxS type library references. The thread being an old one, the possibility of a fix being introduced in Windows Server 2003 Service Pack 1 was mentioned. However, considering the issue presented itself on the Windows Server 2003 system that had Service Pack 2 installed and that the latest WSH 5.7 still doesn’t support this down-level, it appears that the issue ended up remaining unresolved, for whatever consideration Microsoft had made on the matter.

The thread does mention a satisfactory workaround: reference the SxS type library by GUID and version instead of by object ProgID and it seems to work. I tried referencing the type library by GUID when the ProgID approach didn’t work on Windows Server 2003 originally, but that reference didn’t work either since I left out the “version” directive. Another happy ending.

Lovers of type library constant imports, rejoice!

Mysterious disappearing acts by samples from Microsoft SDKs

I was exploring the use of the COM type library marshaller as an alternative to a MIDL-generated marshaller a while back (as mentioned at the bottom of my earlier COM post). The type library marshaller, like the rest of COM, is usually inclined towards registration (you specify the TypeLib for an interface in the registry to let COM know how to marshal it and specify the marshaller’s GUID as the ProxyStubClsid32 instead of some MIDL-generated class in your custom DLL) – but can actually be used quite comfortably without it. In my search for a solution I stumbled upon Don Box’s old MSJ article that provides a rather exhaustive description of the whole deal. Evidently, the TypeLib entry in the registry is just an accessory and the interesting part is actually the rpcrt4.dll function duo, CreateProxyFromTypeInfo and CreateStubFromTypeInfo.

These APIs receive an ITypeInfo reference and use the binary type information to synthesize an /Oicf style marshaller, dynamically. It is quite a feat. Consider the possibilities. An application could marshal arbitrary interface references and not be bound just to a known subset for which it has proxy/stub implementations statically linked. Microsoft recognized how messy linking a MIDL-generated proxy/stub factory to a non-C++ project can be and included the Marshaler sample in the .NET 1.1 SDK. This is a C# sample that uses P/Invoke to call the type library marshaller and thus avoids MIDL integration. When I looked over the sample, I was amazed to see it use the undocumented, underground CreateProxyFromTypeInfo and CreateStubFromTypeInfo APIs with not even a sign of apology.

As I later found out, I was simply fortunate to happen to be on a machine with the old .NET 1.1 SDK installed. The sample, along with others, has simply disappeared from the .NET 2.0 SDK and its successor, the Windows SDK. Contrary to what one might expect, it hasn’t been replaced by anything similar that illustrates what it was meant to.

Unfortunately, this is merely one case in a series of disappearing acts by Microsoft SDK samples. As I was exploring the use of RPC as an alternative to the more heavyweight COM, I looked for an SDK sample on the memory allocation and free semantics of [in, out] parameters in MIDL-generated RPC stubs. I found a list of the RPC samples in this MSDN page and figured the listed DYNOUT sample was exactly what I needed. Unfortunately, this time I was on a machine with only the latest Windows SDK installed. A search through Samples\NetDs\rpc quickly resulted in disappointment as the sample was obviously gone forever. This time, unlike the Marshaler sample case, there didn’t seem to be an obvious reason for the sample’s demise. I was able to piece together the answer I was looking for through a combination of trial and error and examination of those samples that did remain. It remains unclear why the now-defunct sample is still listed in MSDN.

Another trigger for filing a Missing Sample’s Report with the authorities was given to me the other day when a friend asked me about a WMI client sample. This one was a part of the Windows Server 2003 DDK but was, again, mysteriously absent from the new Windows Driver Kit. Considering that the WDK and its huge DVD contain just about anything (so much that Microsoft decided to trim it back down to ~700MB by refactoring out the Windows Logo testing stuff, etc.), it seems bizarre that this sample of all things would be cut out.

If the motivation for the removal of these and other samples is some sort of quality bar (like the highly unwelcome removal of the trusty Dependency Walker from the Windows SDK because it doesn’t meet some sort of “quality” guideline) then it seems to me the cure is far worse than the disease. A poor sample is better than no sample at all, except perhaps when it comes to security, which doesn’t seem to be the case here. The disappearance of samples is a great disservice to users of the various SDKs.