A lightweight approach for exposing C++ objects to a hosted Active Scripting engine

Microsoft’s Active Scripting architecture allows application developers to host the same implementations of the JScript and VBScript scripting languages that Internet Explorer uses for scripts in HTML pages, that Active Server Pages (the old, pre-.NET implementation) uses for server-side dynamic content and that the Windows Script Host uses for independent scripts. Additionally, third-party scripting engines can be, and have been, developed for Python, Perl and other interpreted languages.

Hosting a scripting engine involves implementing the IActiveScriptSite interface and passing script code to the engine through its IActiveScript and IActiveScriptParse interfaces, all of which is extensively documented in the literature. Therefore, I shall not discuss the mechanics of hosting itself and will elaborate only on exposing objects from the host to the engine.

Enabling scripting in your application only adds value over the external Windows Script Host if you expose unique, internal application functionality to the hosted scripts. If your application already exposes its functionality as COM automation objects to automation controllers that can be used out-of-process, there isn’t much point in hosting. However, if your application is document-oriented, for example, providing scripts with access to the document context can be very useful to your users.

A scripting host can make its object model available to hosted scripts by providing the engine with an IDispatch interface for each object it wants to make available. This interface is the foundation of OLE automation and is used by the scripting languages for late binding.

Since the IDispatch interface is basically a rather raw reflection mechanism, implementing it from scratch for a moderately complex object is tedious and error-prone.
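To make the "raw reflection" point concrete, here is a minimal, portable sketch of the two steps late binding boils down to: resolve a name to a numeric id, then call by id with loosely typed arguments. It is a model only. LateBound, DispId, Args and makeDemoObject are names I made up for illustration; the real interface deals in DISPIDs, VARIANTs and HRESULTs, and has considerably more corner cases.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Portable stand-ins for illustration; none of these are real COM types.
using DispId = int;
using Args = std::vector<double>;

class LateBound {
public:
    // Register a callable under a name and a numeric id.
    void expose(const std::string& name, DispId id,
                std::function<double(const Args&)> fn) {
        ids_[name] = id;
        methods_[id] = std::move(fn);
    }

    // Step 1 of late binding: resolve a name to an id (compare GetIDsOfNames).
    DispId idOf(const std::string& name) const {
        auto it = ids_.find(name);
        if (it == ids_.end()) throw std::out_of_range("unknown name: " + name);
        return it->second;
    }

    // Step 2: call by id with loosely typed arguments (compare Invoke).
    double invoke(DispId id, const Args& args) const {
        auto it = methods_.find(id);
        if (it == methods_.end()) throw std::out_of_range("unknown id");
        return it->second(args);
    }

private:
    std::map<std::string, DispId> ids_;
    std::map<DispId, std::function<double(const Args&)>> methods_;
};

// Hypothetical helper: builds an object exposing a single method, "twice".
inline LateBound makeDemoObject() {
    LateBound obj;
    obj.expose("twice", 1, [](const Args& a) { return a.at(0) * 2.0; });
    return obj;
}
```

Even this toy version needs a name table, an id table and argument checking for every method, which hints at why writing the real thing by hand for each class gets old quickly.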

If your application already implements COM objects regardless of scripting, it probably already makes use of a framework for doing so, be it ATL, MFC or the CLR. In that case, you have already paid the framework tax and implementing another interface is no challenge. Specifically, ATL offers the convenient IDispatchImpl class for implementing dual interfaces while the CLR makes it ridiculously simple to implement dispatch interfaces (by default, a .NET class is also a COM dispatch object).

However, a dependency on the CLR might not be a welcome requirement. Similarly, complicating a substantial existing code base with the tedium of COM class registration is an adventure that may not be suitable for the faint of heart. If you do not wish to expose automation objects to external clients like WSH, you have no need or desire to modify the registry and maintain that information across installations, uninstallations, upgrades and the like.

Unfortunately, neither ATL nor MFC goes to any reasonable lengths to facilitate the implementation of internal, unregistered COM objects. The IDispatchImpl class requires that you provide it with type information for your dual interface, but ATL’s only ITypeInfo wrapper, CComTypeInfoHolder, is oriented towards retrieving that from a type library residing in a file, either an independent .TLB or an embedded resource in your .EXE or .DLL file. This means that to expose an object, you need to describe it in IDL, have your build process generate a .TLB for it with MIDL and possibly embed it as a resource using RC. At run time, you need to take care of the logistics of interface and type library registration. All of this for what you only want as internal functionality.

Apart from being tedious, that approach is also rigid and static. Type information stored as an embedded resource is fixed at build time, so it does not go well with runtime decisions that could change what your exposed objects look like.

I considered what it would take to come up with binary type information from a source that isn’t a file or a resource. At first glance, the LoadTypeLib API is definitely file-oriented. However, a light bulb turned on in my head when I noticed that if the file name given does not exist, the string is treated as a moniker. I was hoping I could generate binary type information in .TLB format from IDL, store it in a flexible manner and provide LoadTypeLib with a moniker to the type information. I then paused as I realized there was an unanswered question – “a moniker to what?”. As is not uncommon in Microsoft’s documentation, elaboration on this point was scarce. I later found this newsgroup post on the matter. The original poster had the same question as mine and the reply pointed me in the right direction.

Although the responder was incorrect in assuming that the pointer moniker actually implements IMoniker::GetDisplayName, a deficiency for which I can find no excuse, the OBJREF moniker provides a suitable alternative. The OBJREF moniker is a superset of the pointer moniker that also supports out-of-process references. I have no need for that capability here, only for a display name to feed LoadTypeLib.

I promptly implemented a skeleton IUnknown that would simply print what interface was requested on every call to QueryInterface and then return E_NOINTERFACE. I created an OBJREF moniker for this IUnknown implementation and supplied LoadTypeLib with the moniker’s display name. This way, I reasoned, I would find out what LoadTypeLib expects the supplied object to implement when it is given a moniker instead of a file name.
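The probing trick is simple enough to sketch. What follows is a portable model rather than the code I ran: HResult, InterfaceId and ProbeObject are stand-ins I am assuming here so the example builds without the Windows headers, and a real probe would derive from IUnknown, compare IIDs and also handle AddRef and Release.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Portable stand-ins for HRESULT and IID; assumptions for illustration only.
using HResult = std::int32_t;
constexpr HResult kNoInterface = static_cast<HResult>(0x80004002); // numeric value of E_NOINTERFACE

struct InterfaceId {
    std::string name; // a real IID is a GUID; a readable name keeps the sketch short
};

class ProbeObject {
public:
    // Log the requested interface, then refuse it. Whatever the caller
    // asks for shows up in the log, which is the whole point of the probe.
    HResult queryInterface(const InterfaceId& iid, void** out) {
        requested_.push_back(iid.name);
        if (out) *out = nullptr; // COM rule: clear the out pointer on failure
        return kNoInterface;
    }

    // Everything that has been asked of this object so far, in order.
    const std::vector<std::string>& requested() const { return requested_; }

private:
    std::vector<std::string> requested_;
};
```

Handing such an object to an API and reading the log afterwards is a cheap way to learn what the API expects when the documentation stays silent.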

I was disappointed when I saw what happened next – LoadTypeLib was asking my object for an ITypeLib implementation, and nothing else. This basically means that LoadTypeLib’s moniker support is completely useless – it returns an ITypeLib for an ITypeLib you already have.

My next attempt to tap into the existing binary type information parser involved a test program that called LoadTypeLib on a .TLB file. I wanted to find out whether the loader first read the file into memory and then handed the in-memory data to some intermediate function that I could also call myself. I examined the type library loader’s high-level flow using WinDbg:
0:000> bp oleaut32!LoadTypeLib
0:000> g
Breakpoint 2 hit
eax=0012ff00 ebx=7ffda000 ecx=81818d85 edx=10313d00 esi=0012fdc8 edi=0012ff5c
eip=771279e5 esp=0012fdbc ebp=0012ff68 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
OLEAUT32!LoadTypeLib:
771279e5 8bff mov edi,edi
0:000> wt -m oleaut32 -l 2 -ns
Tracing OLEAUT32!LoadTypeLib to return address 004117d0
7 0 [ 0] OLEAUT32!LoadTypeLib
43 0 [ 1] OLEAUT32!LoadTypeLibEx
16 0 [ 2] OLEAUT32!InitLoadInfo
44 16 [ 1] OLEAUT32!LoadTypeLibEx
65 0 [ 2] OLEAUT32!InitAppData
51 81 [ 1] OLEAUT32!LoadTypeLibEx
9 0 [ 2] OLEAUT32!LHashValOfNameSys
58 90 [ 1] OLEAUT32!LoadTypeLibEx
25 0 [ 2] OLEAUT32!OLE_TYPEMGR::LookupTypeLib
66 115 [ 1] OLEAUT32!LoadTypeLibEx
46 0 [ 2] OLEAUT32!FindTypeLib
72 161 [ 1] OLEAUT32!LoadTypeLibEx
25 0 [ 2] OLEAUT32!OLE_TYPEMGR::LookupTypeLib
89 186 [ 1] OLEAUT32!LoadTypeLibEx
185 0 [ 2] OLEAUT32!GetOffsetOfResource
101 371 [ 1] OLEAUT32!LoadTypeLibEx
79 0 [ 2] OLEAUT32!CreateFileLockBytesOnHFILE
117 450 [ 1] OLEAUT32!LoadTypeLibEx
23 0 [ 2] OLEAUT32!LoadTypeLib2LockBytes
126 473 [ 1] OLEAUT32!LoadTypeLibEx
17 0 [ 2] OLEAUT32!FileLockBytesMemory::Release
133 490 [ 1] OLEAUT32!LoadTypeLibEx
156 0 [ 2] OLEAUT32!OLE_TYPEMGR::TypeLibLoaded
145 646 [ 1] OLEAUT32!LoadTypeLibEx
15 0 [ 2] OLEAUT32!UninitLoadInfo
155 661 [ 1] OLEAUT32!LoadTypeLibEx
5 0 [ 2] OLEAUT32!__security_check_cookie
157 666 [ 1] OLEAUT32!LoadTypeLibEx
9 823 [ 0] OLEAUT32!LoadTypeLib

It was clear from the trace that LoadTypeLib, by way of LoadTypeLibEx, created an ILockBytes over the .TLB file and promptly provided it to LoadTypeLib2LockBytes. Unfortunately, neither this internal function nor any other leading to its functionality is exported from the OLE automation library, so the binary type information parser is not accessible externally for in-memory data. The moniker path offers no way in either: when the object behind the moniker does not implement ITypeLib directly, LoadTypeLib does not fall back to asking it for ILockBytes. This approach, therefore, had to be scrapped.

I was hoping I could use MIDL to generate binary type information for me and the notion of implementing ITypeLib completely on my own for in-memory representation seemed like a daunting task. If this is the trade-off, surely reverting to ATL and dealing with the evils of registration would be the better approach?

Not so fast. It turns out there is another approach for getting the type information you need for exposing your C++ object, without generating a full-fledged type library or implementing your own type information provider: the marvelous CreateDispTypeInfo API. You provide it with an INTERFACEDATA structure describing your object and get the type information you need. Combined with CreateStdDispatch, it becomes easy to expose simple objects to automation.

Reviewing the sample included in the MSDN documentation of CreateDispTypeInfo is indicative of the sorry state of affairs in Microsoft’s documentation group: it is quite incomplete and makes use of macros like METHOD0, METHOD1 and PROPERTY, which are nowhere to be found and must have existed in whatever project the sample code was copied from. Detailed discussion of the function’s usage is scarce on the Web, but it does exist, primarily in newsgroups. Allow me to illustrate with an example. Consider the following hypothetical C++ class one wishes to expose to scripting:

class MyObject
{
public:
    virtual void __stdcall f(int i);
    virtual BOOL __stdcall g(float f);
};

As is evident, this class is pretty plain and certainly has nothing to do with COM. Just the sort of class your existing application with no use of COM might have. To expose it, we need to fill some descriptor structures so type information can be generated for it. We add a few static members:
class MyObject
{
public:
    virtual void __stdcall f(int i);
    virtual BOOL __stdcall g(float f);

    // Descriptor tables used to generate type information for this class.
    static PARAMDATA f_paramData;
    static PARAMDATA g_paramData;
    static METHODDATA methodData[];
    static INTERFACEDATA interfaceData;
};

Let’s fill those babies up:

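// One PARAMDATA per parameter: its name and automation type.
PARAMDATA MyObject::f_paramData = {
    OLESTR("i"), VT_I4
};
PARAMDATA MyObject::g_paramData = {
    OLESTR("f"), VT_R4
};
// Fields: name, parameters, DISPID, vtable index, calling convention,
// argument count, invocation flags, return type.
METHODDATA MyObject::methodData[] = {
    { OLESTR("f"), &MyObject::f_paramData, 1, 0, CC_STDCALL, 1, DISPATCH_METHOD, VT_EMPTY },
    { OLESTR("g"), &MyObject::g_paramData, 2, 1, CC_STDCALL, 1, DISPATCH_METHOD, VT_BOOL }
};
// The method table and the number of entries in it.
INTERFACEDATA MyObject::interfaceData = {
    MyObject::methodData,
    sizeof(MyObject::methodData) / sizeof(METHODDATA)
};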

For each method of our object, we describe the method’s parameters, giving each a name and a type in a PARAMDATA structure. We then fill a method table for the object with complete information, including the parameter data, DISPID, vtable index, calling convention, argument count and return value type. The INTERFACEDATA wraps the whole thing in a nice little package to feed CreateDispTypeInfo with.
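Since these tables are filled in by hand, a small sanity check can catch slips before the scripting engine does. The sketch below is a portable model and an assumption on my part, not something the automation library offers: MethodRow mirrors only a few METHODDATA fields, and checkTable is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Portable mirror of the METHODDATA fields this check looks at (hypothetical).
struct MethodRow {
    std::string name;
    int dispid;                  // not checked: a property's get and put share one DISPID
    unsigned vtableIndex;        // iMeth: slot of the method in the vtable
    unsigned argCount;           // cArgs
    std::size_t paramsDescribed; // number of PARAMDATA entries backing this row
};

// Returns an empty string when the table looks consistent, else a reason.
std::string checkTable(const std::vector<MethodRow>& rows) {
    std::set<unsigned> slots;
    for (const MethodRow& r : rows) {
        // Two rows pointing at the same vtable slot is almost always a typo.
        if (!slots.insert(r.vtableIndex).second)
            return "duplicate vtable index in " + r.name;
        // The declared argument count must match the parameters described.
        if (r.argCount != r.paramsDescribed)
            return "argument count mismatch in " + r.name;
    }
    return "";
}
```

A check like this could run once at startup in debug builds, over a mirror of the real tables.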

We now proceed to create an automation wrapper for our pure object:

MyObject myobj;  // the plain C++ instance being exposed; it must outlive pMyobj
CComPtr<ITypeInfo> pMyobjTypeInfo;
HRESULT hr = CreateDispTypeInfo(
    &MyObject::interfaceData,
    LOCALE_SYSTEM_DEFAULT,
    &pMyobjTypeInfo);
CComPtr<IUnknown> pMyobj;
if (SUCCEEDED(hr))
    hr = CreateStdDispatch(NULL, &myobj, pMyobjTypeInfo, &pMyobj);

At this point, pMyobj is a full-fledged COM object implementing IDispatch and wrapping the MyObject class instance myobj, which had no knowledge of COM originally and now bundles tables describing its methods.

The scripting site’s implementation of IActiveScriptSite::GetItemInfo should now return pMyobj, the object’s IUnknown (through which the engine can query for IDispatch), and pMyobjTypeInfo, its ITypeInfo, when requested to do so by the hosted scripting engine. We register the object we wish to expose with the engine:

hr = pActiveScriptEngine->AddNamedItem(
L"myobject",
SCRIPTITEM_ISSOURCE | SCRIPTITEM_ISVISIBLE | SCRIPTITEM_ISPERSISTENT);

If our GetItemInfo does its job when asked for “myobject”, assuming we host the JScript engine, we can now do things like
myobject.f(42);
var b = myobject.g(0.4);

in script code running in our host.

I find this approach to automation object exposition attractive because it is non-intrusive. If desired, the tables describing the exposed class need not be members of the actual class, but can be stored separately. Notice that you do not even have to generate a CLSID for the exposed class. It is also possible to expose only a certain subset of class methods to the scripting environment.

However, maintaining the type information tables by hand scales poorly as classes grow more complicated. For these cases, rolling an automatic code generation solution may be desired, since MIDL’s functionality in this department cannot be reused. The class and its methods could be described in an XML file, and a tool iterating over its DOM or even an XSLT transformation could generate a C++ header file from the description, complete with the INTERFACEDATA information. This would ensure the method tables and the actual method signatures remain synchronized over the extended lifetime of the class.
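As a rough idea of what such a generator could look like, here is a minimal sketch. To keep it short it takes the description as an in-memory list instead of parsing XML, and it handles only methods with exactly one parameter. MethodDesc and generateTables are hypothetical names of mine; the emitted text follows the layout of the tables shown earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical description of one exposed method with a single parameter.
struct MethodDesc {
    std::string name;       // method name as seen by scripts
    std::string paramName;  // name of the single parameter
    std::string paramType;  // VARTYPE spelling, e.g. "VT_I4"
    std::string returnType; // VARTYPE spelling, e.g. "VT_EMPTY"
};

// Emit C++ source for the PARAMDATA and METHODDATA tables of class `cls`.
// DISPIDs are numbered from 1 and vtable indices from 0, in input order,
// so the input order must match the order of the virtual methods.
std::string generateTables(const std::string& cls,
                           const std::vector<MethodDesc>& methods) {
    std::ostringstream out;
    for (const MethodDesc& m : methods) {
        out << "PARAMDATA " << cls << "::" << m.name
            << "_paramData = { OLESTR(\"" << m.paramName << "\"), "
            << m.paramType << " };\n";
    }
    out << "METHODDATA " << cls << "::methodData[] = {\n";
    for (std::size_t i = 0; i < methods.size(); ++i) {
        const MethodDesc& m = methods[i];
        out << "    { OLESTR(\"" << m.name << "\"), &" << cls << "::" << m.name
            << "_paramData, " << (i + 1) << ", " << i
            << ", CC_STDCALL, 1, DISPATCH_METHOD, " << m.returnType << " }"
            << (i + 1 < methods.size() ? ",\n" : "\n");
    }
    out << "};\n";
    return out.str();
}
```

Feeding it the two methods of MyObject should reproduce the hand-written tables, which makes for an easy first regression test of the generator.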

Finally, for those interested in this solution, this newsgroup post offers some tips and a few words of caution. Let me add to it that CreateDispTypeInfo only seems to work correctly with the __stdcall calling convention, even if you specify CC_CDECL in your type information. Using CC_STDCALL and making sure my classes used __stdcall made everything work. Before that, symptoms included method arguments receiving seemingly random values when called by the scripting engine, due to stack imbalance.

Hey, I said the approach is lightweight, not the post ;-)

7 thoughts on “A lightweight approach for exposing C++ objects to a hosted Active Scripting engine”

  1. I wrote an IActiveScriptSite-derived host in C++, in which I want to expose my DOM objects to the VBScript engine (as you did).
    I add the root object with AddNamedItem. This root object has a child object, so I added a propget to the root object which returns the IDispatch interface of the child object.
    In my VBScript I get an ‘Object required’ error when I try to use the child object.
    Can you help me?
