Bridging the gap between native functions and Active Scripting with a COM-based FFI wrapper

A few weeks ago I was following the excitement as WebKit, Safari’s browser engine, incrementally passed more and more of the Acid3 standards test. Wondering whether the Gecko (Mozilla Firefox’s rendering engine) folks were also busy with that, I followed both the Planet WebKit and Planet Mozilla feeds for a few weeks.

Sometime in April I stumbled upon this post in Planet Mozilla. It discussed recent improvements to JSctypes. It was the first time I had heard of this project. JSctypes is an XPCOM component for Mozilla that allows calling native (or “foreign”) functions from privileged JavaScript code. Both the interface and name are inspired by the Python ctypes module, included with the standard distribution since version 2.5.

If you haven’t heard of ctypes, take a minute to get acquainted. It’s a great library that allows you to call native C functions dynamically from Python code. Its interface really feels at home in a dynamic language. Most of the time, you can just call functions without specifying the number and types of the arguments they receive. DLLs are accessed as attributes of the loader object that matches their calling convention (e.g., ctypes.windll.kernel32 for stdcall or ctypes.cdll.msvcrt for cdecl), and script functions can be passed as callbacks to the native APIs being invoked.

JSctypes brings Python’s ctypes concept to Mozilla’s JavaScript implementation. At the base of Mozilla’s object model is a COM-like architecture called XPCOM. Usually, native functionality is made available to JavaScript by exposing an XPCOM component to script. The clear disadvantage of that approach is that every conceivable piece of native functionality must be wrapped, case by case, in a compiled XPCOM component. With JSctypes, privileged JavaScript code can call most native functions with relative ease and without any compiled component other than JSctypes itself. (Obviously, a native call interface is not appropriate in the context of untrusted web content.)

A native function interface for a dynamic language needs to deal with the relatively complex task of setting up the call stack frame for an arbitrary native API, according to argument counts, types and alignment requirements deduced dynamically at script execution time. As the interface layer seeks to support a broader and broader variety of argument types (basic data types, then structures, arrays, then callback functions, etc.) the task becomes increasingly complicated and difficult.

I reviewed the source code of both JSctypes and Python’s ctypes in their respective repositories and learned that they share a common implementation of the lowest layer of such a native interface. It is called libffi, the Foreign Function Interface library, and seems to originate from the gcc project. Since libffi is designed to be compiled with a UNIX-style toolchain (it has AT&T syntax assembly files, for instance) and Python needs to compile with Visual C++, the author of ctypes, Thomas Heller, ported an old revision of the library to Visual C++.

Usage of libffi is pretty simple. Using the ffi_prep_cif function, you initialize an ffi_cif (“call interface”) structure with the ABI type, argument count, return value type and argument types of the native function to be invoked. Later, and repeatedly as needed, ffi_call invokes the actual function with a specific set of argument values, passed in as an array of pointers, and retrieves the value returned from the native function.

I thought JSctypes was really cool, and it then occurred to me that it should not be prohibitively difficult to implement a similar adaptation layer for Microsoft’s JScript and possibly other Active Scripting languages.

In my mind’s eye, I envisioned an in-process COM server accessible to Active Scripting clients (that is, one that implements IDispatch and is associated with a ProgID), providing a call interface to arbitrary native functions.

I created an ATL COM DLL and gave the coclass the ProgID “FunctionWrapper.FunctionWrapper.1”. I knew you could call JScript functions with fewer or more arguments than their definition expects, and figured that pulling off the same in a native method exposed to the script would be ideal. After a short investigation I learned of the IDL vararg attribute, which accomplishes just what I had in mind. At this point, the exposed interface looks like this:

[
    object,
    uuid(EBA4A11F-969B-4413-9D4E-FB5CB21039FC),
    dual,
    nonextensible,
    helpstring("IFunctionWrapper Interface"),
    pointer_default(unique)
]
interface IFunctionWrapper : IDispatch {
    [id(1), helpstring("method CallFunction"), vararg] HRESULT CallFunction([in] SAFEARRAY(VARIANT) args, [out, retval] VARIANT* retVal);
};

The CallFunction method of the FunctionWrapper object is callable by JScript clients with any number of arguments, of whatever types they choose. As a simplistic start, I had the first argument specify a string identifying the native function, in the WinDbg-inspired syntax of “module!export”, e.g. “user32!MessageBoxW”. The rest of the arguments would be passed to the native function.
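The article does not show how the target string is split, so the following is only a minimal sketch of one way to do it in C++17. The FunctionSpec type and the ParseFunctionSpec function are hypothetical names, and narrow strings are used for brevity even though strings arrive from JScript as UTF-16.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical type: holds the two halves of a "module!export" target string.
struct FunctionSpec {
    std::string moduleName;  // e.g. "user32"
    std::string exportName;  // e.g. "MessageBoxW"
};

// Splits the target at the first '!'. Returns std::nullopt when the separator
// is missing or when either side of it is empty.
std::optional<FunctionSpec> ParseFunctionSpec(std::string_view target) {
    const std::size_t bang = target.find('!');
    if (bang == std::string_view::npos || bang == 0 || bang + 1 == target.size()) {
        return std::nullopt;
    }
    return FunctionSpec{std::string(target.substr(0, bang)),
                        std::string(target.substr(bang + 1))};
}
```

On Windows, the two halves would presumably be handed to LoadLibrary and GetProcAddress respectively; a failed parse is a natural place to return an error to the script before anything is loaded.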

I proceeded to implement CFunctionWrapper::CallFunction. The steps taken by the method would be:

  1. Ensure at least the first argument (function to invoke) was given.
  2. Ensure the first argument specifies a module and an export, load the module and retrieve the address of the export.
  3. Thunk the VARIANT arguments received by the method to libffi-style argument and types arrays.
  4. Invoke ffi_prep_cif to prepare the call, then call the native function with ffi_call.
  5. Thunk the return value of the function into a VARIANT usable by script.

Most of this is concise, but step 3 consists of relatively mundane boilerplate: translating between two varieties of dynamically typed data, Microsoft’s VARIANT and libffi’s ffi_type. I’ll illustrate with a short snippet:

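// Both arrays are allocated once, before the loop, with one slot per native argument.
ffi_type** argumentTypes = ...; // Dynamically allocated by argument count
void** ffiArgs = ...;

// arguments[0] names the function, so native arguments start at index 1.
for (ULONG i = 1; i < arguments.GetCount(); i++)
{
    VARIANT& arg = arguments[i];
    switch (V_VT(&arg)) {
    case VT_UI1:
        argumentTypes[i - 1] = &ffi_type_uint8;
        ffiArgs[i - 1] = &(V_UI1(&arg)); // points into the VARIANT, which must outlive ffi_call
        break;
    ...
    case VT_UI4:
        argumentTypes[i - 1] = &ffi_type_uint32;
        ffiArgs[i - 1] = &(V_UI4(&arg));
        break;
    }
}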

Similar work is needed for other integer and floating-point types, strings and pointers.
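For readers without a Windows toolchain at hand, here is a portable C++17 analogue of that marshalling loop. It is a sketch only: std::variant stands in for VARIANT, the NativeType enum stands in for libffi’s ffi_type descriptors, and every name in it is hypothetical rather than taken from the actual implementation.

```cpp
#include <cstdint>
#include <type_traits>
#include <variant>
#include <vector>

// Stand-in for a script VARIANT; the real loop switches on V_VT(&arg) instead.
using ScriptValue = std::variant<std::uint8_t, std::int32_t, std::uint32_t, double>;

// Stand-in for libffi's type descriptors (ffi_type_uint8, ffi_type_uint32, ...).
enum class NativeType { UInt8, SInt32, UInt32, Double };

// Two parallel arrays, mirroring argumentTypes and ffiArgs in the snippet above.
struct MarshalledArgs {
    std::vector<NativeType> types;
    std::vector<void*> values;  // each entry points at storage inside a ScriptValue
};

// Maps each alternative of ScriptValue to its native type tag at compile time.
template <typename T>
constexpr NativeType NativeTypeOf() {
    if constexpr (std::is_same_v<T, std::uint8_t>) return NativeType::UInt8;
    else if constexpr (std::is_same_v<T, std::int32_t>) return NativeType::SInt32;
    else if constexpr (std::is_same_v<T, std::uint32_t>) return NativeType::UInt32;
    else return NativeType::Double;
}

// Builds the type and value arrays for a list of script arguments.
MarshalledArgs Marshal(std::vector<ScriptValue>& args) {
    MarshalledArgs out;
    for (ScriptValue& arg : args) {
        std::visit([&out](auto& value) {
            using T = std::decay_t<decltype(value)>;
            out.types.push_back(NativeTypeOf<T>());
            out.values.push_back(&value);
        }, arg);
    }
    return out;
}
```

As in the VARIANT version, the value array holds pointers into the caller’s argument storage rather than copies, so the arguments have to stay alive and unmoved until the native call returns.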

Initially, I hard-coded a return value type of unsigned 32-bit integer and the stdcall calling convention to avoid providing an interface for selecting those parameters. I registered the DLL and tested the following script with WSH:

var functionWrapper = new ActiveXObject("FunctionWrapper.FunctionWrapper");
var retVal = functionWrapper.CallFunction("user32!MessageBoxW", 0, "text", "caption", 1);
WScript.Echo(retVal);

The last argument, 1, is the value of the MB_OKCANCEL flag for MessageBox. I used the W variety of the API since I implemented hard-coded UTF-16 marshalling for VT_BSTR variants, which is the form in which strings arrive from JScript.

I was quite content when the test script not only failed to crash the WSH process, but also presented a message box and passed the API’s return value back to JScript.

At this point I considered what it would take to extend this solution beyond the basic value types. Arrays first came to mind. Such support, I imagined, would consist of copying an incoming SAFEARRAY argument into a native array and supplying the native array pointer to the native function. If “out” array argument support is desired, copying back into the SAFEARRAY would be required post-invocation, right after ffi_call.
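The copy-in, call, copy-out sequence might look roughly like the portable sketch below. It makes several assumptions: a std::vector<double> stands in for the incoming SAFEARRAY, the native element type is fixed at 32-bit integers, and DoubleInPlace and CallWithArray are hypothetical names invented for the illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a native API with an in/out array parameter:
// doubles each element in place.
void DoubleInPlace(std::int32_t* items, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        items[i] *= 2;
    }
}

// Hypothetical helper. scriptArray stands in for the incoming SAFEARRAY; its
// elements are doubles, as JScript numbers are.
template <typename NativeFn>
void CallWithArray(std::vector<double>& scriptArray, NativeFn nativeFn, bool copyBack) {
    // Copy in: convert each script value to the native element type.
    std::vector<std::int32_t> nativeBuffer;
    nativeBuffer.reserve(scriptArray.size());
    for (double value : scriptArray) {
        nativeBuffer.push_back(static_cast<std::int32_t>(value));
    }

    // The real wrapper would pass this pointer as one of the ffi_call arguments.
    nativeFn(nativeBuffer.data(), nativeBuffer.size());

    // Copy out, only for arrays the caller marked as "out".
    if (copyBack) {
        for (std::size_t i = 0; i < nativeBuffer.size(); ++i) {
            scriptArray[i] = static_cast<double>(nativeBuffer[i]);
        }
    }
}
```

Making the copy-back optional keeps plain “in” arrays cheap and leaves the script’s array untouched when the native function only reads from it.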

Next in line were structs. These would be less straightforward. The problem with describing a struct’s fields as a JScript “object” (read: hash table) is that the object’s properties are not guaranteed to keep the order of the fields in the struct’s data layout. Using a JScript array instead would provide ordering, although it wouldn’t be very nice looking.
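To show why field order matters, here is a rough sketch of packing an ordered field list into a native struct buffer. Nothing in it comes from the actual implementation: Field, FieldType, Store and PackStruct are hypothetical names, only a few scalar types are handled, and natural alignment is assumed (each scalar aligned to its own size), which matches common ABIs but is not universal.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical field descriptor: one entry per struct member, in declaration order.
enum class FieldType { UInt8, UInt16, UInt32, Double };

struct Field {
    FieldType type;
    double value;  // JScript numbers are doubles
};

std::size_t SizeOf(FieldType type) {
    switch (type) {
        case FieldType::UInt8:  return 1;
        case FieldType::UInt16: return 2;
        case FieldType::UInt32: return 4;
        case FieldType::Double: return 8;
    }
    return 0;  // unreachable for valid enumerators
}

// Converts value to T and copies its bytes into buffer at offset.
template <typename T>
void Store(std::vector<std::uint8_t>& buffer, std::size_t offset, double value) {
    const T native = static_cast<T>(value);
    std::memcpy(buffer.data() + offset, &native, sizeof native);
}

// Packs the fields in list order. Assumes natural alignment: each scalar is
// aligned to its own size and the struct is padded to its largest member.
std::vector<std::uint8_t> PackStruct(const std::vector<Field>& fields) {
    std::vector<std::uint8_t> buffer;
    std::size_t maxAlign = 1;
    for (const Field& field : fields) {
        const std::size_t size = SizeOf(field.type);
        maxAlign = std::max(maxAlign, size);
        const std::size_t offset = (buffer.size() + size - 1) / size * size;
        buffer.resize(offset + size, 0);  // zero-fills any padding bytes
        switch (field.type) {
            case FieldType::UInt8:  Store<std::uint8_t>(buffer, offset, field.value);  break;
            case FieldType::UInt16: Store<std::uint16_t>(buffer, offset, field.value); break;
            case FieldType::UInt32: Store<std::uint32_t>(buffer, offset, field.value); break;
            case FieldType::Double: Store<double>(buffer, offset, field.value);        break;
        }
    }
    // Tail padding, so that arrays of the struct stay aligned.
    buffer.resize((buffer.size() + maxAlign - 1) / maxAlign * maxAlign, 0);
    return buffer;
}
```

Each field’s offset depends on everything packed before it, so reordering the list changes the layout. That is why an unordered property bag is a poor description of a struct, while an ordered array of fields is sufficient.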

The final type of argument I considered, and arguably the most important, is callbacks. Many APIs take function pointers as arguments. Consider EnumWindows, which invokes EnumWindowsProc on every window found. A native call interface should make it possible to implement the callback as a JScript function and pass it as seamlessly as possible during the native invocation.

Fortunately, libffi provides built-in support for callbacks, calling them “closures” in its terminology. An ffi_cif structure is initialized to describe the prototype of the callback function as native code sees it, as if it were going to be called with ffi_call. ffi_prep_closure takes such a prototype description, a function pointer and a closure “trampoline buffer”, as I call it. The trampoline buffer, expected to be allocated in writable, executable memory (native code would later jump to its address), takes care of calling the provided function pointer. The twist is that the provided function, instead of being called with a dynamic prototype, always receives its arguments in the form of libffi argument arrays.

The native callback function wrapped by the closure trampoline buffer would presumably fill a SAFEARRAY of variants with the arguments and invoke a script function. A wrapper callback coclass could be provided to the script and allow for more elaborate stuff like out parameters and the like. An instance of the callback object would wrap a JScript function object and invoke its apply method using the IDispatch interface as calls come in through the closure. It is unclear what a generic solution that doesn’t rely on functions being objects and having the apply method would look like, so at this point this wrapper callback concept is only suitable for JScript.

So far I have only implemented the basic value types, and even that with code of such poor quality that I am not uploading it for the time being. The devil is in the details, and supporting descriptions of complex argument types would require quite a bit of work. Hopefully someday I, or perhaps an enthusiastic reader, will get around to coding and publishing a full-fledged implementation of a native call interface. Embedding such an interface in an Active Scripting host, in scenarios where the hosted scripts enjoy full trust, could provide endless extensibility possibilities for the script author.

Hey, cooler than P/Invoke…

Windbg 6.9.3.113 released

A new version of the Debugging Tools for Windows appeared, quietly as usual, on Microsoft’s web site a few days ago.

Unfortunately the debugging symbols package for Windows XP SP3 is still MIA, presumably being delayed along with widespread SP3 availability on the Download Center and Windows Update. My local symbol store has grown quite obese with all the SP2 patches over the years, so I’m looking forward to cleaning things up once that’s available.

Nothing too exciting in the RELNOTES.TXT for this release. Integrated managed debugging remains dysfunctional so the trusty 6.7.5.0 remains in place for that. Can’t even get SOS to break on an application’s Main method. The most exciting feature is enhancements to the “dt” command.

Yawn.