LibreOffice, JavaScript’ed

LOWA is LibreOffice built with Emscripten as a Wasm executable that runs in the browser. Controlling that LibreOffice through UNO with JavaScript looks like a natural fit. Enter Embind, a mechanism to generate the binding glue between JavaScript and Wasm/C++.

As we will see, the Embind vs. UNO match is not perfect, but it kind-of gets the job done, at least for a first iteration.

Mappings

To dive straight into technical matters, the UNO type system is mapped to JavaScript as follows. (If you would like to see some example code first, jump ahead to the Starting Points and come back here later for reference.)

  • UNO BOOLEAN maps, depending on context and somewhat inconsistently, either to JavaScript Boolean or to the JavaScript Number values 0 and 1. (The C/C++ representation of UNO BOOLEAN is sal_Bool, which is an alias for unsigned char, which Embind maps to JavaScript Number. So in places where we directly rely on Embind, like for the return value of a UNO interface method invocation, we get the Embind mapping to Number. But in places where we have more control, like for the JavaScript get method for a UNO ANY, we can be a bit more fancy and use a mapping to Boolean.)
  • UNO BYTE, SHORT, UNSIGNED SHORT, LONG, UNSIGNED LONG, FLOAT, and DOUBLE all map to JavaScript Number (with restricted value ranges for everything but UNO DOUBLE).
  • UNO HYPER and UNSIGNED HYPER both map to JavaScript BigInt (with restricted value ranges).
  • UNO CHAR and STRING both map to JavaScript String (with single UTF-16 code unit strings for UNO CHAR).
  • UNO TYPE maps to JavaScript Module.uno_Type objects. There are construction functions Module.uno_Type.Void, Module.uno_Type.Boolean, Module.uno_Type.Byte, Module.uno_Type.Short, Module.uno_Type.UnsignedShort, Module.uno_Type.Long, Module.uno_Type.UnsignedLong, Module.uno_Type.Hyper, Module.uno_Type.UnsignedHyper, Module.uno_Type.Float, Module.uno_Type.Double, Module.uno_Type.Char, Module.uno_Type.String, Module.uno_Type.Type, Module.uno_Type.Any, Module.uno_Type.Sequence, Module.uno_Type.Enum, Module.uno_Type.Struct, Module.uno_Type.Exception, and Module.uno_Type.Interface for representations of all the UNO TYPE values. The Module.uno_Type.Sequence construction function recursively takes a UNO TYPE argument for the component type, while the Module.uno_Type.Enum, Module.uno_Type.Struct, Module.uno_Type.Exception, and Module.uno_Type.Interface construction functions each take a string argument denoting the given type’s name in dotted notation (e.g., Module.uno_Type.Interface('com.sun.star.uno.XInterface')). Those JavaScript objects implement toString, which is also used for equality checks (e.g., type == 'com.sun.star.uno.XInterface'; note the loose ==, which converts the object via toString, whereas a strict === between an object and a string is always false).
  • UNO ANY maps to JavaScript Module.uno_Any objects. There is a constructor taking a UNO TYPE argument and a corresponding value (using an undefined value for UNO type VOID). Those JavaScript objects implement a method get that returns the JavaScript representation of the contained UNO value.
  • UNO sequence types map to a pre-canned variety of JavaScript Module.uno_Sequence_... objects. The problem is that Embind does not let us have a generic mapping to the C++ com::sun::star::uno::Sequence<T> class template; we can only have individual Embind mappings to specific class template instantiations. As a hack, for every UNO sequence type that appears somewhere in the LibreOffice UNO API, we generate a specific JavaScript Module.uno_Sequence_.... The naming is Module.uno_Sequence_boolean, Module.uno_Sequence_byte, Module.uno_Sequence_short, Module.uno_Sequence_unsigned_short, Module.uno_Sequence_long, Module.uno_Sequence_unsigned_long, Module.uno_Sequence_hyper, Module.uno_Sequence_unsigned_hyper, Module.uno_Sequence_float, Module.uno_Sequence_double, Module.uno_Sequence_char, Module.uno_Sequence_string, Module.uno_Sequence_type, and Module.uno_Sequence_any for the simple UNO component types; Module.uno_Sequence_... followed by the UNO type name in dollar-separated notation (e.g., Module.uno_Sequence_com$sun$star$uno$XInterface) for enum, struct, and interface component types; and Module.uno_SequenceN_..., with N greater than 1, for sequence component types (e.g., Module.uno_Sequence2_long for the UNO type “sequence of sequence of LONG”). That means that there currently is just no way to come up with e.g. a JavaScript representation of the UNO type “sequence of interface com.sun.star.frame.XDesktop”, as that sequence type happens to not be mentioned anywhere in the LibreOffice UNO API. (But for those sequence types that are used as interface method parameter or return types, corresponding JavaScript representations are provided. That should hopefully cover all relevant use cases for now; a future overhaul of this part of the mapping is likely.)
These JavaScript sequence objects have two constructors, one taking a JavaScript array of member values (e.g., new Module.uno_Sequence_long([1, 2, 3])) and one taking a size and a Module.FromSize marker (as Embind does not allow multiple constructors with the same number of arguments) whose members will have default values (e.g., new Module.uno_Sequence_long(3, Module.FromSize)). Additional methods are resize (taking the new length as argument), size (returning the current length), get (taking an index as argument and returning the member at that index), and set (taking an index and a new member value as arguments). (The naming of those resize, size, get, and set methods is modelled after Embind’s emscripten::register_vector.)
  • UNO enum types are mapped to Embind-provided enums named Module.uno_Type_... followed by the UNO type name in dollar-separated notation (e.g., Module.uno_Type_com$sun$star$uno$TypeClass).
  • Plain UNO struct types and UNO exception types are mapped to Embind-provided value objects named Module.uno_Type_... followed by the UNO type name in dollar-separated notation (e.g., Module.uno_Type_com$sun$star$beans$NamedValue, Module.uno_Type_com$sun$star$uno$Exception). Polymorphic UNO struct types face a similar issue to sequence types, in that Embind does not allow mapping their corresponding C++ class templates directly. It would be possible to do a similar hack and add specific mappings for all instantiated polymorphic struct types that are mentioned anywhere in the LibreOffice UNO API, but that has not been implemented so far. (And, similar to sequence types, a future overhaul of this part of the mapping is likely.)
  • UNO interface types are mapped to Embind-provided classes named Module.uno_Type_... followed by the UNO type name in dollar-separated notation (e.g., Module.uno_Type_com$sun$star$uno$XInterface). Null references are mapped to JavaScript null. The special com.sun.star.uno.XInterface UNO interface methods queryInterface, acquire, and release are not exposed to JavaScript client code.
  • UNOIDL single-interface–based service constructors are mapped to JavaScript functions named Module.uno_Function_...$$... followed by the service’s name in dollar-separated notation, followed by the constructor’s name set off by two dollar signs (e.g., Module.uno_Function_com$sun$star$beans$Introspection$$create). Like with other UNO language bindings, those functions take the com.sun.star.uno.XComponentContext as an additional first argument.
  • UNOIDL service-based singletons are mapped to JavaScript functions named Module.uno_Function_... followed by the singleton’s name in dollar-separated notation (e.g., Module.uno_Function_com$sun$star$frame$theDesktop). Like with other UNO language bindings, those functions take the com.sun.star.uno.XComponentContext as their (sole) argument.
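
The dollar-separated naming rules in the list above can be spelled out as a few string helpers. This is only a sketch: dollarName, unoTypeKey, unoSequenceKey, and unoFunctionKey are hypothetical names that are not part of the binding, and they merely compute the property names that one would then look up on Module.

```javascript
// Hypothetical helpers (not provided by the binding) that spell out the
// naming rules described above.

// 'com.sun.star.uno.XInterface' -> 'com$sun$star$uno$XInterface'
function dollarName(dottedName) {
  return dottedName.split('.').join('$');
}

// Enum, struct, exception, and interface types.
function unoTypeKey(dottedName) {
  return 'uno_Type_' + dollarName(dottedName);
}

// Sequence types. `component` is either a simple component name such as
// 'long' or 'unsigned_short', or a dotted UNOIDL type name; `depth` is the
// nesting level (2 for "sequence of sequence of ...").
function unoSequenceKey(component, depth = 1) {
  const prefix = depth > 1 ? 'uno_Sequence' + depth + '_' : 'uno_Sequence_';
  return prefix + dollarName(component);
}

// Service constructors (pass the constructor name) and singletons (omit it).
function unoFunctionKey(dottedName, constructorName) {
  const base = 'uno_Function_' + dollarName(dottedName);
  return constructorName === undefined ? base : base + '$$' + constructorName;
}

// Inside LOWA one could then write, for example:
//   const Seq2Long = Module[unoSequenceKey('long', 2)];
```

Such helpers are mostly useful for the sequence names, which have to be looked up on Module directly.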

Starting Points

To make all this work, the Embind mapping of the LibreOffice UNO API needs to be set up first. This is done by a call to

const uno = init_unoembind_uno(Module);

which also returns a wrapper object uno that allows for more natural access to all the UNOIDL entities whose mappings use that dollar-separated notation: instead of Module.uno_Type_com$sun$star$uno$XInterface one can write uno.com.sun.star.uno.XInterface, and a call to Module.uno_Function_com$sun$star$beans$Introspection$$create(context) can be written as uno.com.sun.star.beans.Introspection.create(context). If you want to cut down on the common uno.com.sun.star prefix even further,

const css = uno.com.sun.star;

lets you reduce that to just css.uno.XInterface and css.beans.Introspection.create(context).

The starting points to access the LibreOffice UNO API from JavaScript are Module.getUnoComponentContext() (returning the central css.uno.XComponentContext, through which all the services and singletons are reachable) and a Module.getCurrentModelFromViewSh() convenience function (returning the css.frame.XModel of the currently showing document). The gitlab.com/allotropia/lowa-demos repository is a growing site of example code showing all of this in action.

Summing this up, here is some example code that iterates over all the paragraphs of a Writer document and gives each of them a random foreground text color:

const uno = init_unoembind_uno(Module);
const css = uno.com.sun.star;
const model = Module.getCurrentModelFromViewSh();
const textDocument = css.text.XTextDocument.query(model);
const text = textDocument.getText();
const access = css.container.XEnumerationAccess.query(text);
const paragraphs = access.createEnumeration();
while (paragraphs.hasMoreElements()) {
  const paragraph = css.text.XTextRange.query(
    paragraphs.nextElement().get());
  const props = css.beans.XPropertySet.query(paragraph);
  const color = new Module.uno_Any(
    Module.uno_Type.Long(),
    // Uniform over 0x000000..0xFFFFFF inclusive:
    Math.floor(Math.random() * 0x1000000));
  props.setPropertyValue("CharColor", color);
  color.delete();
}

Cleanup

Embind is built on the concept that whatever C++ objects you reference from JavaScript, you need to declare manually and explicitly that those references are no longer needed once you are done, by calling delete() on the corresponding JavaScript objects. (Otherwise, you risk memory leaks.) This can be quite cumbersome and would pollute the code with tons of such delete() calls. Luckily, JavaScript grew a FinalizationRegistry mechanism that allows code to be executed when the JavaScript garbage collector finds an object to be unused and reclaims it. (And that code can thus transparently do the delete() call for us.) Embind implements such FinalizationRegistry support for some types (those that are modelled on some “smart pointer”) but not for others.

That means that (besides all the primitive types) JavaScript mappings of UNO string, type, enums, structs, exceptions, and interfaces do not need explicit delete() calls, while the mappings of UNO any and UNO sequences, and the various Module.uno_InOutParam_... wrappers, all need explicit delete() calls.
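
For the mappings that do need an explicit delete(), a small wrapper can keep the call from being forgotten when an exception is thrown halfway through. The following is a minimal sketch; withDeleted is a hypothetical helper, not something Embind or the UNO binding provides, and it assumes only that the passed object has a delete() method.

```javascript
// Hypothetical helper: run `fn` with an Embind-backed object and always call
// its delete() afterwards, even if `fn` throws.
function withDeleted(handle, fn) {
  try {
    return fn(handle);
  } finally {
    handle.delete();
  }
}

// Inside LOWA this could be used along these lines:
//   withDeleted(
//     new Module.uno_Any(Module.uno_Type.Long(), 0xFF0000),
//     (color) => props.setPropertyValue('CharColor', color));
```

The try ... finally keeps the cleanup next to the allocation, at the cost of one extra closure per use.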

Even though we expect that the JavaScript engines that we target do support the FinalizationRegistry mechanism, Embind is prepared to work with older engines that do not support it. Therefore, whenever an object is transparently cleaned up, Embind logs a somewhat unhelpful warning to the JavaScript console, stating that it “found a leaked C++ instance” (and that it will “free it automatically”).

Interfaces

For each UNO interface type there is a JavaScript class method query taking any JavaScript UNO object reference (in the form of the common com.sun.star.uno.XInterface base interface) as argument (and internally using UNO’s queryInterface to obtain either a correspondingly-typed reference to that object, or a null reference). There is also a JavaScript helper function Module.sameUnoObject, taking two interface references as arguments and returning whether both are references to the same UNO object.

UNO interface methods taking out or in-out parameters need special treatment. There are Module.uno_InOutParam_... wrappers (with a val property carrying the actual value) that need to be set up and passed into the UNO method. Such wrappers have a constructor taking no arguments (creating a dummy object, suitable for pure out parameters) and another constructor taking one argument of the wrapped type (suitable for in-out parameters). For example, to read data from a com.sun.star.io.XInputStream:

const stream = ...;
const input = css.io.XInputStream.query(stream);
if (input) {
  const data = new Module.uno_InOutParam_sequence_byte();
  input.readBytes(data, 100);
  for (let i = 0; i != data.val.size(); ++i) {
    console.log('read byte ' + data.val.get(i));
  }
  data.delete();
}
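
Since the sequence objects only offer size and get rather than the usual JavaScript array interface, it can be convenient to copy their members into a plain array first. A minimal sketch follows; sequenceToArray is a hypothetical helper and relies only on the size() and get(index) methods described in the Mappings section.

```javascript
// Hypothetical helper: copy the members of a Module.uno_Sequence_... object
// (anything offering size() and get(index)) into a plain JavaScript array.
function sequenceToArray(seq) {
  const result = [];
  const length = seq.size();
  for (let i = 0; i < length; ++i) {
    result.push(seq.get(i));
  }
  return result;
}

// Inside LOWA, with `data` as in the example above:
//   const bytes = sequenceToArray(data.val);
//   data.delete();
```

The copy does not replace the delete() call; the wrapper still has to be released explicitly.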

Exception Handling

Support for throwing and catching exceptions between JavaScript and C++ is rather rough: JavaScript code can use try ... catch (e) ... to catch a UNO exception thrown from C++, but all the information it can get about that exception is e.name stating the exception’s type. Also, for technical reasons, the catch block needs some incrementExceptionRefcount and decrementExceptionRefcount boilerplate:

try {
  ...
} catch (e) {
  incrementExceptionRefcount(e);
    //TODO, needed when building with JS-based -fexceptions,
    // see
    // <https://github.com/emscripten-core/emscripten/issues/17115>
    // "[EH] Fix inconsistency of refcounting in Emscripten
    // EH vs. Wasm EH"
  if (e.name === 'com::sun::star::uno::RuntimeException') {
    ...
  }
  decrementExceptionRefcount(e);
}
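
If several catch blocks need the same boilerplate, it could be factored into a helper. The sketch below is not part of the binding: dispatchUnoException is a hypothetical name, the refcount functions are passed in as a hooks object so that the helper does not depend on globals, and it assumes that the increment and decrement calls should always be paired, even when a handler throws.

```javascript
// Hypothetical helper: wrap the refcount boilerplate around a lookup of the
// handler registered for e.name. Returns true if a handler ran.
function dispatchUnoException(e, hooks, handlers) {
  hooks.increment(e);
  try {
    const known = Object.prototype.hasOwnProperty.call(handlers, e.name);
    if (known) {
      handlers[e.name](e);
    }
    return known;
  } finally {
    // Assumption: the decrement must always balance the increment.
    hooks.decrement(e);
  }
}

// Inside LOWA this could be used along these lines:
//   try { ... } catch (e) {
//     dispatchUnoException(
//       e,
//       {increment: incrementExceptionRefcount,
//        decrement: decrementExceptionRefcount},
//       {'com::sun::star::uno::RuntimeException':
//          () => console.log('runtime error')});
//   }
```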

To throw UNO exceptions from JavaScript code, there is a helper function Module.throwUnoException that takes a UNO (exception) type and an instance of that type:

Module.throwUnoException(
  Module.uno_Type.Exception(
    'com.sun.star.lang.IllegalArgumentException'),
  {Message: 'bad argument', Context: null,
   ArgumentPosition: 0});

UNO Objects

The JavaScript-to-UNO binding is a full mapping, so you can even implement new UNO objects in JavaScript. This requires quite some boilerplate, though. For example, the below obj implements com.sun.star.lang.XTypeProvider and com.sun.star.task.XJob:

const obj = {
  // Implementation details:
  implRefcount: 0,
  implTypes: new Module.uno_Sequence_type([
    Module.uno_Type.Interface(
      'com.sun.star.lang.XTypeProvider'),
    Module.uno_Type.Interface(
      'com.sun.star.task.XJob')]),
  implImplementationId: new Module.uno_Sequence_byte([]),

  // The methods of XInterface:
  queryInterface(type) {
    if (type == 'com.sun.star.uno.XInterface') {
      return new Module.uno_Any(
        type,
        css.uno.XInterface.reference(
          this.implXTypeProvider));
    } else if (type == 'com.sun.star.lang.XTypeProvider') {
      return new Module.uno_Any(
        type,
        css.lang.XTypeProvider.reference(
          this.implXTypeProvider));
    } else if (type == 'com.sun.star.task.XJob') {
      return new Module.uno_Any(
        type,
        css.task.XJob.reference(
          this.implXJob));
    } else {
      return new Module.uno_Any(
        Module.uno_Type.Void(), undefined);
    }
  },
  acquire() { ++this.implRefcount; },
  release() {
    if (--this.implRefcount === 0) {
      this.implXTypeProvider.delete();
      this.implXJob.delete();
      this.implTypes.delete();
      this.implImplementationId.delete();
    }
  },

  // The methods of XTypeProvider:
  getTypes() { return this.implTypes; },
  getImplementationId() {
    return this.implImplementationId;
  },

  // The methods of XJob:
  execute(args) {
    if (args.size() !== 1 || args.get(0).Name !== 'name') {
      Module.throwUnoException(
        Module.uno_Type.Exception(
          'com.sun.star.lang.IllegalArgumentException'),
        {Message: 'bad args', Context: null,
         ArgumentPosition: 0});
    }
    console.log(
      'Hello, my name is ' + args.get(0).Value.get());
    return new Module.uno_Any(
      Module.uno_Type.Void(), undefined);
  },
};
obj.implXTypeProvider
  = css.lang.XTypeProvider.implement(obj);
obj.implXJob
  = css.task.XJob.implement(obj);

obj.acquire();
// ... pass obj to UNO here ...
obj.release();
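
The if ... else if chain in queryInterface grows with every implemented interface. One way to trim it is a table-driven variant, sketched here under some assumptions: makeQueryInterface is a hypothetical helper that is not part of the binding, it relies on the toString behaviour of Module.uno_Type objects described in the Mappings section, and the construction of the UNO ANY values is passed in as callbacks so that the helper itself stays independent of Module.

```javascript
// Hypothetical helper: build a queryInterface method from a table that maps
// dotted interface names to functions returning the matching reference.
function makeQueryInterface(table, makeAny, makeVoidAny) {
  return function queryInterface(type) {
    const name = String(type); // uses the type object's toString
    if (!Object.prototype.hasOwnProperty.call(table, name)) {
      return makeVoidAny();
    }
    // Call with the implementing object as `this`, like a regular method.
    return makeAny(type, table[name].call(this));
  };
}

// Inside LOWA, the method in obj could then be written along these lines:
//   queryInterface: makeQueryInterface(
//     {'com.sun.star.uno.XInterface': function () {
//        return css.uno.XInterface.reference(this.implXTypeProvider); },
//      'com.sun.star.lang.XTypeProvider': function () {
//        return css.lang.XTypeProvider.reference(this.implXTypeProvider); },
//      'com.sun.star.task.XJob': function () {
//        return css.task.XJob.reference(this.implXJob); }},
//     (type, ref) => new Module.uno_Any(type, ref),
//     () => new Module.uno_Any(Module.uno_Type.Void(), undefined)),
```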
