Multimodal Interactor Mapping Model Specification

W3C Working Group Submission August 27th 2012

This version
Latest version
Previous version
Sebastian Feuerstack, UFSCar

This document is available under the W3C Document License. See the W3C Intellectual Rights Notice and Legal Disclaimers for additional information.


Multimodal mappings are the glue that combines the AUI, CUI, and interaction resource interactors. They describe an interaction (step) that can involve several modes and media in a multimodal manner. Mappings can also be specified on an abstract level, in which case overall interaction paradigms such as drag-and-drop can be described.

Mappings rely on two features of state charts: they can receive and process events, and they have an observable state. Thus, each mapping can observe state changes and trigger events.

This specification distinguishes between synchronization mappings and multimodal mappings. The former synchronize the interactors' state machines across different levels of abstraction, such as between task and AUI interactors or between AUI and CUI interactors. The latter synchronize modes with media.

Multimodal Mappings

Multimodal mappings can be pre-defined (e.g. to support a certain form of interaction with a particular device, or to implement an interaction paradigm like drag-and-drop), but they are usually specified during application design (e.g. stating that a security-critical command must be confirmed with both a mouse click and a voice command).

Each multimodal mapping consists of three basic concepts: Observations, Actions, and Operators. They are described in the following sections.


Observations

Observations are used to observe state charts (state machines) for state changes and are drawn as boxes with rounded corners. Several observations can be stacked vertically. An observation contains two mandatory fields, separated by a horizontal line: the observation processing method to use (the first field), and the interactor name (with path information separated by dots), terminated by the name of the state to observe.

Thus, an observation that waits for an arbitrary AIO Interactor to enter state presenting would be specified like this:

Two optional attributes can limit an observation to an interactor with a certain name and store the resulting interactor in a variable. The following observation waits for an interactor named bob to enter state presenting and saves the interactor in variable aio:

Observations can be negated by a preceding NOT statement in the observation text.
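The fields of an observation can be illustrated with a small Python sketch. It is not part of the specification or of the reference implementation: the class names Interactor and Observation, the attribute names, and the example path "Interactor.AIO" are assumptions made only for illustration. The sketch splits the dotted text into interactor path and state name, applies the optional name filter and the NOT negation, and stores the matching interactor in a variable.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Interactor:
    kind: str    # dotted interactor path (hypothetical, e.g. "Interactor.AIO")
    name: str    # instance name, e.g. "bob"
    state: str   # current state of the interactor's state chart


@dataclass
class Observation:
    target: str                    # "<dotted interactor path>.<state name>"
    name: Optional[str] = None     # optional: limit to an interactor with this name
    result: Optional[str] = None   # optional: variable that receives the interactor
    negated: bool = False          # True models a preceding NOT

    def matches(self, interactor: Interactor,
                variables: Dict[str, Interactor]) -> bool:
        # The last dotted segment is the state; everything before it is the path.
        path, _, state = self.target.rpartition(".")
        if interactor.kind != path:
            return False
        if self.name is not None and interactor.name != self.name:
            return False
        # A negated observation matches when the interactor is NOT in the state.
        hit = (interactor.state == state) != self.negated
        if hit and self.result is not None:
            variables[self.result] = interactor
        return hit


variables: Dict[str, Interactor] = {}
bob = Interactor("Interactor.AIO", "bob", "presenting")
observation = Observation("Interactor.AIO.presenting", name="bob", result="aio")
matched = observation.matches(bob, variables)
```

After the last line, matched is True and variables["aio"] refers to bob. The sketch only checks the current state; when the check happens depends on the processing method.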

Processing Methods

There are three different ways to process an observation: onchange, instant, and continuous. onchange is the default and waits for an interactor to enter the defined state. onchange does not check the current state of the interactor. Thus, if the interactor is already in the defined state, the observation still waits until the state has been re-entered. instant does not wait for a state change, but instantly checks whether the current state of the interactor matches the observation. An instant observation fails if the current interactor state does not match the observation. A continuous observation combines both: it checks the current state and, if that does not match the observation, continues to wait for a state change that does.
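The difference between the three processing methods can be sketched in Python as follows. This is a simplified model under stated assumptions, not the reference implementation: the function name process is hypothetical, and the unbounded waiting is replaced by a finite list of states that the interactor enters after the observation has started.

```python
from typing import Iterable


def process(method: str, target: str, current: str,
            entered: Iterable[str]) -> bool:
    """Return True if the observation is satisfied.

    target:  state named in the observation
    current: state of the interactor when the observation starts
    entered: states the interactor enters afterwards, in order
    """
    if method == "instant":
        # Only the current state counts; later changes are ignored.
        return current == target
    if method == "onchange":
        # The current state is ignored; the state must be (re-)entered.
        return any(state == target for state in entered)
    if method == "continuous":
        # Check the current state first, otherwise wait for a change.
        return current == target or any(state == target for state in entered)
    raise ValueError(f"unknown processing method: {method}")


already_there = process("onchange", "presenting", "presenting", [])
checked_now = process("instant", "presenting", "presenting", [])
waited = process("continuous", "presenting", "defocused", ["presenting"])
```

Here already_there is False, because onchange waits for the state to be re-entered, while checked_now and waited are both True.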



Actions

Actions are used to trigger state changes by sending events to state charts or to call functions in the backend. Actions are drawn as boxes with sharp corners and are stacked vertically like observations. Unlike observations, which are processed according to the connecting operator, actions are always processed sequentially from top to bottom.

Event Action

An event action sends the specified event to all interactors stored within a variable.

Backend Action

A backend action calls the functional core of an interactive application. It refers to a static function, optionally qualified by a module. Furthermore, an arbitrary number of parameters can be set. The function is required to return true on success or false on failure. A variable can be used as a parameter.
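The two action types and their top-down processing can be sketched in Python as follows. All class and function names (Interactor, EventAction, BackendAction, run_actions) are hypothetical and chosen for illustration only. Two points are assumptions that the text above does not state: a string parameter that names a stored variable is replaced by the variable's value, and the sequence stops at the first action that reports a failure.

```python
from typing import Any, Callable, Dict, List, Sequence


class Interactor:
    """Stand-in for a state chart that accepts events."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.received: List[str] = []

    def process_event(self, event: str) -> None:
        self.received.append(event)


class EventAction:
    def __init__(self, event: str, target: str) -> None:
        self.event = event     # event to send
        self.target = target   # variable that holds the receiving interactors

    def run(self, variables: Dict[str, Any]) -> bool:
        stored = variables[self.target]
        interactors = stored if isinstance(stored, list) else [stored]
        for interactor in interactors:
            interactor.process_event(self.event)
        return True


class BackendAction:
    def __init__(self, function: Callable[..., bool],
                 parameters: Sequence[Any] = ()) -> None:
        self.function = function
        self.parameters = parameters

    def run(self, variables: Dict[str, Any]) -> bool:
        # Assumption: a string that names a variable is resolved to its value.
        args = [variables.get(p, p) if isinstance(p, str) else p
                for p in self.parameters]
        return bool(self.function(*args))


def run_actions(actions: Sequence[Any], variables: Dict[str, Any]) -> bool:
    # Actions are processed sequentially, top-down.
    for action in actions:
        if not action.run(variables):
            return False   # assumption: stop at the first failure
    return True
```

For example, with variables = {"aio": Interactor("bob")}, calling run_actions([EventAction("focus", "aio"), BackendAction(some_function, ["aio"])], variables) first sends the focus event to bob and then passes bob to some_function, a placeholder for any backend function that returns true or false.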



Operators

Operators specify multimodal relations and link a set of observations to a set of actions. An operator is drawn as a circle that contains the operator's initial capital letter.

There are five operators available: Complementary (C), Assignment (A), Redundancy (R), Equivalence (E), Sequence (S).

The first four (the CARE properties) were defined earlier as part of the TYCOON framework [1]. In [2] they were detailed to describe relationships between modalities and applied to tasks, interaction languages, and devices.

Equivalence describes a combination of modalities in which each of them leads to the same desired meaning, but only one modality can be used at a time. For instance, a text input prompt can be answered either by a spoken response or by text typed on a keyboard.

With assignment, one particular modality is defined as the only one that can be used to convey the desired meaning (e.g. a car can only be controlled by a steering wheel). Modalities are used redundantly if they can be used both individually and simultaneously to express the desired meaning. For instance, speech and direct manipulation are used redundantly if the user says "show all flights to São Paulo" while pressing the button "São Paulo" with the mouse.

In a complementary usage of modalities, several modalities are required together to capture the desired meaning. For instance, a "put that there" command requires both speech and pointing gestures to convey the desired meaning.

Each operator can optionally state a Temporal Window (Tw), which is specified on top of the operator circle, as well as an exit condition, which is specified by an arrow under the operator circle and states a set of actions to perform if the multimodal relation could not be fulfilled.
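One possible reading of the five operators, the temporal window, and the exit condition is sketched below in Python. The function names is_fulfilled and select_actions and the representation of matched observations as timestamps are assumptions for illustration. The exact fulfilment rules are also an interpretation of the prose above, in particular that equivalence accepts exactly one matched observation, that redundancy accepts one or more within the window, and that sequence requires matches in the listed order.

```python
from typing import Dict, List, Optional


def is_fulfilled(operator: str, expected: List[str],
                 seen: Dict[str, float], tw: Optional[float] = None) -> bool:
    """seen maps an observation label to the time (seconds) at which it matched."""
    times = [seen[label] for label in expected if label in seen]
    within_tw = tw is None or not times or max(times) - min(times) <= tw
    if operator == "C":   # complementary: all observations, inside the window
        return len(times) == len(expected) and within_tw
    if operator == "A":   # assignment: the single defined observation matched
        return len(expected) == 1 and len(times) == 1
    if operator == "R":   # redundancy: one or more, inside the window
        return len(times) >= 1 and within_tw
    if operator == "E":   # equivalence: exactly one of the alternatives
        return len(times) == 1
    if operator == "S":   # sequence: all observations, in the listed order
        return (len(times) == len(expected)
                and times == sorted(times) and within_tw)
    raise ValueError(f"unknown operator: {operator}")


def select_actions(operator: str, expected: List[str], seen: Dict[str, float],
                   actions: List[str], exit_actions: List[str],
                   tw: Optional[float] = None) -> List[str]:
    # The exit condition names the actions for an unfulfilled relation.
    if is_fulfilled(operator, expected, seen, tw):
        return actions
    return exit_actions


seen = {"speech": 0.0, "pointing": 0.4}
in_window = is_fulfilled("C", ["speech", "pointing"], seen, tw=1.0)
too_late = is_fulfilled("C", ["speech", "pointing"], seen, tw=0.2)
```

With these inputs, in_window is True and too_late is False, since the two observations matched 0.4 seconds apart.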

Synchronization Mappings

Synchronization mappings are used to synchronize interactors of different levels of abstraction, such as abstract and concrete interactors. The following exemplary mapping synchronizes the AIO presenting state to a CIO displaying state:

Mapping                          Since      Updated
aio_present_to_cio_display.xml   20120827   -
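The effect of such a synchronization mapping can be sketched in Python with two minimal state charts. This sketch does not reproduce the contents of aio_present_to_cio_display.xml. The class StateChart, the function synchronize, the start states organized and positioned, and the event names present and display are all hypothetical; only the states presenting and displaying are taken from the text above.

```python
from typing import Callable, Dict, List, Tuple


class StateChart:
    """Tiny stand-in for an interactor state chart with an observable state."""

    def __init__(self, name: str, state: str,
                 transitions: Dict[Tuple[str, str], str]) -> None:
        self.name = name
        self.state = state
        self.transitions = transitions   # {(state, event): next state}
        self.observers: List[Callable[["StateChart"], None]] = []

    def process_event(self, event: str) -> None:
        next_state = self.transitions.get((self.state, event))
        if next_state is None:
            return   # event is not accepted in the current state
        self.state = next_state
        for notify in list(self.observers):
            notify(self)


def synchronize(source: StateChart, source_state: str,
                target: StateChart, event: str) -> None:
    """Observation on the source state, followed by an event action on the target."""
    def on_change(chart: StateChart) -> None:
        if chart.state == source_state:
            target.process_event(event)
    source.observers.append(on_change)


aio = StateChart("aio", "organized", {("organized", "present"): "presenting"})
cio = StateChart("cio", "positioned", {("positioned", "display"): "displaying"})
synchronize(aio, "presenting", cio, "display")

aio.process_event("present")
print(cio.state)  # displaying
```

The mapping reacts to the state change in the same way as an onchange observation: it fires when the AIO enters presenting, not when it is already there.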

Exemplary Mappings

Several exemplary mappings can be found in the Interaction Resource Model Specification.

XML-based save format

Mappings are stored in XML as specified by the Mapping XML Schema Definition.

Graphical Notation

We have published an open source tool that generates a graphical representation of mappings from their XML definition.

Reference Implementation

An open source project that implements tools and a platform for the design and execution of multimodal interfaces for the web can be found here:


Changes since February 3rd, 2012 version


[1] Jean-Claude Martin: TYCOON: Theoretical Framework and Software Tools for Multimodal Interfaces; in Intelligence and Multimodality in Multimedia Interfaces, AAAI Press; 1998.

[2] Laurence Nigay and Joelle Coutaz: Multifeature Systems: The CARE Properties and Their Impact on Software Design; in Intelligence and Multimodality in Multimedia Interfaces; 1997.