The car-media module

The car-media module uses one of two multimedia services to perform media operations: mm-player or mm-control. From the user's perspective, these services behave identically, but they work differently behind the scenes, so each has its own plugin. Each plugin provides its own private data, which is aggregated into the main car-media data structure. Either plugin can be loaded to provide multimedia voice support, but the two can't be loaded at the same time.

Each media plugin of the car-media module provides an initialization function, which is invoked when the module is registered with the ASR system. Similarly, each plugin provides a tear-down function that releases any allocated resources when the module is unloaded. The tear-down function is called by the module's destroy() callback function.
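The plugin life cycle described above can be sketched as follows. This is only an illustration: the type and function names (media_plugin_t, car_media_t, car_media_load(), car_media_destroy()) are hypothetical and are not taken from the car-media source. The sketch assumes that each plugin exposes an initialization and a tear-down function, that its private data hangs off the main module structure, and that a second plugin is rejected while one is loaded.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical plugin interface; the real car-media types may differ. */
typedef struct media_plugin {
    const char *name;               /* "mm-player" or "mm-control" */
    int  (*init)(void **priv);      /* allocates the plugin's private data */
    void (*teardown)(void *priv);   /* releases the plugin's private data */
} media_plugin_t;

/* Hypothetical main structure that aggregates the plugin's private data. */
typedef struct car_media {
    const media_plugin_t *plugin;   /* the one loaded plugin, or NULL */
    void *plugin_data;              /* private data owned by that plugin */
} car_media_t;

/* Load a plugin. Returns -1 if a plugin is already loaded or if init fails. */
int car_media_load(car_media_t *cm, const media_plugin_t *plugin)
{
    if (cm->plugin != NULL) {
        return -1;                  /* the two plugins are mutually exclusive */
    }
    if (plugin->init(&cm->plugin_data) != 0) {
        return -1;
    }
    cm->plugin = plugin;
    return 0;
}

/* Stand-in for the module's destroy() callback: runs the tear-down function. */
void car_media_destroy(car_media_t *cm)
{
    if (cm->plugin != NULL) {
        cm->plugin->teardown(cm->plugin_data);
        cm->plugin = NULL;
        cm->plugin_data = NULL;
    }
}

/* Demo plugins that only allocate and free a small block of private data. */
static int demo_init(void **priv)
{
    *priv = malloc(sizeof(int));
    return (*priv != NULL) ? 0 : -1;
}

static void demo_teardown(void *priv)
{
    free(priv);
}

const media_plugin_t demo_player  = { "mm-player",  demo_init, demo_teardown };
const media_plugin_t demo_control = { "mm-control", demo_init, demo_teardown };
```

Keeping the plugin pointer and its private data together in one structure means the destroy path only has to check a single field to know whether anything needs to be released.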

The mm-player plugin

The mm-player back end of the car-media module relies on a C API to communicate with the media service. This requires the plugin to open a handle to the mm-player service. For more information about the mm-player API, see the Multimedia Player Developer's Guide.

The mm-control plugin

The mm-control plugin of the car-media module communicates with the media service using the control and status PPS objects for mm-control.

Note: The mm-control service is being deprecated in favour of mm-player.

The control object provides an access point for clients to publish commands and messages to the mm-control service. The mm-control service publishes status information about the media, such as the ID of the current track and the playback speed, to the status object.

The car-media module defines a helper function that parses the data from the PPS objects.
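As an illustration of what such a helper might do, the following sketch splits one PPS attribute line of the general form name:encoding:value into its three fields. It is not the car-media helper itself: the function name parse_pps_line(), the pps_attr_t type, and the buffer sizes are assumptions made for this example, and the attribute names used with it (trkid, state) are placeholders rather than names confirmed by the PPS Objects Reference.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical container for one parsed PPS attribute. */
typedef struct pps_attr {
    char name[32];
    char encoding[8];    /* empty for plain strings, e.g. "n" for numbers */
    char value[128];
} pps_attr_t;

/*
 * Parse one "name:encoding:value" line into out.
 * Returns 0 on success, or -1 if the line is not an attribute line
 * (for example, an "@object" header line) or a field doesn't fit.
 */
int parse_pps_line(const char *line, pps_attr_t *out)
{
    const char *first = strchr(line, ':');
    if (first == NULL) {
        return -1;
    }
    const char *second = strchr(first + 1, ':');
    if (second == NULL) {
        return -1;
    }

    size_t name_len = (size_t)(first - line);
    size_t enc_len  = (size_t)(second - first - 1);
    size_t val_len  = strcspn(second + 1, "\r\n");  /* stop at end of line */

    if (name_len == 0 || name_len >= sizeof out->name ||
        enc_len >= sizeof out->encoding ||
        val_len >= sizeof out->value) {
        return -1;
    }

    memcpy(out->name, line, name_len);
    out->name[name_len] = '\0';
    memcpy(out->encoding, first + 1, enc_len);
    out->encoding[enc_len] = '\0';
    memcpy(out->value, second + 1, val_len);
    out->value[val_len] = '\0';
    return 0;
}
```

Only the first two colons act as separators, so a value that itself contains colons (a URL, for instance) is kept intact.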

For more information about the mm-control service, see the Multimedia Controller Configuration Guide. For information specific to the mm-control PPS objects, see the /pps/services/mm-control/control and /pps/services/mm-control/<playername>/status entries of the PPS Objects Reference.


The car-media module defines a number of actions that can be initiated via voice control. These are grouped into high-level workflows that aren't directly tied to specific command words or utterances, making the module easily adaptable to multilingual environments. Each action ID is associated with a rule string. To enable the car-media module for a particular language environment, a locale-specific grammar simply associates a grammar entry with the rule string corresponding to the correct action ID.
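The association between action IDs and rule strings can be pictured as a small lookup table. The sketch below is a minimal illustration under stated assumptions: the action names, the rule strings, and the action_from_rule() function are all hypothetical and do not reproduce the module's actual action list.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical action IDs; the module's real list is not reproduced here. */
typedef enum media_action {
    ACTION_NONE = 0,
    ACTION_PLAY,
    ACTION_PAUSE,
    ACTION_SEARCH
} media_action_t;

/* One table entry ties an action ID to its language-independent rule string. */
typedef struct action_rule {
    media_action_t id;
    const char *rule;
} action_rule_t;

/* Hypothetical rule strings; a locale-specific grammar would refer to these. */
static const action_rule_t action_rules[] = {
    { ACTION_PLAY,   "media-play"   },
    { ACTION_PAUSE,  "media-pause"  },
    { ACTION_SEARCH, "media-search" },
};

/* Return the action ID for a rule string, or ACTION_NONE if it is unknown. */
media_action_t action_from_rule(const char *rule)
{
    size_t n = sizeof action_rules / sizeof action_rules[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(action_rules[i].rule, rule) == 0) {
            return action_rules[i].id;
        }
    }
    return ACTION_NONE;
}
```

Because the module only ever sees the rule string, a grammar for another language can map entirely different utterances to the same entry without any change to the table.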

The actions supported by the car-media module are:

Conversation flow

The action IDs described previously are used internally by the module to specify the current state of the conversation flow. Two states must be accounted for:

ASR state transitions

Like the search module, the car-media module keeps track of the current ASR state. For details, see ASR state transitions in the section "The search module".

The step() function handles the following transitions:

Result handling

Like the search module, the car-media module uses the on_result() callback function to process recognition results. For details, see Result handling in the section "The search module".


Grammars

Like the search module, the car-media module provides two grammars that define its behavior (one for extracting intents and one for building a context for the third-party recognizer). For details, see Grammars in the section "The search module".

The car-media grammar can be separated into two high-level groups: control and search. The control group involves all conversations that control the playback. The search group involves all conversations that set up tracksessions based on specified search terms and categories.

In addition, a default grammar for the media metadata list is provided. This grammar defines the media-name slot as a null feature (that is, it is never matched). The context created by this grammar simply serves as a placeholder, allowing the car-media module to work until a proper media data context is generated. The placeholder context is replaced when new media is added to the system. This substitution is done at runtime as described in the following section.

Updating the guest context

The car-media module defines several helper functions that are used to construct an internal list of media data. The media metadata is obtained from internal databases that are created when media sources are connected to the device.

The media metadata is constructed in three passes:

  1. song titles
  2. artist names
  3. album names

These three passes are repeated once for each synchronized media source. Synchronized media sources are listed by querying the mm-detect PPS status object.
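The pass structure described above can be sketched as a pair of nested loops: for each synchronized source, one pass for song titles, one for artist names, and one for album names. The types and names below (track_t, media_source_t, media_terms_t, build_media_terms()) are hypothetical, the fixed-size list is a simplification, and skipping duplicate entries is an assumption of this example rather than documented module behavior. In the real module the data comes from the media databases, not from in-memory arrays.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stand-ins for rows read from a media database. */
typedef struct track {
    const char *title;
    const char *artist;
    const char *album;
} track_t;

typedef struct media_source {
    const track_t *tracks;
    size_t count;
} media_source_t;

#define MAX_TERMS 64

/* Hypothetical flat list of metadata terms collected from all sources. */
typedef struct media_terms {
    const char *terms[MAX_TERMS];
    size_t count;
} media_terms_t;

/* Append a term unless it is NULL, already listed, or the list is full. */
static void add_term(media_terms_t *list, const char *term)
{
    if (term == NULL || list->count >= MAX_TERMS) {
        return;
    }
    for (size_t i = 0; i < list->count; i++) {
        if (strcmp(list->terms[i], term) == 0) {
            return;  /* assumption: repeated names are listed once */
        }
    }
    list->terms[list->count++] = term;
}

/* Rebuild the list: three passes (titles, artists, albums) per source. */
void build_media_terms(media_terms_t *list,
                       const media_source_t *sources, size_t nsources)
{
    list->count = 0;
    for (size_t s = 0; s < nsources; s++) {
        const media_source_t *src = &sources[s];
        for (size_t i = 0; i < src->count; i++) {
            add_term(list, src->tracks[i].title);    /* pass 1: song titles */
        }
        for (size_t i = 0; i < src->count; i++) {
            add_term(list, src->tracks[i].artist);   /* pass 2: artist names */
        }
        for (size_t i = 0; i < src->count; i++) {
            add_term(list, src->tracks[i].album);    /* pass 3: album names */
        }
    }
}
```

Rebuilding the whole list from scratch on each call keeps the sketch simple and matches the idea that the list is regenerated whenever the set of synchronized sources changes.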

The media metadata list must be updated whenever a new media source is synchronized. To facilitate this, the car-media module spawns a monitor thread that listens to the mm-detect service for notifications that new media sources have been connected or synchronized.