Automatic Speech Recognition (ASR)

The platform includes an ASR subsystem that provides speech-recognition and text-to-speech services to other system components and third-party applications.

To start an ASR session, tap the Push-to-talk tab on the taskbar, then wait for the audible cue before you say a command. Shorter commands have lower success rates.

For a listing of commands you can use, see "Supported voice commands".

For more information about using ASR for different tasks, see the task-specific pages (e.g., "Media Player"). For more information about the ASR modules, see the following sections:

ASR grammars

You can define keys (synonyms) for the supported speech commands by modifying the ASR grammars. The grammars reside in the car-control.cfg files listed in the localized-assets section of /etc/asr-car.cfg, one for each module. For example, the grammar for the car-media module is located at $(locale-dir)/car-media/car-control.cfg.
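
To see exactly which grammar file a module uses, you can inspect the configuration directly. The commands below are a minimal sketch that assumes only the paths named above and standard shell utilities; $(locale-dir) is the placeholder used above, so substitute the actual locale directory on your target.

    # Find the car-control.cfg entries referenced for each module
    cat /etc/asr-car.cfg

    # View the grammar for the car-media module; substitute the real
    # locale directory for the $(locale-dir) placeholder
    cat $(locale-dir)/car-media/car-control.cfg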

Recognition latency

Several factors affect the latency of voice-command recognition:

ASR server congestion

Server usage might be higher than usual. Run sloginfo -w to determine whether the latency originates on the recognition server.
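
As a quick check, you can watch the live log and keep only the ASR-related entries. This is a sketch only; the "asr" filter string is an assumption, and the actual log entries depend on the recognizer in use.

    # Watch the system log live and filter for ASR-related entries
    # ("asr" is an assumed filter string; adjust it to match your log)
    sloginfo -w | grep -i asr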

Text-to-speech (TTS) latency

A slow TTS response can affect the perceived responsiveness of the system. A delay in the spoken message that announces the action being taken, or that reports an unrecognized command, might appear to the user to be a voice-recognition problem. In this case, the output of sloginfo -w should also give you a good sense of the TTS latency: the service logs the message to be spoken before sending the request to the ASR service.
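
The gap between the logged message and the moment you actually hear it gives a rough measure of the TTS latency. A minimal sketch, assuming "tts" and "prompt" as filter strings (the real log entries may use different wording):

    # Note the timestamp of the logged prompt, then compare it with the
    # time the prompt is heard ("tts" and "prompt" are assumed filters)
    sloginfo -w | grep -i -E "tts|prompt"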

Determining unrecognized commands

If you say a command that the system doesn't recognize, there are a number of ways you can get more information about how the command was interpreted. You can:
  • see the interpreted command on the screen
  • examine the system log. To search the log for a particular command, run the following:

    sloginfo -w | grep utterance

  • examine the /pps/services/asr/control object to see what ASR understood and what intents it extracted from the command (a monitoring example follows this list). For example, the command "Switch to media player" results in the following update to the PPS object:
    @control
    result:json:{"confidence":925,
                 "recognizer":"io-asr-nlal",
                 "status":"result_ok",
                 "type":"intent",
                 "action":"launch",
                 "utterance":"Switch to media player",
                 "intents":[{"field":"application","value":"media player"}]}
    speech::handled
    state::idle
    strobe::on
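
To watch these updates as they arrive, you can subscribe to the control object with the standard PPS open options. This is a sketch that assumes the ?wait,delta options are supported on your image and that the object path is as shown above.

    # Block until the ASR control object changes, then print only the
    # changed attributes (?wait,delta are PPS subscriber open options)
    cat "/pps/services/asr/control?wait,delta"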