The recognizer interface.
typedef struct asr_recognizer_if {
    const char *name;
    const char *version;
    int (*init)(cfg_item_t *config_base);
    int (*cleanup)();
    int (*start)();
    int (*stop)();
    void (*step)(asr_step_t step);
    asr_context_hdl_t *(*context_create)(cfg_item_t *cfg);
    int (*context_save)(asr_context_hdl_t *hdl, cfg_item_t *cfg);
    int (*context_add_entries)(asr_context_hdl_t *hdl, cfg_item_t *cfg, const char *slot_identifier, asr_slot_entry_t *slot_entry, int num_slot_entries);
    int (*context_delete_entries)(asr_context_hdl_t *hdl, cfg_item_t *cfg, const char *slot_identifier, asr_slot_entry_t *slot_entry, int num_slot_entries);
    int (*context_destroy)(asr_context_hdl_t *hdl);
    int (*get_utterance)(asr_audio_info_t *audio_info);
    int (*set_utterance)(asr_audio_info_t *audio_info, uint32_t offset_ms);
} asr_recognizer_if_t;
The version number is used to prevent newer, incompatible modules from being used with an older build of ASR.
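How the version check is performed is not specified here. As a rough illustration only, the sketch below assumes a "major.minor" version string and rejects any module whose major version is newer than the host build supports. The DEMO_HOST_MAJOR constant and the demo_version_ok() helper are hypothetical and are not part of the ASR API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical: highest interface major version this build understands. */
#define DEMO_HOST_MAJOR 2

/* Returns 1 if a module declaring `version` may be loaded, 0 otherwise.
 * Assumes the string starts with a decimal major version, e.g. "1.0". */
static int demo_version_ok(const char *version)
{
    char *end;
    long major;

    if (version == NULL)
        return 0;

    major = strtol(version, &end, 10);
    if (end == version)          /* no leading digits: malformed string */
        return 0;

    /* Older or equal major versions are accepted, newer ones rejected. */
    return major >= 0 && major <= DEMO_HOST_MAJOR;
}
```

With these assumptions, a module that reports "1.0" or "2.5" would be accepted and one that reports "3.1" would be rejected.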
The io-asr service calls init() for each registered module on startup. The init() function sets the recognizer properties. The properties that are required vary by vendor.
The io-asr service calls cleanup() after shutting down a module to release any memory, destroy mutexes or condvars, or handle any data that must be changed as a result of the module shutting down. The exact requirements of the cleanup vary by vendor.
The io-asr service calls start() to start a recognition request. The recognizer should collect and process the audio sample, and then provide status and results via the API defined in the ASR vendor interface, asrv.h. This call must be asynchronous, and the recognition operation it starts must be interruptible by a call to the stop() callback.
The io-asr service calls stop() to stop the current recognition operation. The recognizer stops audio acquisition and stops processing results.
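One possible way to meet the asynchronous and interruptible requirements is to run recognition on a worker thread that polls a stop flag. The following is a minimal sketch using POSIX threads and C11 atomics. The demo_ names and the return convention (0 on success, -1 on failure) are assumptions for illustration, and the audio processing itself is left as a placeholder.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

static pthread_t  demo_worker;
static atomic_int demo_stop_requested;
static atomic_int demo_running;

/* Worker thread: loops until stop is requested. */
static void *demo_recognize(void *arg)
{
    (void)arg;
    while (!atomic_load(&demo_stop_requested)) {
        /* A real module would acquire and process one chunk of audio
         * here, then report status and results. */
        sched_yield();
    }
    return NULL;
}

/* Returns immediately; recognition continues on the worker thread. */
static int demo_start(void)
{
    if (atomic_load(&demo_running))
        return -1;                       /* a request is already active */

    atomic_store(&demo_stop_requested, 0);
    if (pthread_create(&demo_worker, NULL, demo_recognize, NULL) != 0)
        return -1;

    atomic_store(&demo_running, 1);
    return 0;
}

/* Interrupts the current request and waits for the worker to exit. */
static int demo_stop(void)
{
    if (!atomic_load(&demo_running))
        return -1;                       /* nothing to stop */

    atomic_store(&demo_stop_requested, 1);
    pthread_join(demo_worker, NULL);
    atomic_store(&demo_running, 0);
    return 0;
}
```

In this sketch, demo_stop() blocks until the worker has exited, so no audio is acquired or processed once it returns. A real module might prefer to signal the worker and return without joining.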
The io-asr service calls step() when the module's current step changes. The step() function takes the appropriate action depending on what the step is.
The io-asr service calls context_create() during the recognition process to create a recognition context.
After io-asr has created a context by invoking context_create(), it calls context_save() to save the context in the recognizer's required format, which varies by vendor.
The io-asr service calls context_add_entries() to add entries to the specified context.
The io-asr service calls context_delete_entries() to remove entries from the specified context.
The io-asr service calls context_destroy() to destroy a context.
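The context callbacks are typically exercised in sequence: create, add or delete entries, then destroy. The sketch below illustrates that order with a toy in-memory context. The demo_ types are stand-ins for cfg_item_t, asr_slot_entry_t, and asr_context_hdl_t, whose real definitions are not shown in this section. The text field, the fixed capacity, and the return convention are all assumptions, and context_save() is omitted because its format varies by vendor.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_ENTRIES 8   /* hypothetical capacity */

/* Stand-ins for the real ASR types (assumptions, not the actual layout). */
typedef struct { int unused; } demo_cfg_t;
typedef struct { const char *text; } demo_slot_entry_t;
typedef struct {
    const char *entries[DEMO_MAX_ENTRIES];
    int count;
} demo_context_t;

static demo_context_t *demo_context_create(demo_cfg_t *cfg)
{
    (void)cfg;
    return calloc(1, sizeof(demo_context_t));   /* empty context */
}

static int demo_context_add_entries(demo_context_t *hdl, demo_cfg_t *cfg,
        const char *slot_identifier, demo_slot_entry_t *slot_entry,
        int num_slot_entries)
{
    (void)cfg; (void)slot_identifier;
    if (hdl == NULL || num_slot_entries < 0 ||
        hdl->count + num_slot_entries > DEMO_MAX_ENTRIES)
        return -1;
    for (int i = 0; i < num_slot_entries; i++)
        hdl->entries[hdl->count++] = slot_entry[i].text;
    return 0;
}

static int demo_context_delete_entries(demo_context_t *hdl, demo_cfg_t *cfg,
        const char *slot_identifier, demo_slot_entry_t *slot_entry,
        int num_slot_entries)
{
    (void)cfg; (void)slot_identifier;
    if (hdl == NULL || num_slot_entries < 0)
        return -1;
    for (int i = 0; i < num_slot_entries; i++) {
        for (int j = 0; j < hdl->count; j++) {
            if (strcmp(hdl->entries[j], slot_entry[i].text) == 0) {
                /* Fill the hole with the last entry; order is not kept. */
                hdl->entries[j] = hdl->entries[--hdl->count];
                break;
            }
        }
    }
    return 0;
}

static int demo_context_destroy(demo_context_t *hdl)
{
    free(hdl);
    return 0;
}

/* Runs the whole lifecycle; returns the entry count before destroy. */
static int demo_lifecycle(void)
{
    demo_cfg_t cfg = {0};
    demo_slot_entry_t add[] = { {"Alice"}, {"Bob"}, {"Carol"} };
    demo_slot_entry_t del[] = { {"Bob"} };
    int remaining = -1;

    demo_context_t *hdl = demo_context_create(&cfg);
    if (hdl == NULL)
        return -1;
    if (demo_context_add_entries(hdl, &cfg, "contacts", add, 3) == 0 &&
        demo_context_delete_entries(hdl, &cfg, "contacts", del, 1) == 0)
        remaining = hdl->count;
    demo_context_destroy(hdl);
    return remaining;
}
```

Here the "contacts" slot identifier is only a placeholder. A real recognizer would use it to select which slot of the context the entries belong to.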
The get_utterance() function stores an audio sample in the buffer referenced by the audio_info parameter. It also sets the associated properties of the utterance: buffer size, sample size, sample rate, and number of channels. The get_utterance() function waits until the audio capture has completed before copying the sample and returning.
The set_utterance() function copies the last captured audio sample to the buffer referenced by the audio_info parameter, at the offset specified by the offset_ms parameter. The sample size, sample rate, and number of channels must match the properties of the captured sample. An error is returned if the requested offset would overrun the buffer or if the audio capture has not completed.
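To make the offset and overrun rule concrete, the sketch below converts a millisecond offset to a byte offset and checks whether a sample of a given length still fits in the buffer. The demo_audio_info_t structure is a hypothetical stand-in, because the fields of asr_audio_info_t are not listed in this section. It assumes the sample size is expressed in bytes per sample per channel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for asr_audio_info_t. */
typedef struct {
    size_t   buffer_size;   /* total buffer size, in bytes */
    uint32_t sample_size;   /* bytes per sample, per channel */
    uint32_t sample_rate;   /* samples per second, per channel */
    uint32_t channels;      /* number of interleaved channels */
} demo_audio_info_t;

/* Converts an offset in milliseconds to an offset in bytes. */
static uint64_t demo_offset_bytes(const demo_audio_info_t *info,
                                  uint32_t offset_ms)
{
    uint64_t frames = (uint64_t)info->sample_rate * offset_ms / 1000u;
    return frames * info->channels * info->sample_size;
}

/* Returns 0 if `sample_bytes` fit at `offset_ms`, or -1 if the copy
 * would run past the end of the buffer. */
static int demo_check_offset(const demo_audio_info_t *info,
                             uint32_t offset_ms, size_t sample_bytes)
{
    uint64_t start = demo_offset_bytes(info, offset_ms);

    if (start > info->buffer_size ||
        sample_bytes > info->buffer_size - start)
        return -1;
    return 0;
}
```

For example, with 16 kHz, mono, 2-byte samples, an offset of 500 ms corresponds to 8000 frames, or 16000 bytes. In a 32000-byte buffer, a 16000-byte sample fits at that offset, while a 16001-byte sample would overrun.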
The recognizer interface provides functions to io-asr for managing speech-to-text processing. Each recognizer module's constructor function passes this structure to asr_connect().
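As an illustration of the registration pattern, the sketch below fills in a reduced stand-in for the structure and hands it to a stub from a constructor function. The demo_recognizer_if_t type lists only a few members, and demo_connect() is a hypothetical placeholder for asr_connect(), whose actual signature is not shown here. The constructor attribute is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-in for asr_recognizer_if_t: only a few members shown,
 * and the configuration pointer is simplified to void *. */
typedef struct demo_recognizer_if {
    const char *name;
    const char *version;
    int (*init)(void *config_base);
    int (*cleanup)(void);
    int (*start)(void);
    int (*stop)(void);
} demo_recognizer_if_t;

/* What the stub below remembers as the registered recognizer. */
static const demo_recognizer_if_t *demo_registered;

/* Hypothetical placeholder for asr_connect(); the real call may differ. */
static int demo_connect(const demo_recognizer_if_t *rif)
{
    if (rif == NULL || rif->name == NULL || rif->init == NULL)
        return -1;
    demo_registered = rif;
    return 0;
}

/* Trivial callbacks; a real module would do vendor-specific work. */
static int my_init(void *config_base) { (void)config_base; return 0; }
static int my_cleanup(void) { return 0; }
static int my_start(void)   { return 0; }
static int my_stop(void)    { return 0; }

static const demo_recognizer_if_t my_recognizer = {
    .name    = "demo-recognizer",
    .version = "1.0",
    .init    = my_init,
    .cleanup = my_cleanup,
    .start   = my_start,
    .stop    = my_stop,
};

/* Runs automatically when the module is loaded, before main(). */
__attribute__((constructor))
static void my_module_ctor(void)
{
    demo_connect(&my_recognizer);
}
```

Keeping the structure static and const means the pointer handed over at load time stays valid for the lifetime of the module.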