Audio terminology

Updated: April 19, 2023

SWAM is based on several interrelated concepts, which we explain here.

Audio channel
A connection to io-audio over which an audio stream flows. Each channel is assigned an audio type when it's created. This type determines how other channels are affected when the channel goes active (its audio starts flowing) or inactive (its audio stops flowing).
Audio policy
The mapping of audio types to audio attributes. These attributes are specified in a policy configuration file that's read by io-audio in the hypervisor host and by the equivalent audio services in guests. SWAM enables enforcement of the host's audio policy across OSs, which results in ducking, suspending, and pausing of channels as needed.
Audio stream
A flow of audio data from an application to io-audio, which sends the data to hardware (e.g., speakers).
Audio type
An attribute of an audio channel that maps the channel to a set of other attributes, which determine the channel's priority, transient status, preemptability, and ducking effect on other channels.
Audio zone
A playback region that contains specific devices, such as the front speakers or rear speakers in a car.
Ducking
Reducing the volume of audio channels when another high-priority audio channel becomes active. When an audio channel is fully reduced in volume (i.e., ducked to 0%, or muted), the data flow may be suspended, depending on the policy configuration. For more details about audio ducking, see "Understanding audio ducking" in the Audio Developer's Guide.
Non-transient audio
Long-duration audio that's driven by user interaction; typically, this refers to media playback.
Pausing
Suspending audio flow indefinitely, so it won't be automatically resumed when any high-priority audio finishes playing. Paused audio must be explicitly restarted in the controlling application.
Suspending and resuming
Preventing audio data from flowing while its audio channel is ducked to 0%, and then allowing the data to flow again (resuming it) when the high-priority audio finishes playing.
Transient audio
Short-duration audio that's driven by external events rather than user interaction, such as navigation announcements or warning chimes.
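
To show how these terms relate, the following Python sketch models a handful of channels under a toy policy. It is an illustration only: it does not reflect the io-audio policy file format or any SWAM API. All names (AudioType, Channel, apply_policy, ducks_others_to) are hypothetical, and so are two rules chosen for the example: a larger priority number wins, and a channel ducked to 0% is suspended when every channel silencing it is transient but paused when any of them is non-transient.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioType:
    """Hypothetical policy entry; not the real policy file format."""
    name: str
    priority: int          # assumption: larger number = higher priority
    transient: bool        # short, event-driven audio
    ducks_others_to: int   # volume (%) imposed on lower-priority channels


@dataclass
class Channel:
    name: str
    audio_type: AudioType
    state: str = "inactive"  # inactive | playing | ducked | suspended | paused
    volume: int = 100


def apply_policy(channels):
    """Recompute state and volume for every channel after a change."""
    # Only channels whose audio is flowing can duck other channels.
    flowing = [c for c in channels if c.state in ("playing", "ducked")]
    for ch in channels:
        if ch.state in ("inactive", "paused"):
            continue  # paused audio waits for an explicit restart
        above = [o for o in flowing
                 if o is not ch
                 and o.audio_type.priority > ch.audio_type.priority]
        if not above:
            ch.state, ch.volume = "playing", 100  # also resumes suspended audio
            continue
        ch.volume = min(o.audio_type.ducks_others_to for o in above)
        if ch.volume == 100:
            ch.state = "playing"
        elif ch.volume > 0:
            ch.state = "ducked"
        else:
            silencers = [o for o in above if o.audio_type.ducks_others_to == 0]
            # Assumed rule: a non-transient silencer pauses, otherwise suspend.
            if any(not o.audio_type.transient for o in silencers):
                ch.state = "paused"
            else:
                ch.state = "suspended"


def activate(channels, ch):
    """The channel goes active (also used to restart paused audio)."""
    ch.state = "playing"
    apply_policy(channels)


def deactivate(channels, ch):
    """The channel goes inactive."""
    ch.state = "inactive"
    apply_policy(channels)


# Example audio types (values are made up for the illustration).
MEDIA = AudioType("multimedia", 1, transient=False, ducks_others_to=100)
NAV = AudioType("navigation", 2, transient=True, ducks_others_to=30)
CHIME = AudioType("chime", 3, transient=True, ducks_others_to=0)
PHONE = AudioType("phone", 4, transient=False, ducks_others_to=0)

music = Channel("music", MEDIA)
prompt = Channel("prompt", NAV)
channels = [music, prompt]

activate(channels, music)
activate(channels, prompt)
print(music.state, music.volume)   # ducked 30
deactivate(channels, prompt)
print(music.state, music.volume)   # playing 100
```

In this model, activating a CHIME channel while music plays would leave the music suspended until the chime goes inactive, whereas activating a PHONE channel would leave it paused until the application calls activate on it again.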