PCM software mixer

Updated: April 19, 2023

The PCM software mixer allows multiple application streams to play to a single hardware device concurrently by mixing together the audio samples of all of the application streams. When you enable PCM software mixer reference streams in the audio configuration file (using sw_mixer_max_references), it also creates a capture device that provides the mixed playback audio stream as a reference audio stream/signal.

A PCM software mixer instance is automatically created for each hardware PCM playback device, even if the hardware supports multiple hardware subchannels. Requests from client applications to the hardware device are routed to the PCM software mixer automatically. You can disable the PCM software mixer using the -o disable_sw_mixer option.

Note: When the PCM software mixer is not disabled, the CPU is used more than if you use the hardware device without the mixer, even when there is only one stream.

The PCM software mixer does not use a separate PCM device. Instead, it directly overlays the hardware PCM device created by the deva-ctrl-* DLL. This overlay means you can't access the hardware directly.

When you enable PCM software mixer media references, a new PCM capture device is created with an index that is one greater than the index of the software mixer device. When you capture from this device, you get the mixed audio stream from the application mixer stage. You cannot capture from the PCM software mixer reference if the PCM software mixer isn't actively mixing audio streams.

Client application fragment size

Client applications that attach to a PCM software mixer device can use a fragment size (in bytes) that is any whole multiple of the PCM software mixer's underlying fragment size. For example, if the PCM software mixer's fragment size is 4 KB, then the application's fragment size can be 4 KB, 8 KB, 12 KB, and so on.
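The multiple-of-fragment rule above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of io-audio: the helper names are hypothetical, and rounding a request up to the next multiple is only one possible policy that a client might choose (the text above doesn't say how io-audio treats a size that isn't a multiple).

```python
def is_valid_client_fragment(client_bytes: int, mixer_fragment_bytes: int) -> bool:
    """Return True if client_bytes is a positive whole multiple of the mixer fragment size."""
    if mixer_fragment_bytes <= 0:
        raise ValueError("mixer fragment size must be positive")
    return client_bytes > 0 and client_bytes % mixer_fragment_bytes == 0


def round_up_to_mixer_fragment(requested_bytes: int, mixer_fragment_bytes: int) -> int:
    """Round a requested size up to the nearest whole multiple (hypothetical client-side policy)."""
    if mixer_fragment_bytes <= 0:
        raise ValueError("mixer fragment size must be positive")
    # Ceiling division, with a floor of one mixer fragment.
    multiples = max(1, -(-requested_bytes // mixer_fragment_bytes))
    return multiples * mixer_fragment_bytes


MIXER_FRAGMENT = 4 * 1024  # the 4 KB example from the text

print(is_valid_client_fragment(8192, MIXER_FRAGMENT))    # True
print(is_valid_client_fragment(6000, MIXER_FRAGMENT))    # False
print(round_up_to_mixer_fragment(6000, MIXER_FRAGMENT))  # 8192
```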

You can use one of the following options to size the PCM software mixer fragments:

sw_mixer_ms
When you use this option, io-audio calculates a fragment size based on the requested period and the audio controller configuration. You can only specify this option in the audio configuration file. Use the following format:
sw_mixer_ms=<milliseconds per fragment>
<milliseconds per fragment> must be an integer.

sw_mixer_samples
Use this option to specify a fragment size in samples per voice instead of a value in milliseconds. Use the following format:
sw_mixer_samples=<fragment size in samples per voice>
Use the following formula to calculate <fragment size in samples per voice>:

Audio controller sample rate (in Hz) * Desired time period (in seconds) / Audio controller number of channels

For example, if the audio controller sample rate is 48 kHz, the time period is 16 ms, and the number of audio controller channels is two, <fragment size in samples per voice> is 384:

( 48000Hz * 16ms / 1000 ) / 2 = 384
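The calculation above can be reproduced in a short script. This is a minimal sketch that follows the formula exactly as given here; the function name is hypothetical, and truncating with integer division when the values don't divide evenly is an assumption rather than documented io-audio behavior.

```python
def fragment_samples_per_voice(sample_rate_hz: int, period_ms: int, channels: int) -> int:
    """Apply the documented formula: (rate * period_ms / 1000) / channels."""
    if sample_rate_hz <= 0 or period_ms <= 0 or channels <= 0:
        raise ValueError("all arguments must be positive")
    # Number of samples in the requested period (the period is given in milliseconds).
    samples_in_period = sample_rate_hz * period_ms // 1000
    # Divide by the controller's channel count, as the formula above does.
    return samples_in_period // channels


# The worked example from the text: 48 kHz, 16 ms, two channels.
value = fragment_samples_per_voice(48000, 16, 2)
print(f"sw_mixer_samples={value}")  # sw_mixer_samples=384
```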

You cannot use sw_mixer_samples if you are using the PCM input splitter.

Note: The fragment size is bounded by the max_dma_buf_size configuration, which defaults to 256 KB.

Pausing the software mixer audio stream

When your subchannel goes through the software mixer ("sw_mixer") and you pause the audio stream (by calling snd_pcm_*_pause()), the snd_pcm_*_pause() call blocks until the software mixer completes ramping down the volume to zero. The subchannel transitions to the PAUSED state only after the volume is zero. If you don't use a software mixer, no ramping occurs.

Avoiding distortion with a predictive limiter

The PCM software mixer operates on 16-bit or 32-bit PCM data. When multiple audio streams are mixed, the accumulated audio data can overflow and may need to be clamped to the minimum and maximum values of the 16-bit or 32-bit range. This clamping causes distortion, which becomes more noticeable the further the signal overflows into the out-of-bounds region. The degree and frequency of the overflow increase with the number of audio streams that are concurrently mixed.
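To make the overflow concrete, the following minimal sketch sums 16-bit samples from several streams and clamps the result to the 16-bit range. It illustrates plain saturating mixing only; the function name and sample values are made up for the example, and it isn't the mixer's actual implementation.

```python
INT16_MIN = -32768
INT16_MAX = 32767


def mix_with_clamp(streams):
    """Sum equal-length lists of 16-bit samples and clamp each sum to the 16-bit range."""
    mixed = []
    for samples_at_instant in zip(*streams):
        total = sum(samples_at_instant)  # can exceed the 16-bit range
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed


stream_a = [20000, -20000, 1000]
stream_b = [20000, -20000, 2000]

# The first two sums (40000 and -40000) overflow and are flattened to the limits,
# which is the source of the distortion; the third sum fits and passes through.
print(mix_with_clamp([stream_a, stream_b]))  # [32767, -32768, 3000]
```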

To avoid the distortion, you can enable a predictive limiter (sw_mixer_limiter) on the accumulated mixer output. The limiter uses a one-millisecond delay to look ahead and observe the impending amplitude of the output audio signal. Using that information, it calculates an attenuation that reduces the signal below the PCM limit. If the output signal is below the limit, the attenuation is gradually reduced back to zero. This provides a much smoother signal response than direct clamping, and allows many audio streams to be mixed concurrently without any noticeable audible distortion.
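The general idea of a look-ahead limiter can be sketched as follows. This is an illustrative simplification and not the algorithm that sw_mixer_limiter uses: the function name, the 48-sample window (one millisecond at an assumed 48 kHz rate), and the linear release step are all assumptions. The sketch processes a whole buffer offline; in a real-time path, seeing 48 samples ahead means delaying the output by the same amount.

```python
def lookahead_limit(samples, limit=32767, lookahead=48, release_step=0.001):
    """Attenuate the signal ahead of peaks that would exceed limit, then release gradually."""
    out = []
    gain = 1.0
    for i, sample in enumerate(samples):
        # Largest magnitude among the current sample and the next `lookahead` samples.
        window = samples[i:i + lookahead + 1]
        peak = max(abs(x) for x in window)
        # Gain needed so that the upcoming peak stays within the limit.
        needed = 1.0 if peak <= limit else limit / peak
        # Drop to the needed gain as soon as a peak is seen; otherwise recover slowly.
        gain = min(needed, gain + release_step)
        out.append(int(sample * gain))
    return out


# A quiet signal with a short burst that would overflow the 16-bit range.
signal = [1000] * 100 + [40000] * 10 + [1000] * 100
limited = lookahead_limit(signal)
print(max(abs(x) for x in limited) <= 32767)  # True
```

Because the gain changes smoothly after the burst instead of flattening individual samples, the waveform shape is preserved, at the cost of the look-ahead delay.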

A limiter is recommended on target systems where multiple audio streams are played concurrently and where audio streams use the full 16-bit or 32-bit PCM range. If there's normally only one audio stream playing, or if the audio streams that are being mixed have low amplitude, then the limiter isn't required. Note that the playback output incurs a one-millisecond delay if the limiter is enabled. For more information, see the sw_mixer_limiter configuration options.