Composition

Updated: April 19, 2023

Composition is the process of combining multiple content sources into a single image.

It's what happens when multiple visible windows must be shown on one display. Screen performs the composition, ensuring that everything that's supposed to be visible is visible and everything that's supposed to be covered is covered. As you can probably guess, when you consider the number of properties for the display alone, plus those of each visible window, there's a nearly unlimited number of ways to combine the images.

Because the final image must be constructed correctly, Screen can't simply favor one composition option over the others. For example, a gain in performance isn't justified if the resulting image is inaccurate.

Screen uses a composition strategy that's optimal for the current scene, taking memory bandwidth and available hardware blocks into consideration. When the device driver supports multiple hardware layers (pipelines) and buffers, Screen takes advantage of these capabilities by assigning content to each pipeline and combining the pipelines at display time. For scenes that require complex graphical operations, hardware-accelerated options such as 3D hardware with OpenGL ES and/or 2D bit-blitting hardware are also used. And there are times when even using the CPU for composition provides some advantages.

Hardware layers (pipelines)
Advantages
  • window buffers don't need to be copied to a composite framebuffer
  • no CPU or GPU processing power is required to compose buffers
  • efficient in handling windows with high-frequency updates
  • scaling quality is usually good
  • updates can be independent of other windows
  • power efficient (e.g., the hardware block is already powered to update the display, so blending the pipelines can be performed by the display's specialized hardware as part of the update)
Disadvantages
  • limited by pipeline capabilities, such as scaling, ability to rotate, supported pixel formats, and image manipulation (brightness, contrast, etc.)
  • limited by the number of supported pipelines, which can vary from platform to platform
  • can't display more than one buffer per pipeline
  • some displays refresh constantly, even if there's no change in content
2D bit-blitting hardware
Advantages
  • not limited by a fixed number of pipelines
  • all sources are read only when needed (i.e., no active refreshes)
  • reads can be limited to the regions of the buffer that have changed (dirty rectangles)
  • typically, there are no restrictions on scaling
  • generally lower powered than using 3D hardware
  • blits can be performed in parallel with 3D hardware, offloading work from it
Disadvantages
  • window buffers, or part of them, need to be copied to a framebuffer
  • may require CPU and/or GPU processing power to combine buffers
  • possible restrictions on supported pixel formats and rotation
  • scaling quality may not be as good as with hardware layers or 3D hardware
3D hardware
Advantages
  • no restrictions
  • reads of sources are performed only when needed (i.e., no active refreshes)
  • sometimes 3D hardware with shaders might be the only way to get certain effects
  • behavior is consistent across hardware because OpenGL ES is a standard
Disadvantages
  • window buffers, or part of them, need to be copied to a framebuffer
  • draws more power than other options
  • contention over 3D hardware resources
CPU
Advantages
  • no restrictions
  • reads of sources are performed only when needed (i.e., no active refreshes)
  • there are no setup costs (e.g., waiting for hardware to become available)
  • faster if you have many small operations
Disadvantages
  • window buffers, or part of them, need to be copied to a framebuffer
  • usually slower than other composition options because of limited fill rate; on some CPU architectures, the CPU can't even write to graphics memory
  • concerns with the cache (flushing, invalidating, or both) arise when combining CPU and hardware composition options

When Screen is tasked with showing your windows' content, it determines how many windows are visible and which portions of each window are visible, and then, based on the windows' properties, whether scaling, rotation, or image adjustments need to be applied. Based on all these factors, Screen decides which hardware blocks, or the CPU, to use for composition.

The Screen API allows an application to limit the composition options by setting the following window properties:

SCREEN_PROPERTY_PIPELINE
Specifies the hardware layer.
SCREEN_PROPERTY_USAGE
Set the SCREEN_USAGE_OVERLAY bit to request the use of one of the hardware layers.

You can set the above properties by calling screen_set_window_property_iv().
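
For example, a minimal sketch (assuming win is a valid screen_window_t created earlier; the pipeline ID of 1 is a hypothetical, platform-specific value) might look like this:

    int usage = SCREEN_USAGE_WRITE | SCREEN_USAGE_OVERLAY;
    int pipeline = 1;  /* hypothetical; valid pipeline IDs vary by platform */

    /* Request that the window be placed on a hardware layer; usage
       should be set before the window's buffers are created. */
    screen_set_window_property_iv(win, SCREEN_PROPERTY_USAGE, &usage);

    /* Ask for a specific hardware pipeline. */
    screen_set_window_property_iv(win, SCREEN_PROPERTY_PIPELINE, &pipeline);

Note that these properties only limit Screen's options; the requested pipeline must exist and support the window's configuration for the request to take effect.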

2D bit-blitting and 3D hardware require the appropriate hardware drivers to be started before Screen can use them. If, at the time Screen needs them for composition, the drivers haven't already been started by an OpenGL ES call or by a call to screen_blit(), Screen starts them itself. Therefore, the first time Screen uses 2D bit-blitting for composition may take longer because Screen must load and start the drivers before the actual task of compositing.
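
If that first-use delay matters to your application, one possible workaround (a sketch only, assuming a valid screen_context_t ctx and a scratch screen_buffer_t buf; the 1x1 rectangle is illustrative) is to issue a trivial fill during startup so that the 2D driver is loaded before the first real composition:

    int attribs[] = {
        SCREEN_BLIT_DESTINATION_X,      0,
        SCREEN_BLIT_DESTINATION_Y,      0,
        SCREEN_BLIT_DESTINATION_WIDTH,  1,
        SCREEN_BLIT_DESTINATION_HEIGHT, 1,
        SCREEN_BLIT_COLOR,              0,
        SCREEN_BLIT_END
    };

    /* A 1x1 fill forces the 2D driver to load now rather than during
       the first composition that needs it. */
    screen_fill(ctx, buf, attribs);
    screen_flush_blits(ctx, SCREEN_WAIT_IDLE);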

If Screen chooses to use 2D bit-blitting, 3D hardware, or the CPU, then Screen uses a framebuffer to store the composition results, because these options need to write to graphics memory. If a framebuffer isn't already available, Screen creates one at the point of use. Screen can reuse the same framebuffer without any sort of chaining because, typically, a framebuffer isn't used for any purpose other than showing content.

The composition strategy (whether one option or a combination of options is used) that Screen takes to produce the final image may vary, even on the same platform. In one case, only hardware layers might be used; at another time, on the same system, a completely different combination of hardware blocks and CPU might be used. Screen bases its decision on the windows that are visible at the time, so different applications doing different things at different times can cause Screen to perform composition in different ways. Regardless of the combination of composition options that Screen chooses, the goal is an accurate final image that optimizes performance load and memory.



Figure 1. An example of a combination of composition options with three windows, one composite framebuffer, and two supported pipelines

Timing

Screen tries to minimize the number of display updates required. If multiple visible windows update at different moments, but within the same vertical synchronization (vsync) interval, the updates are batched together and shown on the next vsync. This strategy is more efficient than updating the display every time a window changes.

For example, if three visible windows all have content to show at approximately the same time (i.e., within one vsync interval), then Screen takes the updates of all three windows and composites them to produce only one display update. If only one of the windows has a content change, while the other two have content changes outside of that vsync interval, then Screen may update the first window and batch the other two into a second update for the next vsync interval.
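
For context, a window's updates reach Screen through posts. A minimal sketch (assuming a valid window win and rendered buffer buf; the rectangle values are illustrative) posts one dirty rectangle; when several windows post within the same vsync interval, Screen can batch them into a single display update:

    /* Post the buffer, marking only the changed region as dirty.
       The rectangle is {x, y, width, height}. */
    int dirty_rect[4] = { 0, 0, 64, 64 };
    screen_post_window(win, buf, 1, dirty_rect, 0);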

Screen still needs to account for invisible windows when they update. Even though changes in invisible windows don't result in a display update, those windows still use and hold resources, and could eventually block resources needed by visible windows if their updates were left unconstrained. Therefore, even invisible window updates are constrained by the vsync intervals.

Composition time
When composition is active, you can check the average composition duration over the last 1, 2, and 5 seconds. See the At runtime subsection of the Debugging section for more information.