At runtime

Updated: April 19, 2023

Issues that arise when clients are running are the most difficult to pinpoint.

Some of these issues can be:

There are tools that allow you to investigate all of the above without having to instrument the client. You can also check slog2info for possible errors.

Requests

You can figure out the requests that are sent from the clients to the server by dumping the contents of the /dev/screen/requests file. This file may not be there if it's been disabled in graphics.conf, or if it has been explicitly disabled in the configuration file. There might be a lot of requests in that file. The requests logsize is usually set to 64 KB. You can change this number to be greater if you can't get to the file before requests of interest are flushed out. See Configure globals subsection of Configuring Screen for more details.

The /dev/screen/requests file is in binary format. It actually points to a ring buffer where messages are saved as they are received. To make sense of this binary format, you can use the screeninfo tool. See screeninfo under Screen Utilities for more details.

Here's the command you'd use:

# screeninfo /dev/screen/requests
            

and here's what you could see:

context=15;subtype=WIN_GET_EVENT;timeout=-1 [EOK]
context=14;subtype=WIN_GET_EVENT;timeout=-1 [EOK]
context=5;subtype=WIN_GET_EVENT;timeout=-1 [EOK]
context=13;subtype=WIN_TRANSACT;parts={
  subtype=WIN_SET_PROPERTY;win-14;SCREEN_PROPERTY_VISIBLE=1 [EOK]
  subtype=WIN_SET_PROPERTY;win-14;SCREEN_PROPERTY_ZORDER=1000 [EOK]
  subtype=WIN_SET_PROPERTY;win-17;SCREEN_PROPERTY_VISIBLE=0 [EOK]
  subtype=WIN_SET_PROPERTY;win-17;SCREEN_PROPERTY_ZORDER=0 [EOK]
  subtype=WIN_SET_PROPERTY;win-20;SCREEN_PROPERTY_VISIBLE=0 [EOK]
  subtype=WIN_SET_PROPERTY;win-20;SCREEN_PROPERTY_ZORDER=0 [EOK]
  subtype=WIN_FLUSH;blits=0;wait_idle=0 [EOK]
} [EOK]
context=13;subtype=WIN_GET_EVENT;timeout=-1 [EOK]
context=13;subtype=WIN_TRANSACT;parts={
  subtype=WIN_GET_PROPERTY;dpy-1;pname=SCREEN_PROPERTY_FOCUS [EOK]
} [EOK]
context=6;subtype=WIN_GET_EVENT;timeout=-1 [EOK]
context=10;subtype=WIN_TRANSACT;parts={
  subtype=WIN_GET_PROPERTY;ssn-51;pname=SCREEN_PROPERTY_MODE [EOK]
} [EOK]
            

All the requests that come from all the different contexts currently connected to Screen are shown in /dev/screen/requests. The context that makes a request is identified before each request (e.g., context=15;subtype=WIN_GET_EVENT;). You'll see that some requests go out on their own while others are items in an atomic transaction. This is because some Screen API functions are immediate, delayed, or flushing. See "Function execution types". You'll also see that for each request, you can get an error status (e.g., [EOK]). If a client process is complaining that it's getting errors from its Screen API calls, you can certainly refer to this error status to figure out the offending function call.

If you spot an error in one of the requests, you'll know which request caused the error. However, the contents of /dev/screen/requests will not necessarily tell you why there was an error. In many cases you can figure it out just by looking at the request. In other cases, you'll have to dig some more.

If you're looking for a request that's not there, most likely the application hasn't made the call, or it's deferred and waiting to be flushed. You'd be surprised how often this is the case. When debugging an error in your application, it's a good idea to call screen_flush_context() after any API function which is of the type: delayed execution. Calling screen_flush_context() helps you to determine the exact function call that caused the error.

Context

Client errors are not logged in the system log; they're returned to the clients. You can check the last error that was reported to a context by looking at the corresponding context file under /dev/screen. This is also a binary file, so you'll need to use the screeninfo utility to make sense of it:

# screeninfo /dev/screen/pid/ctx-context-id
            

where pid is the application's process ID and context-id is the ID of the context where the error occurred.

Here's what you could see:

ctx-7/6799484(winmgr)
=================================================
id = screen-ctx-7-00000025-b18c4795d6b74be7bd5cec2fa6b999e6
id string = 
permissions = rwxrwxrwx--x--x--x--x---
display = dpy-1
status = CREATED
process group id = 6799484
user id = 0
group id = 0
inject allowed = 1
flags = SCREEN_WINDOW_MANAGER_CONTEXT SCREEN_INPUT_PROVIDER_CONTEXT SCREEN_DISPLAY_MANAGER_CONTEXT
queue size = 100
last activity = 1429793855189
touch focus =
error log =
In transaction item 1 out of 1:
  screen_get_window_property(win-16, pname=44) [48]
    The request is currently not supported, or not supported for that 
    object type, or for that object in its current state.
            

From here, we can see that the request screen_get_window_property_iv(win-16, SCREEN_PROPERTY_STRIDE, ...) caused a ENOTSUP error. This error happened because the application was trying get a property that didn't belong to a window; windows don't have strides, buffers do.

From a context file, you can also see the list of events that have already been sent to the context. For example:

# screeninfo /dev/screen/1495071/ctx-2
ctx-2/1495071(sw-vsync)
=================================================
id = screen-ctx-2-00000010-e43907dd7f89e872b5c43c8849526d70
id string =
permissions = rwx--x--x--x--x--x--x---
acl = (empty)
display = dpy-2
status = CREATED
process group id = 1495071
user id = 0
group id = 0
inject allowed = 1
flags = SCREEN_APPLICATION_CONTEXT
queue size = 20
last activity = 1450383443524
touch focus =
SCREEN_EVENT_PROPERTY;timestamp=1450383175349;win-0;pname=SCREEN_PROPERTY_FOCUS
SCREEN_EVENT_PROPERTY;timestamp=1450383175349;win-0;pname=SCREEN_PROPERTY_STATUS
SCREEN_EVENT_DEVICE;timestamp=1450383585914;dev-0;attached
SCREEN_EVENT_DEVICE;timestamp=1450383637260;dev-0;detached
SCREEN_EVENT_DEVICE;timestamp=1450383638457;dev-0;attached
SCREEN_EVENT_DEVICE;timestamp=1450383638700;dev-0;detached
            

Windows

If it looks like it's the contents of a window that are wrong, there are ways to inspect the content of each buffer of every window. This service is provided as a BMP file under the folder that exists for each pixmap or stream that was created by each context.

Find the window that contains the error, and then find the stream that the window is using. Once you find the stream, have a look at its front buffer. First you have to figure out which is the front buffer for that window. To do this, you need to see the contents of the text file associated with the window.

The win-window_id file in /dev/screen/pid/win-window_id shows the values of the window parameters that Screen's currently using. When you have an application running, Screen creates a directory for each application window, and each of these directories contains files associated with the window.

For example, for process 1495071 with only one application window, you can use this command from your shell:

# cat /dev/screen/1495071/win-0
            

Here's part of the file you may see:

win-0/ctx-2/1495071(sw-vsync)
=================================================
id = screen-win-0-00000011-dd6ba1aeacf03bc1755f9950f5aaad1c
id string = sw-vsync
permissions = rwx--x--x--x--x--x--x---
acl = (empty)
display = dpy-2
status = FULLY VISIBLE
type = SCREEN_APPLICATION_WINDOW
autonomous = 0
references = 1
window manager string =
pipeline id = 1
parent = (none)
children = (none)
window above = (none)
window below = (none)
alternate window = (none)
root window = (none)
shadow = (none)
reclip = 0
updates = 0
locked = 0
valid = 0x00000018
class =
flags = WIN_FLAG_VISIBLE WIN_FLAG_FLOATING
buffer size = 1280x720
format = SCREEN_FORMAT_RGBX8888
color space = SCREEN_COLOR_SPACE_UNCORRECTED
usage = SCREEN_USAGE_NATIVE
order = 0
swap interval = 1
default = ssn-0
sessions = *ssn-0
streams = str-0
holes = (none)
...
            

From the above window example, you can see that the window has a stream, str-0. You can look under the str-0 directory to see associated information on the stream. For example:

# ls /dev/screen/1495071/str-0
str-0            str-0-1.bmp
# cat /dev/screen/1495071/str-0/str-0
str-0/ctx-2/1495071(sw-vsync)
=================================================
id = screen-str-0-00000013-4ce1a81eab75c607b80e317b90e26509
id string =
permissions = rwx--x--x--x--x--x--x---
acl = (empty)
display = dpy-2
status = INVISIBLE
autonomous = 0
mode = (SCREEN_STREAM_MODE_FIFO)
buffer count = 1
flags = (none)
buffer size = 1280x720
format = SCREEN_FORMAT_RGBX8888
color space = SCREEN_COLOR_SPACE_UNCORRECTED
usage = SCREEN_USAGE_NATIVE
swap interval = 1
protection requested = (none)
protection enabled = (none)
owner = win-0/ctx-2/1495071(sw-vsync)
refs = (self)
front = 0
index=0 flags=0x0001 dirty=(880,0;881,720)
back = [ 0 ]
rcvids = (none)
buffers =
str-0-1 gd=0 nbytes=4194304 usage=0x0008 flags=0x8240 paddr=0x00000000 vaddr=0x40400000 cvaddr=0x00000000 dvaddr=0x00000001 wfd=0x00000000 cm=0x08893350 egl=0x00000000 native=0x080a95b0
metrics.cpu = 8589934590
metrics.gpu = 0
metrics.objcnt = 0
metrics.apicnt = 0
metrics.drawcnt = 0
metrics.tricnt = 0
metrics.vtxcnt = 0
metrics.teximg = 0
metrics.subdata = 0
metrics.blits.count = 23588
metrics.blits.pixels = 19 Mpixels
metrics.blits.reads = 0 bytes
metrics.blits.writes = 76 Mbytes
metrics.posts.count = 11794
metrics.posts.pixels = 0 pixels
           

The above example tells us the stream's front buffer is index 0. The image to look for is one called str-0-1.bmp. It's important to keep in mind that the system is still running while you do all these operations. You can check the front buffer and then moments later it will be different. That's why it's a good idea to copy all of them. It's also possible for an application to render into the buffer while you're reading or copying it, which can cause tearing artifacts. Therefore, looking at the bitmaps isn't the best way to investigate tearing on the display.

If there are no issues with the buffer contents, or the problem isn't the contents but the position of the window on the screen, or it's the visibility, you can check the display file that we looked at before in the "At initialization" section. Pay special attention to the scene description to try to figure out what might be wrong.

dpy-2/ctx-0/0(screen)
=================================================
...
modifiers = (none)
scene =
  win-2   ctx-3   ssn-1   pid=1622052(sw-vsync)             order=0          FULLY VISIBLE sw-vsync
  win-0   ctx-2   ssn-0   pid=1609762(gles2-gears)          order=0          FULLY VISIBLE gles2-gears

strategy = WIN_RENDER_CM
bypass = 0
dirty = (none)
windows = win-0 win-2
sessions =
 *ssn-1   win-2   ctx-3   pid=1622052(sw-vsync)              type=NONE     order=0      mode=default    SCREEN_SENSITIVITY_TEST
 *ssn-0   win-0   ctx-2   pid=1609762(gles2-gears)           type=NONE     order=0      mode=default    SCREEN_SENSITIVITY_TEST
background = (none)
holes = (none)
...

            

The scene tells you all the windows that are marked as visible on the display. In the above example, there are 2 visible windows on the display (two applications displaying one window each). Windows that aren't realized or haven't posted a front buffer yet aren't in the scene. The list is sorted in descending order. The windows with higher z-order are listed first. The windows at the bottom of the list are lower in z-order. If your window isn't there, check that it:

Note that covered windows are part of the scene. The windows list enumerates all the windows that are composited to a framebuffer (windows = win-0 win-2). If a window isn't there, it's because it's going straight to a pipeline, or it's invisible because it's covered by another window, or the window is clipped, or the window's transparency is SCREEN_TRANSPARENCY_DISCARD. Don't forget that windows are drawn from bottom to top, so this is the order in which the windows are presented in the windows list. You can check the pipelines to see which windows might be showing on a pipeline.

Now, let's say we position the window of one of the applications (sw-vsync) to be outside the bounds of the display. Here's an example what the scene could look like:

dpy-2/ctx-0/0(screen)
=================================================
...
modifiers = (none)
scene =
    win-2   ctx-3   ssn-1   pid=1622052(sw-vsync)             order=0          CLIPPED       sw-vsync
    win-0   ctx-2   ssn-0   pid=1609762(gles2-gears)          order=0          FULLY VISIBLE gles2-gears
strategy = WIN_RENDER_CM
bypass = 0
dirty = (none)
windows = win-0
sessions =
 *ssn-1   win-2   ctx-3   pid=1622052(sw-vsync)              type=NONE     order=0      mode=default    SCREEN_SENSITIVITY_TEST
 *ssn-0   win-0   ctx-2   pid=1609762(gles2-gears)           type=NONE     order=0      mode=default    SCREEN_SENSITIVITY_TEST
background = (none)
holes = (none)
...
            

In this second example of a scene, you can see that there are still 2 windows in the scene. However, one window is clipped (i.e., not visible). The clipped window is still a valid render target, but it isn't visible on the display. Therefore, the clipped window isn't composited to the framebuffer, and that's why it isn't in the windows list (windows = win-0).

When we look at a scene, most of this state is controlled by the properties of the various windows. If something's not right, the next thing to do is go look at the relevant windows and see what their properties are. The files associated with the windows provide values for the different properties and they also provides the internal state of the windows.

Framebuffers

If everything from the windows perspective looks good, the problem may first appear in the framebuffer (i.e., the problem may be compositing into the framebuffer). You can look at the buffers of the framebuffer. Because the framebuffer is a window just like any other window, you can use all the tools and files we've discussed in the earlier sections to check the framebuffer. If the contents of the framebuffer don't look quite right, you can dig into all the operations that went into generating the contents of the framebuffer. The /dev/screen/0/blt-[id] file contains all the operations performed on the framebuffer for a display. This is a binary ring buffer, so you must use screeninfo to decipher it:

# screeninfo /dev/screen/0/blt-1
            

and here's what you could see:

4252> blit[source,nonpre,33](win-5, (0, 0, 50, 10), (0, 50, 50, 10)) [ok]
4252> blit[source,nonpre,33](win-6, (0, 0, 50, 10), (718, 50, 50, 10)) [ok]
4253> fill[255](0, 0, 768, 60, #ff000000) [ok]
4253> blit[source,premult,255](win-2, (0, 0, 768, 60), (0, 0, 768, 60)) [ok]
4253> blit[source,nonpre,33](win-5, (0, 0, 50, 10), (0, 50, 50, 10)) [ok]
4253> blit[source,nonpre,33](win-6, (0, 0, 50, 10), (718, 50, 50, 10)) [ok]
4254> fill[255](0, 0, 768, 60, #ff000000) [ok]
4254> blit[source,premult,255](win-2, (0, 0, 768, 60), (0, 0, 768, 60)) [ok]
4254> blit[source,nonpre,33](win-5, (0, 0, 50, 10), (0, 50, 50, 10)) [ok]
4254> blit[source,nonpre,33](win-6, (0, 0, 50, 10), (718, 50, 50, 10)) [ok]
4255> fill[255](0, 0, 768, 60, #ff000000) [ok]
4255> blit[source,premult,255](win-2, (0, 0, 768, 60), (0, 0, 768, 60)) [ok]
4255> blit[source,nonpre,33](win-5, (0, 0, 50, 10), (0, 50, 50, 10)) [ok]
4255> blit[source,nonpre,33](win-6, (0, 0, 50, 10), (718, 50, 50, 10)) [ok]
4256> fill[255](0, 0, 768, 60, #ff000000) [ok]
4256> blit[source,premult,255](win-2, (0, 0, 768, 60), (0, 0, 768, 60)) [ok]
4256> blit[source,nonpre,33](win-5, (0, 0, 50, 10), (0, 50, 50, 10)) [ok]
4256> blit[source,nonpre,33](win-6, (0, 0, 50, 10), (718, 50, 50, 10)) [ok]
4257> fill[255](0, 0, 768, 60, #ff000000) [ok]
4257> blit[source,premult,255](win-2, (0, 0, 768, 60), (0, 0, 768, 60)) [ok]
4257> blit[source,nonpre,33](win-5, (0, 0, 50, 10), (0, 50, 50, 10)) [ok]
4257> blit[source,nonpre,33](win-6, (0, 0, 50, 10), (718, 50, 50, 10)) [ok]
4258> fill[255](0, 0, 768, 60, #ff000000) [ok]
4258> blit[source,premult,255](win-2, (0, 0, 768, 60), (0, 0, 768, 60)) [ok]
4258> blit[source,nonpre,33](win-5, (0, 0, 50, 10), (0, 50, 50, 10)) [ok]
            

Each line starts with a frame number, so you can see which series of operations belong to the same update. Just like for the requests, there is an error status. In some rare cases, you'll see an error, which explains why the display isn't showing the right thing. In most cases, errors in this file will also have corresponding messages in the system log, which may provide additional information.

Similar to the requests file, this file can be disabled, so you must ensure that graphics.conf doesn't set its size to zero if you need it. Normally, a size of 4 KB to 8 KB is sufficient to capture several hundreds of operations. If you need to look further back in time, you can make it bigger. See Configure globals subsection of Configuring Screen for more details.

Composition time

You can view the composition time for each display. When composition is active, its duration is averaged over the last 1, 2, and 5 seconds. It can be queried in the /dev/screen/0/dpy-* files. For example, for the display dpy-2, your output may look like this:

# cat /dev/screen/0/dpy-2 | grep ms
vendor = QNX Software Systems
composition duration = 1.9 ms (average over last 1 s)
                       1.9 ms (average over last 3 s)
                       1.8 ms (average over last 5 s)                
            

In some cases, composition is bypassed. If composition is not active, your output will indicate that it is not available. See the Composition section for more details.

OpenGL function calls

You can use the gltracelogger utility to capture a log of OpenGL function calls made by your Screen applications. You can start this utility to capture the trace at any time. See gltracelogger.

Once you've captured a log of OpenGL function calls, use the gltraceprinter utility to print the trace file to console or to file for analysis. See gltraceprinter.

apitrace

You can also use apitrace, a set of open-source tools, to diretly analyze the trace files (output from gltracelogger). The set of tools that apitrace provides can be used on various hosts (e.g., Windows, Linux, or MacOS). See https://github.com/apitrace/apitrace for more information.

Before you can use apitrace's tools to analyze your trace files, you must first capture a trace file that includes all OpenGL calls from the launch of your Screen application (i.e., from before the application renders the first frame). To start tracing the application from the beginning, you'll need to set the GLTRACE_WAIT_FOR_CLIENT environment variable that forces the application to wait for the gltracelogger to start. See gltracelogger for more information on how to set this variable.

Although apitrace tools can't be used to capture any tracing (e.g., apitrace trace, export APITRACE_BACKTRACE), they can be used to analyze an existing trace file that you've captured with gltracelogger. These other tools you can use for analysis include:

See https://github.com/apitrace/apitrace/blob/master/docs/USAGE.markdown for more information.

Global configuration

You can check the configuration of the globals subsection of the winmgr section at anytime. The configuration parameters and their associated values are stored in the /dev/screen/0/globals file. The file will always contain the most recent configuration, even if the values were updated. For example, your configuration may look like this:

# cat /dev/screen/0/globals                              
=================================================
alloc-config =
blit-config = vivante
blit-defer = True
blits[0]-logsize = 4080
default-display = 2
gesture-logsize = (none)
idle-timeout = (none)
input = keyboard mouse gamepad joystick
input-logsize = 8K
joystick-logsize = (none)
keyboard-logsize = (none)
keymap = /usr/share/keyboard/en_US_101
pointer-logsize = (none)
queue-start_size = 200
requests-logsize = 64K
shortcuts = (none)
stack-size = (none)
touch-logsize = (none)
winmgr-mode = (none)