Mapping and isolating client CPU load from server CPU load

There are many cases where high CPU load is traced back to server activity. However, in most cases what is required to reduce this CPU load isn't to make the servers more efficient, but to look more closely at the client activity that is causing the servers to run.

Requirements

Note: Make sure you've read and understood Locating sources of high CPU usage before examining this use case.

In addition to the Neutrino RUNNING thread state, the log must contain the communication events SEND/RECEIVE/REPLY|ERROR. These communication events are used to establish the relationship between clients and servers.

Procedure

QNX Neutrino systems rely heavily on client/server communications most often performed via message passing. In these systems, clients send requests to servers asking them to do work on their behalf such as shown:

System Profiler: servers and clients

Here, A's real CPU usage would be considered to be 2 units of time, B's as 10, and C's as 2 units of time. Since B and C are both acting as servers, they really execute only when there are clients generating requests for action. Most standard CPU Usage metrics don't take this type of on behalf of work into consideration. However, if the goal of a kernel log file investigation is to locate the source or sources of CPU load, then this type of metric is invaluable in assigning blame for high CPU usage.

The System Profiler provides the Client/Server CPU Statistics view to help extract this type of on behalf of metric. You can activate this view via the Window > Show View > Other… > Client/Server CPU Statistics .

Once activated, the Client/Server CPU Statistics are gathered on demand, by default, targeting the full range of the target log file:

System Profiler: Gather statistics

The default display of this view shows the simplified view that displays the RUNNING time (slightly different from the CPU Usage in that it doesn't remove the time spent interrupted by interrupt handlers) that CPU consumers are consuming directly, indirectly, and summed together as a total:

System Profiler: RUNNING time

In this case, it's clear that while the qconn- Thread 1 isn't consuming much CPU on its own, it's imposing a significant amount of time on the system indirectly. If you compare this data to what the CPU Usage editor pane displays, you'll see the difference in what's reported:

System Profiler: qconn Thread 1

In the CPU Usage table, procnto- Thread 8 ranks ahead of qconn- Thread 1 in its usage. However, proncto is a pure server process, so we know that it consumes no CPU resources without being solicited to do so. We suspect that perhaps qconn- Thread 1 is driving procnto- Thread 1.

We can confirm this suspicion by looking at which servers qconn- Thread 1 is imposing CPU usage on. You can configure the Client/Server CPU Usage view to display all of the CPU consumers that are being imposed on (and by whom) by selecting Show all times from the view's dropdown menu:

System Profiler: Show all times

The Client/Server CPU Usage view table changes to show all of the imposed-on servers that clients are communicating with. The servers are listed in the columns and the clients in the Owner column. Note that this may result in a table with many columns (imposed on servers):

System Profiler: imposed-on servers

Here we can see that in fact nearly all of the time that procnto- Thread 8 is spending consuming CPU is due to requests coming from qconn- Thread 1, with only a minimal amount being imposed on it by another qconn thread, qconn- Thread 6.

This is to be expected, since in order to query the system, the qconn process must communicate with the kernel to extract the system state and all the process and thread information.