Application Profiler

Application Profiler is a QNX tool that lets you view the profiling results generated by instrumented binaries that run on a QNX Neutrino target. The results tell you who called each function and how often, how much time was spent in each function, and how much CPU time individual lines of code used. This helps you locate inefficient areas in your code without following its execution line by line.

How to configure Application Profiler

When the launch mode is Profile, the Application Profiler is enabled by default. For other launch modes, you can enable the tool manually through the launch configuration Tools tab (Run, Debug, or Attach modes) or Memory tab (Memory mode). It isn't supported by the Coverage or Check modes.

The tool settings are organized into the following four panels:
  • Profiling Method
  • Profiling Scope
  • Options
  • Control

Profiling Method

Functions Instrumentation
Provides precise function runtimes, based on function entry and exit timestamps reported by the binary. The binary must be compiled and linked with the options described in Enabling function instrumentation.
Sampling and Call Count Instrumentation
Provides runtime estimates based on statistical position sampling driven by timer interrupts. Your application doesn't need to be recompiled, but it must run long enough to collect a statistically meaningful number of samples; otherwise, the estimates are inaccurate. Also, backtrace information (call stacks) may not be available. To see call counts, the binary must be compiled and linked with the option described in Enabling call count instrumentation.

Profiling Scope

Single Application
Allows you to profile a specific process for an extended period of time, but doesn't provide information about context switches.
System Wide
Generates profiling events as kernel log events so that later you can use the System Profiler tool to navigate the data. When this option is enabled, the IDE initiates a kernel event trace when launching the application. The trace captures kernel activity from all processes on the target. Any kernel event trace should be limited to a few seconds because longer traces produce too much data to be useful. When this option is enabled, you must also enable the System Profiler by checking its box at the bottom of the launch configuration tab.
This option is available only when the Profiling Method is Functions Instrumentation.

Options

These options are available only when the Profiling Method is Functions Instrumentation and the Profiling Scope is Single Application.

Data transfer method
  • Save on the target, then upload — Saves all profiling data on the target then uploads the data file to the host when the program finishes. This setting means you can't see results as they come in, but avoids the overhead of sending data over the network while the program runs.
  • Upload while running — Sends profiling data to the host while the process runs on the target. This setting lets you see results as they come in, without waiting for the program to finish, but imposes more overhead.
Path on target for profiler trace
Specifies the location of the profiling results file on the target. You can include the string ${random}, which the IDE replaces with a random number. The IDE can perform this replacement in multiple, simultaneous sessions.
You can enter either an absolute path that includes the filename or a relative path. If you specify a relative path, the file is written to /tmp.
Remove on exit
Removes the profiling results file from the target when the session is complete. If the IDE has substituted a random number for the string ${random} in the filename, the file is permanently removed.
Use pipe
Creates a pipe file on the target instead of a regular file. A pipe file stores the collected data using less memory and disk space. To use this option, ensure that the pipe utility is running on the target. For more information, see pipe in the Utilities Reference.
The IDE can create the file on the real filesystem only. It can't create it on /dev/shmem/.
Profiling Counter
Specifies which counter the instrumentation code uses to determine the current time at function entry or exit points.
  • Time: Default — Use ClockCycles() on single-core systems and the realtime clock on multi-core systems.
  • Time: Clock Cycles — Use ClockCycles() for multi-core, which is faster and has better resolution than the realtime clock. Requires threads to be bound to the same CPU.
  • Time: Clock Monotonic — Use the realtime clock, which reports the monotonic count of elapsed time since the system booted.
  • Time: Clock Process — Use the process time clock.
  • Time: Clock Thread — Use the thread time clock.
  • Memory: Allocated Heap — Use allocated heap memory as a profiling counter (enables memory profiling).

Control

These options are available only when the Profiling Method is Functions Instrumentation.

Automatically start profiling
Specifies whether profiling starts automatically or only after you click Resume Profiling in the Analysis Sessions view. If you uncheck this box, you must define signals for starting and stopping profiling by checking the box just underneath. Otherwise, no profiling results are captured.
Install start/stop hooks
Enables the Resume Profiling and Pause Profiling controls in the Analysis Sessions view. If your code already uses the default signal numbers that start and stop profiling, you can specify different values.
  • Pause signal number — Specifies the signal number that pauses profiling.
  • Resume signal number — Specifies the signal number that starts or resumes profiling.
Note: Below the four panels, the Configure Shared Libraries Paths link takes you to the Libraries tab. Use this tab to define the path of any shared library for which you want to see symbol information in the Application Profiler results.

How Application Profiler results are presented

When you launch an application with the Application Profiler enabled, the IDE switches to the QNX Analysis perspective and opens the Execution Time view, which shows the application's execution time by function. In the Analysis Sessions view, the IDE creates a session for storing the profiling results.

Analysis Sessions view

This view displays all sessions for any analysis tool run from the IDE using the current workspace. Each open profiling session has a header that contains the tool's open session icon (Icon: Open Application profiling session), the binary name, the session number, and the launch time.

Select the header to display function information from all components in the Execution Time view. To filter the results, expand the header and select the appropriate item. For example, to display results for functions in your program code but not in any libraries that it uses, click the application binary.

The view toolbar provides Resume Profiling (Icon: Resume profiling session) and Pause Profiling (Icon: Pause profiling session) controls that allow you to resume and pause profiling. You can specify the signals that act as start and stop hooks for these controls in the Control panel of the launch configuration.

The Take Snapshot (Icon: Take profiling session snapshot) button captures the current data without stopping the profiling activity. You can then compare the snapshot data with the final results or other snapshots from the same session. For details on comparing results from distinct sessions, see Comparing profiling session results.

Execution Time view

This view displays the results of profiling sessions, using a table. The table has these default columns:
  • Name — The name of the function.
  • Deep Time — The time it took to execute the function and all of its descendants. This metric is also referred to as Total Function Time. It is the pure realtime interval from when the function started until it ended, which includes its shallow time, the sum of its children's deep times, and all time in which the thread wasn't running while blocked inside of it. If the function is called more than once, this column contains the sum of all runtimes when it's called from a particular stack frame or parent function.

    For Sampling and Call Count Instrumentation mode, this column isn't used.

  • Shallow Time — For Functions Instrumentation mode, this column is the deep time minus the sum of the children's deep times. It roughly represents the time spent in this function only. However, it also includes the time for kernel and instrumented library calls and for profiling the code.

    For Sampling and Call Count Instrumentation mode, it's an estimated time, calculated by multiplying an interval time by the count of all samples from this function.

  • Count — The number of times the function was called.
  • Location — The location of the function in the code.
To add or remove columns, click the View Menu dropdown (Icon: Open Application profiling session) in the view toolbar, then click Preferences. You can add or remove any of the following columns:
  • Percent — The percentage of Deep Time compared to the Total Time (or compared to the Root Node Time).
  • Average — The average time spent in the function.
  • Max — The maximum time spent in the function.
  • Min — The minimum time spent in the function.
  • Time Stamp — A timestamp indicating the last time the function was called, if present.
  • Binary — The binary filename.
Note: You can right-click any table row and use the context menu to show which functions the current function called, which functions called it, its place in the program's call graph, and a tree map illustrating the relative time spent inside it. You can also navigate from the menu to the function in the source code. For more information, see “Viewing call information”.
The Go Back and Go Forward toolbar buttons let you move between the context menu action results and the original display. The other actions supported by the toolbar include but aren't limited to:
  • Disable Automatic Refresh (Icon: Disable Automatic Refresh) — Pause the data display in its current state until you unlock it.
  • Refresh (Icon: Refresh Now) — Refresh the data display.
  • Take Snapshot and Watch Difference (Icon: Take Snapshot) — Create a profiling session that is a snapshot of the current data. Later, you can compare the results of two profiling sessions to see the effects of any changes you made between execution runs.
  • Show Threads Tree (Icon: Show Threads Tree) — Display a graphical representation of the threads and calling functions within your application. This display helps you see exactly where the application spends its time and which functions are most used. You can drill down to see details of the lowest function calls.

    Screenshot of Execution Time view showing sampling-based function runtimes in a thread-based tree

    Information about individual threads is not available for postmortem profiling. Instead, only one thread (with all time assigned to it) is displayed.

  • Show Table (Icon: Show Table) — Display a list of functions within your application. This display helps you identify which functions take the longest to complete. In Functions Instrumentation mode, calls to C library functions such as printf() are not displayed.

    Screenshot of Execution Time view showing precise deep and shallow function runtimes in a thread-based tree

Annotated source

If your executable binary has debug information and your host has the source file for a particular function, you can double-click its entry in the Execution Time view to open an annotated version of the code in the editor. The editor shows solid green and graduated blue-yellow bars in the left margin.

Screenshot of the annotated source editor showing bars in the left margin to represent the times spent executing a function or individual lines
CAUTION:
You may receive incorrect profiling data if you changed the source since the last time you compiled. This is because the annotated editor relies on the line information provided by the debug version of your binary.
The length of the bar represents the percentage of total execution time. The color represents details specific to the function or single line, as follows:
Blue-Yellow
For the first line of a function, a blue-yellow bar shows its total runtime. This is the time it took to execute the function and all of its descendants. It includes the function's shallow time, the sum of its children's deep times, and all time in which the thread wasn't running while blocked inside of it.
Green
For a line of code, a green bar shows the amount of time for the inline sampling or call-pair data. The lengths of the green bars within a function add up to the length of the blue-yellow bar on its first line.
To view quantitative profiling data, hover the pointer over a colored bar. The resulting tooltip shows the total number of milliseconds and total percentage of execution time for that code. For children, the tooltip shows the percentage of time relative to the parent.

Screenshot of the annotated source editor showing tooltip information when the pointer is hovered over a child function

Viewing call information

You can right-click a table row and then click a menu option to show more information about the current function. The menu includes the following options:

Show Calls

To view a list of all of the functions called by a selected function, right-click a function in the Execution Time view, then click Show Calls.

This call tree view lets you drill into specific call traces to analyze which ones have the greatest performance impact. You can set the starting point of the call tree view by drilling down from a thread entry function to see how the actual time is distributed for each of its function descendants.

For a description of the columns that the call tree view provides, see “Execution Time view”.

Show Reverse Calls

To view what is calling a specific function and how its time was distributed for each of those callers, right-click a function in the Execution Time view, then click Show Reverse Calls. You can use a reverse call tree to either drill up or down the stack to view the callers and their contribution time, until you encounter a thread entry function.

Show Call Graph

A call graph shows a visual representation of how the functions are called within the project.

To create a call graph for the selected profile, in the Execution Time view, right-click a function, then click Show Call Graph.

This call graph shows a pictorial representation of the function calls. The selected function appears in the middle, in blue. On the left, in orange, are all of the functions that called this function. On the right, also in orange, are all of the functions that this function called.

To see the calls to and from a function, click a function directly in the call graph.

If you position your cursor over a function in the graph, Deep Time, Percent, and Count information is displayed for that function, if available.

Note:

You can show the call graph only for functions that were compiled with profiling enabled.

Comparing profiling session results

Comparison mode displays the differences between two profiling sessions so that you can see the effects of changes made to your application. In comparison mode, the values that the Execution Time view displays are time differences instead of absolute times.

To compare two sessions, in the Analysis Sessions view, select the sessions, right-click, and then click Compare.

For example, you can compare two profiles to evaluate results before and after you have optimized a function. In comparison mode, each column shows the change in values from the older session to the newer one (new value minus the old value). If there's no new value match for an item, its old value is used.

Screenshot of Execution Time view showing sampling-based function runtimes in a thread-based tree
  • Icon: Compare time decreased — Value has decreased.
  • Icon: Compare time increased — Value has increased.
  • Icon: Compare item in first session only — Item is present in the first session only.
  • Icon: Compare item in second session only — Item is present in the second session only.

To hide insignificant results (for example, <1% of difference) or apply other filters, click the View Menu dropdown (Icon: Open Application profiling session) in the view toolbar, then click Filters.

Filtering results

Use the following methods to view more information: