Caution: This version of this document is no longer maintained. For the latest documentation, see http://www.qnx.com/developers/docs.

Profiling an Application

This chapter shows you how to use the Application Profiler tool.

Introducing the Application Profiler

The QNX Application Profiler perspective lets you examine the overall performance of programs, no matter how large or complex, without following the source one line at a time. Where a debugger helps you find errors in your code, the QNX Application Profiler helps you pinpoint areas of your code that could run more efficiently.


Figure: The QNX Application Profiler perspective.

By default, the Application Profiler perspective includes these main views:

Types of profiling

The QNX Application Profiler lets you perform:

Sampling doesn't require instrumentation and has low overhead, but your application needs to run for a long time for you to get sound data.

Sampling and Call Count requires a compiler flag and a linker flag, and has more overhead.

Function Instrumentation requires a compiler flag and a linker flag, and has even more overhead.

Statistical sample profiling (sampling)

The QNX Application Profiler takes "snapshots" of your program's execution position every millisecond and records the current address being executed. By sampling the execution position at regular intervals, the profiling tool quickly builds a summary of where the system is spending its time in your code.

With statistical sample profiling, you don't need to use instrumentation, change your code, or perform any special compilation. The profiling tool profiles your programs unobtrusively, so that it doesn't bias the information it's collecting.


Note: The results are subject to statistical inaccuracy because the profiling tool works by sampling. Therefore, the longer a program runs, the more accurate the results.
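As a rough illustration of how sampled data turns into time estimates, the sketch below multiplies a function's sample count by the sampling interval (1 ms, as described above) and computes each function's share of all samples. The type and function names (sample_bucket, estimate_ms, share_percent) are hypothetical and aren't part of any QNX API.

```c
#include <assert.h>
#include <stddef.h>

/* Sampling interval described above: one snapshot per millisecond. */
#define SAMPLE_INTERVAL_MS 1.0

/* Hypothetical record: how many snapshots landed in one function. */
struct sample_bucket {
    const char *function;   /* function name, for reporting */
    unsigned long samples;  /* snapshots whose address fell in this function */
};

/* Estimated time spent in a function: sample count times the interval. */
double estimate_ms(const struct sample_bucket *b)
{
    assert(b != NULL);
    return (double)b->samples * SAMPLE_INTERVAL_MS;
}

/* Share of all samples that fell in bucket i, as a percentage. */
double share_percent(const struct sample_bucket *buckets, size_t n, size_t i)
{
    unsigned long total = 0;

    assert(buckets != NULL && i < n);
    for (size_t k = 0; k < n; k++)
        total += buckets[k].samples;
    return total == 0 ? 0.0
                      : 100.0 * (double)buckets[i].samples / (double)total;
}
```

With only a handful of samples, a single extra snapshot shifts these numbers noticeably, which is why longer runs give more trustworthy results.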

Function Instrumentation profiling

This method provides you with precise function run-time information for your project. It works best with a single thread, because with many threads the overhead of the measurement can change the application's behavior.

To enable instrumentation, compile each source file with the option -finstrument-functions. This gcc option instructs the compiler to generate a call to the profiling function just after the entrance to, and just before the exit from every application function, which permits the collection of profiling information. Profiling functions are defined in the libprofilingS.a library; to access these, link the binary or library with the -lprofilingS option.
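To make the mechanism concrete: with -finstrument-functions, gcc emits calls to two hook functions, __cyg_profile_func_enter() and __cyg_profile_func_exit(), around every instrumented function. When you link with -lprofilingS, the profiling library supplies these hooks. The stand-in versions below only count calls and track nesting depth; they're an illustrative sketch, not a description of how the QNX library is implemented, and the counter names are hypothetical.

```c
#include <assert.h>

/* Hypothetical counters, for illustration only. */
static unsigned long enter_count;
static unsigned long exit_count;
static int depth;

/* The hooks themselves must not be instrumented, or they would recurse. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));

/* Called just after entry to each instrumented function. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)this_fn;    /* address of the function being entered */
    (void)call_site;  /* address it was called from */
    enter_count++;
    depth++;
}

/* Called just before each instrumented function returns. */
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    assert(depth > 0);
    exit_count++;
    depth--;
}
```

A file like this is useful only for experimenting with the compiler option on its own; don't define your own hooks in a program that links against the profiling library.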


Note:

For an application that intends to use an instrumented library as a DLL (i.e. using a dlopen() call), compile the library with the -Wl,-E option.


Sampling and Call Count instrumentation profiling

This type of profiling is a combination of sampling mode and Call Count instrumentation data, and it provides per line statistical coverage (as well as a call graph at the same time), with relatively small overhead.

To instrument a binary or library in this mode, use the -p option for both compiling and linking. The -p option for the compiler prepares the binary for profiling (the compiler will then insert code before each function to gather call information); however, it won't cause the profiling versions of the libraries to be linked in. To link in the profiling versions from the libc library, use the -p option for the linker.

If you compile and link with either the -pg or -p option, when the executable program runs, either gprof or prof monitors the program and produces a report file called gmon.out. The gprof utility can't report information about program calls to routines from a precompiled library (such as libc) that weren't compiled with the -pg option. Consequently, the resulting profiling information won't include data about calls made to those routines (for example printf()).

If most of the execution time occurs in various library routines, then this fact will likely reduce the value of the profiling results, since there is no indication in the results of where the call was made. In this case, you can use Function Instrumentation profiling, which causes this additional time to be charged to the higher-level routine that called the library function.

Postmortem profiling for Call Count and Function Instrumentation profiling

The IDE lets you examine profiling information from an output file produced by an instrumented application (i.e. gmon.out). The tool provides you with all of the information collected at runtime, but in a graphical format.

Postmortem profiling supports data generated by gprof (gmon.out), the QNX profiler library (.ptrace), and the trace logger (.kev).

For more information about the gprof utility, go to www.gnu.org; for qcc, see the Utilities Reference.

Profiling your programs

Whether you plan to profile in real time or postmortem, for instrumented profiling you'll need to build your programs with profiling enabled before starting a profiling session.


Note: If you already have a gmon.out, .kev, or .ptrace file, you're ready to start a postmortem profiling session.

Building a program for profiling

Although you can profile any program, you'll get the most useful results by profiling executables built for debugging and profiling. The debug information lets the IDE correlate executable code and individual lines of source; the profiling information reports call graph data or precise function time measurements.


Note:

Sampling and Call Count profiling is handled by functions in libc; Function Instrumentation profiling is handled by functions in libprofilingS.a. Check our website occasionally for any updates to these libraries.


Profiling features associated with build variants

This table shows the Application Profiling features supported with the various profiling modes:

Feature | Sampling | Sampling and Call Count | Function Instrumentation
Own Function Time | Yes | Yes | Yes
Thread Time | Yes | Yes | Yes
Start/Stop Profiling | Yes | Yes | Yes
Source Location (if compiled with debug) | Yes | Yes | Yes
Line level editor annotations | Yes | Yes | No
Function calls editor annotations | No | No | Yes
Thread tree mode | Yes | Yes | Yes
Table mode | Yes | Yes | Yes
Call graph mode | No | Yes | Yes
Who calls/Who called | No | Yes | Yes
Calls Count | No | Yes | Yes
No recompile | Yes | No | No
Function backtrace | No | No | Yes
Deep Function time (own + descendants) | No | No | Yes
Timed stack tree | No | No | Yes
Max/Min Time | No | No | Yes

Building with profiling enabled

Profiling an existing project captures performance information that helps you discover the functions that consume the most CPU time. However, to instrument your code, you'll need to change the existing configuration options so that you can build your project with profiling enabled. The IDE then inserts code before each function to gather call information (Call Count instrumentation), or just after each function enters and just before it exits (Function Instrumentation).

To configure profiling for the selected project, depending on your type of project, do one of the following:

Running and profiling a process

To run and profile a process, with qconn on the target:

  1. Create a QNX Application launch configuration for an executable with debug information as you normally would, but don't click OK. You may choose either a Run or a Debug session.

    Note: Debug mode isn't recommended for Function Instrumentation mode, because it can skew the profiling results.

  2. In your launch configuration, click the Tools tab.
  3. Click Add/Delete Tool.... The Select tools to support dialog appears.
  4. Select the Application Profiler tool.
  5. Click OK.
  6. In the Application Profiler mode, select your profiler method, profiler mode, and other options, if applicable.

    Note:

    To run in either Sampling mode or Sampling and Call Count mode, select Sampling and Call Count Instrumentation; to run in Function Instrumentation mode, select Function Instrumentation and Single Application.

    For descriptions of these options, see "Application Profiler tab."


  7. If you want the IDE to automatically change to the QNX Application Profiler perspective when you run or debug, check the Switch to this tool's perspective on launch box.
  8. Click Apply.
  9. Click Run or Debug.

    The IDE starts your program and begins to profile it.

To produce full profiling information with function timing data, you need to run the application as root; this is required when running through qconn.

If you run the application as a normal user, the Application Profiler tool can generate only call-chain information.

You have to specify the Shared library path in two locations: use the Uploads tab in the launch configuration if libraries have to be uploaded every time an application runs, and use the Shared Libraries tab on the Tools tab to specify the host location of libraries so that the IDE can read their debug symbols to show their symbol information.

Since the dynamic library isn't included with the IDE, there is an issue caused by the static linkage of the profiling library. To solve this problem, you'll need to do the following:

Profiling a running process

You can run a process on the target (without the IDE) and collect the profiling information while it's running. In order to collect profiling information, you have to modify the way you normally launch your application by adding environment variables:


Note: If you're launching using the IDE, you can specify the environment variables on the Environment tab in the launch configuration.

To profile a process that's already running on your target:


Note: When you profile a running process, you can't use the Console view in the IDE to interact with this process. If your running process requires user input through the Console view, use a shell to interact with the process.

  1. While the application is running, open the Launch Configurations dialog by choosing Run-->Profile... from the menu.
  2. Select C/C++ QNX Attach to Remote Process via QConn (IP) from the list on the left.
  3. Click the New button to create a new attach-to-process configuration.
  4. Configure things as you normally would for launching the application with debugging.
  5. On the Tools tab, click Add/Delete Tool.... The Tools Selection dialog is displayed.
  6. Select the Application Profiler tool, then click OK. On the launcher, the Application Profiler tab is displayed.

    For descriptions about the options, see "Application Profiler tab."

  7. If you're using Function Instrumentation, make sure that the value in the Path on target for profiler trace field matches the value of QPROF_FILE that you used to run the application.
  8. Select Switch to this tool's perspective on launch.
  9. Optional: In the launcher, click the Shared Libraries tab.

    The IDE doesn't know the location of your shared library paths, so you must specify the directory containing any libraries that you wish to profile. For a list of the library paths that are automatically included in the search path, see the appendix Where Files Are Stored.

  10. Click Apply, and then click Run. The Select Process dialog shows all of the currently running processes.

  11. Select the process you want to profile, and then click OK.

Postmortem profiling for Call Count and sampling

Postmortem profiling lets you analyze the data generated by the profiling process at a later time. The IDE lets you profile your program after it terminates, using the traditional gmon.out file; however, postmortem profiling doesn't provide as much information as profiling a running process because:

Profiling a gmon.out file involves these basic steps:

Gathering profiling information

To gather profiling information in a gmon.out file, you need to specify the PROFDIR environment variable before launching your application.

If you're launching from the command line, type the following:

PROFDIR=/tmp ./appname

To launch from the IDE:

  1. Create a launch configuration for a debuggable executable as you normally would, but don't click Run or Debug.

    Note: You must have the QNX Application Profiler tool disabled in your launch configuration.

  2. Click the Tools tab and deselect the Application Profiler tool, and click OK.
  3. Select the Environment tab.
  4. Click New.
  5. In the Name field, type PROFDIR.
  6. In the Value field, enter a valid path to a directory on your target machine.

    Note: This path must be a valid location on the target machine; otherwise, you'll receive a warning message indicating that the IDE was unable to open the gmon.out file for output.

  7. Click OK.
  8. Run your program. When your program exits successfully, it creates a new file in the directory you specified. The filename format is pid.fileName (e.g. 3047466.helloworld_g). This is the gmon.out profiler data file.
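The naming rule in step 8 can be expressed as a short sketch. Given the PROFDIR value, the process ID, and the executable name, it builds the path where you'd expect to find the data file. The helper name gmon_path is hypothetical; it only formats a string and doesn't touch the target.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format "<profdir>/<pid>.<name>" into buf, adding the separator only
 * if profdir doesn't already end with one.
 * Returns 0 on success, -1 if the result doesn't fit in buf.
 */
int gmon_path(char *buf, size_t size, const char *profdir,
              long pid, const char *name)
{
    size_t len;
    const char *sep;
    int n;

    assert(buf != NULL && profdir != NULL && name != NULL);
    len = strlen(profdir);
    sep = (len > 0 && profdir[len - 1] == '/') ? "" : "/";
    n = snprintf(buf, size, "%s%s%ld.%s", profdir, sep, pid, name);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

For the example in step 8, calling gmon_path() with "/tmp", 3047466, and "helloworld_g" yields /tmp/3047466.helloworld_g.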

Transferring a file

You can import .gmon, .kev, .ptrace, or .xml data files using the Import action from the session view, or using the Import wizard:

  1. Open the Target File System Navigator view (Window-->Show View-->Other...-->QNX Targets-->Target File System Navigator).
  2. In the Target File System Navigator view, right-click your file and select Copy to...-->Workspace. The Select target folder dialog appears.
  3. Select the project related to your program.
  4. Click OK.
  5. In the C/C++ Projects view, right-click your file and select Import into QNX Application Profiler. The Program Selection dialog appears.
  6. Select the binary that generated the file.
  7. Click OK. You can now profile your program in the QNX Application Profiler perspective.

Postmortem profiling for Function Instrumentation

To create a .ptrace file, run your application with the QPROF_FILE environment variable set to the path of the trace file. For example, to launch from the command line, type:

  QPROF_FILE=/tmp/app.ptrace ./appname

To launch from the IDE:

Application Profiler tab

The descriptions for the launch options for the Application Profiler tab are:

Function Instrumentation
Capture detailed information about function behavior at run time. When selected, the profiling method is considered instrumented (function instrumented).
Sampling and Call Count Instrumentation
Provide statistical information based on probes driven by the timer interrupt.
Single Application
Profile a single process for a specific period of time; however, information about the context switches is not available.
System Wide
Generate profiling events as kernel log events so that later you can use the System Profiler tool to navigate the data. This means that the IDE doesn't monitor a specific program; it monitors all the processes that execute on a specific set of CPUs. Selecting this option generates only a few seconds' worth of data because of the large amount of data captured within that period of time. In order to capture kernel log events, you must enable System Profiling at the same time. To enable System Profiling, from the Tools tab for your launch configuration, Click Add/Delete Tool..., select the Kernel Logging tool, and then click OK.
Save on the target, then upload
Save the data on the target machine, and then upload the results.
Upload while running
Transfer the data while the process is running.
Path on the target for profiler trace
Define the location on the target machine of the profiler trace results file. The string ${random} is replaced with a random number, which lets several sessions run simultaneously.
Remove on Exit
Remove the resulting profiler trace file from target after the session ends.
Use Pipe
Create a pipe file on the target machine instead of a regular trace file. To use this option, the pipe daemon must be running on the target machine, and the file can only be created on the real filesystem (i.e. not /dev/shmem).
Install start/stop hooks
In Function Instrumentation mode, install a signal handler to support profiler start/stop.
Automatically start profiling
When disabled, profiling doesn't start until you explicitly start it.
Pause signal number
The signal that pauses profiling data capture.
Resume signal number
The signal that resumes profiling data capture.
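
The pause and resume signals can be pictured with a small sketch. It shows one generic way a process could toggle data capture from signal handlers; it isn't the profiling library's actual implementation. The use of SIGUSR1 and SIGUSR2 in the usage note, and the names capture_enabled and install_pause_resume_hooks, are assumptions made for illustration.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical flag: nonzero while profiling data is being captured. */
static volatile sig_atomic_t capture_enabled = 1;

/* Handlers only flip the flag, which is safe to do from a signal handler. */
static void on_pause(int signo)  { (void)signo; capture_enabled = 0; }
static void on_resume(int signo) { (void)signo; capture_enabled = 1; }

/*
 * Install handlers for the chosen pause and resume signal numbers.
 * Returns 0 on success, -1 if either handler can't be installed.
 */
int install_pause_resume_hooks(int pause_signo, int resume_signo)
{
    assert(pause_signo != resume_signo);
    if (signal(pause_signo, on_pause) == SIG_ERR)
        return -1;
    if (signal(resume_signo, on_resume) == SIG_ERR)
        return -1;
    return 0;
}
```

After calling install_pause_resume_hooks(SIGUSR1, SIGUSR2), sending SIGUSR1 to the process clears the flag and sending SIGUSR2 sets it again; the capture code would check the flag before recording data.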

Controlling your profiling sessions

The Profiler Sessions view (Window-->Show View-->Other...-->QNX Application Profiler-->Profiler Sessions) lets you control multiple profiling sessions simultaneously. You can:


Figure: The Profiler Sessions view.

From the Debug tab, you can see more detail about the session:


Figure: The Debug tab for profile sessions.

The Profiler Sessions view shows the following as a hierarchical tree for each profiling session:

Type | Description
Session ID | A consecutive identifier assigned to each profiler session.
Session Name | The launch instance name (e.g. ApplicationProfiling).
Session State | The current state of the session (open or closed).
Session Timestamp | The date and time the session was created.

The icons that appear in the Profiler Sessions view are:

Name Icon
Running Process Icon: Running process
Executable Icon: Executable
Shared libraries Icon: Library
DLLs Icon: DLL
Unknown Icon: Unknown

Note:

A node named Unknown refers to a container for code that doesn't belong to any binary or library. Usually, this type refers to kernel code mapped to process virtual memory.

For Sampling and Call Count profiling, not all shared libraries or the binary appear in the tree view. The view can include only those libraries and binaries that were instrumented with Call Count instrumentation, or those that have corresponding samples during the execution. If the application runs for a short period of time (less than ten seconds), a library might not even have a single probe.

For Function Instrumentation profiling, only instrumented binaries and libraries are displayed in the tree view. System libraries, such as libc, never appear in the view.


To choose which executable or library to show information for in the Execution Time view:
In the Profiler Sessions view, click one of the following:

To terminate an application running on a target:

  1. In the Debug view, select a launch configuration.
  2. Click the Terminate button (Icon: Terminate) in the title bar of the corresponding Console view.

    Note: To clear old launch listings from this view, click the Remove All Terminated Launches button (Icon: Remove All Terminated Launches).

To disconnect from an application running on a target:

  1. In the Debug view, select a running profiler session.
  2. Select a QNX Application Profiler Service.
  3. Click the Disconnect button (Icon: Disconnect) in the view's title bar.

    Note: To clear old launch listings from this view, click the Remove All Terminated Launches button (Icon: Remove All Terminated Launches).

Understanding your profiling data

Other views within the QNX Application Profiler perspective show the profiling information for each item you select in the Profiler Sessions view.

This view: | Shows:
Profiler Sessions | Application Profiler sessions
Execution Time | Function Instrumentation or Call Count
Debug | Target debugging in a Debug tree hierarchy view and the Application Profiler Debug view
Annotated source editor | The amount of time your program spends on each line of code and in each function
Properties | Session or item properties

After you profile

After gathering the profiling data, you can switch to the Application Profiler perspective and begin to analyze the data. For Function Instrumentation, the Execution Time view shows precise function execution times and a runtime call graph. For Call Count profiling, it shows the time for each function.

Profiler Sessions view

The Profiler Sessions view contains the sessions for the profiler instances. The other views within the QNX Application Profiler perspective are updated to show the profiling information for each item that you select from this Profiler Sessions view.

Toolbar options

Icon | Name | Go to
Icon: Pause profiling | Resume Profiling | Pausing and resuming a profiling session
Icon: Pause profiling | Pause Profiling | Pausing and resuming a profiling session
Icon: Taking a snapshot of an Application Profiler session | Take Snapshot of the running session | Taking a snapshot of a profile session
Icon: Creating a sample Application Profiler session | Create a Sample Session | Creating a sample profile session
Icon: Exporting an Application Profiler session | Export Application Profiler Session | Exporting a profiler session
Icon: Importing an Application Profiler session | Import Application Profiler Session | Creating a profiler session by importing profiler data

Pausing and resuming a profiling session

Occasionally, having too much data is the same as having no data at all. You can take control of when to enable profiling during the execution of an application using the Pause and Resume icons in the toolbar.

Taking a snapshot of a profile session

This feature lets you freeze the current state of the Application Profiler data while the actual session data keeps changing. The snapshot data remains frozen and can later be compared with the final results, or other snapshots of the same session. However, in the Execution Time view, this action also automatically switches to a comparison mode to dynamically show the updated difference between the current state and the snapshot.

To take a snapshot:
In the Profiler Sessions view, select a running profile and click the Take Snapshot of the running session icon in the toolbar.
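
Conceptually, the comparison mode subtracts the frozen snapshot values from the current ones. The following sketch shows that arithmetic for per-function times. The record layout and the name snapshot_diff are hypothetical and say nothing about how the IDE stores its data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-function record; times are in nanoseconds. */
struct fn_time {
    const char *name;
    long long time_ns;
};

/*
 * For each function, store (current - snapshot) in delta[i].
 * Both record arrays must list the same n functions in the same order.
 */
void snapshot_diff(const struct fn_time *snapshot,
                   const struct fn_time *current,
                   long long *delta, size_t n)
{
    assert(snapshot != NULL && current != NULL && delta != NULL);
    for (size_t i = 0; i < n; i++)
        delta[i] = current[i].time_ns - snapshot[i].time_ns;
}
```

A delta of zero means the function hasn't accumulated any time since the snapshot was taken.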

Creating a sample profile session

A sample profile session will provide you with sample data to quickly evaluate features of the application profiler.

To create a sample profile session:
In the Profiler Sessions view, select a running profile and click the Create a Sample Session icon in the toolbar.

Exporting a profiler session

In the IDE, you can export your profile data from the Profiler Sessions view. The IDE exports the results in the format that you specify during the export.

To export a profiler session:

  1. In the Profiler Sessions view, right-click a profiler session.
  2. Select Export.

  3. Select the session(s) that you want to export.
  4. In the Output File field, specify the name and location for the output file.
  5. In the Output area, select the output type: .csv or .xml.
  6. Click Finish.

Later, you can import data (see Creating a profiler session by importing profiler data), or you can choose to import other session data into System Profiler to review the results (see Using the results from Function Instrumentation mode in the System Profiler).

Debug view

The Debug view shows the target debugging information in a tree hierarchy.


Figure: The Debug view.

The number displayed after a thread label is a reference counter, not a thread identification number (TID).

The IDE shows stack frames as child elements, and it shows the reason for the suspension beside the thread (such as the end of the stepping range, a breakpoint being encountered, or a signal being received). When a program exits, the IDE also shows the exit code.

Execution Time view

This view provides you with valuable decision-making capabilities in that it helps you identify those functions that clearly consume the most CPU time, making them candidates for optimization. This type of instrumentation is the most effective way of optimizing bottlenecks in a single application. This data-collection technique lets you gather precise information about the duration of time that the processor spends in each function, and provides stack trace and Call Count information at the same time.


Figure: The Execution Time view in the Application Profiler perspective.

Using a call tree, you can see exactly where the application spends its time, and which functions are used in the process.

By default, the selected preferences provide you with the basic columns containing valuable profiling data; however, you can specify additional columns and display settings (see "Setting preferences"), if desired.

The Execution Time view supports the following tree views and graph:

Column descriptions

Name
The name of the function. In addition, you can view who called the function, and how much time each function took to execute in the context of a caller.
Deep Time
The time it took to execute the function and all of its descendants. It's the real-time interval from the moment the function starts until it ends, which includes the shallow time of this function, the sum of the children's deep times, and all time in which the thread isn't running while blocked in this function. It isn't used in Sampling mode. It's also referred to as the Total Function Time. When the function is called more than once, it's the sum of the times for all calls from a particular stack frame, or from a particular function.
Shallow Time
For Function Instrumentation mode, it's the deep function time minus the sum of its children's deep times. It roughly represents the time that the processor spent in that particular function only; however, for this type of analysis, it also includes the time for kernel calls, the time for instrumented library calls, and the time for profiling the code. For Sampling mode, it's an estimated time, calculated by multiplying the sampling interval by the count of all samples for the given function.
Count
The number of times the function was called.
Location
The location in the code where the function can be found.
Percent
The percentage of Deep Time compared to the Total Time (or compared to the Root node time).
Average
The average time spent in the function.
Max
The maximum time spent in the function.
Min
The minimum time spent in the function.
Time Stamp
A time stamp assigned to the function, if any (the last time the function was called).
Binary
The file name for the binary.
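
The relationship between Deep Time and Shallow Time in Function Instrumentation mode can be sketched as follows. The call_node structure and the shallow_us() helper are hypothetical; they only illustrate the subtraction described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical call-tree node; times are in microseconds. */
struct call_node {
    const char *name;
    long long deep_us;                 /* function plus all descendants */
    const struct call_node *children;  /* array of direct callees */
    size_t child_count;
};

/* Shallow time = deep time minus the sum of the children's deep times. */
long long shallow_us(const struct call_node *node)
{
    long long children_total = 0;

    assert(node != NULL);
    for (size_t i = 0; i < node->child_count; i++)
        children_total += node->children[i].deep_us;
    return node->deep_us - children_total;
}
```

For example, a function with a deep time of 100 microseconds whose two callees have deep times of 30 and 50 microseconds has a shallow time of 20 microseconds. A leaf function's shallow time equals its deep time.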

Interpreting Tree mode column information by profiling type

The following table describes the meanings for time columns for all data source combinations with visual modes:

Mode | Node | Time | Own Time | Count | Average | Max (Min)
Sampling and/or Call Count | Function (All) | Same as Own Time, invisible | The sum of all probes for a given function | The sum of Count for all Call Samples where given function is "to" | Own Time / Count, or Own Time if count is 0 | N/A
Sampling and/or Call Count | Addressable (All) | Same as Own Time, invisible | The sum of all probes for a given address, or 0 if no probes for a given address (but exists in the Call Counts tree) | The sum of Count for all Call Samples where given function is "to" | Own Time / Count, or Own Time if count is 0 | N/A
Sampling and/or Call Count | Line Probe (Call Tree mode) | Same as Own Time, invisible | The sum of all probes for a given address | 0 | Same as Own | N/A
Sampling and/or Call Count | Call Pair (Call Tree mode, Reverse Call Tree mode) | N/A | N/A | The sum of Call Counts for a given pair | N/A | N/A
Sampling and/or Call Count, Function Instr. | Group Node (Reverse Call Tree Mode, Table Mode) | Same as Own Time | The sum of Own Time for the children | The sum of Count for the children | Time / Count | Max (Min) of children
Function Instr. | Function (All) | The sum of the Total Function Time for each occurrence of this function in a timed call tree, excluding inner recursive frames | The sum of the Own Function Time for all occurrences of this function in a call tree. The Own Function Time for the call tree is the Total Function Time minus the sum of the Total Function Time for all descendants. | The sum of all counts to this function in the call tree | (Time + Rec. Time) / Count | The Max (Min) of the Total Function Time between all occurrences
Function Instr. | Thread (Call Tree mode) | The sum of the total for entry functions (only one entry, but there might be some unattached calls) | Same as Total | 1 | N/A | N/A
Function Instr. | Call Pair (Call Tree mode) | The sum of the Total Function Time for all occurrences of this call pair for a given parent backtrace | N/A | Call Count of this call pair for a given parent backtrace | Time / Count | Max (Min) of this call pair's Total Time for a given parent backtrace
Function Instr. | Self (Call Tree mode) | Same as Own | The parent Total minus the sum of the Total for the siblings | Count of a parent | Own Time / Count | Max (Min) of this call pair's Own Time for a given parent backtrace
Function Instr. | Recursive Call Pair (Reverse Call Tree mode) | N/A | N/A | The sum of Call Counts for a given pair | N/A | N/A
Function Instr. | Call Pair, Thread, Process (Reverse Call Tree mode) | The sum of Total Call Pair time for the Root function for a given stackframe (the child in this tree represents the parent in the call stack) | N/A | The sum of Call Counts for the Root function for a given stackframe | Time / Count | N/A
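
The Average rule that the table gives for Sampling and/or Call Count function nodes has a special case for a zero count. A minimal sketch of that rule, using the hypothetical helper name sampling_average, looks like this:

```c
#include <assert.h>

/*
 * Average for Sampling and/or Call Count function nodes:
 * Own Time / Count, or Own Time itself when no calls were counted
 * (for example, when the code wasn't built with Call Count instrumentation).
 */
double sampling_average(double own_time, unsigned long count)
{
    assert(own_time >= 0.0);
    return count == 0 ? own_time : own_time / (double)count;
}
```

With an Own Time of 12 and a Count of 4, the Average is 3; with a Count of 0, the Average is reported as 12.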

Toolbar options

Icon | Name | Description
Icon: Scroll Lock | Scroll Lock | Pauses the current view of the data, showing the results in a frozen state until you unlock the window.
Icon: Refresh | Refresh | Updates the current view to show the most recent profiling information.
Icon: Take Snapshot and Watch Difference | Take Snapshot and Watch Difference | See "Take Snapshot and Watch Difference."
Icon: Go Back | Go Back | Moves up one level in the tree view hierarchy.
Icon: Go Forward | Go Forward | Moves down one level in the tree view hierarchy.
Icon: Show Threads Tree | Show Threads Tree | See "Show Threads Tree."
Icon: Show Table | Show Table | See "Show Table mode."
Icon: Menu | Menu | Shows the menu of options for this window.

Take Snapshot and Watch Difference

Use the Take Snapshot and Watch Difference icon to create another profiler session that's a snapshot of your program. Later, you can use the Compare feature to compare the profile session data, and then continue to monitor the results as your application runs in another pane.

To access this feature:
From the toolbar menu in the Execution Time view, click the Take Snapshot and Watch Difference icon.

Show Threads Tree

The Show Threads Tree option lets you show a graphical representation of the threads and calling functions within your application. You can drill down to see the detail of the lowest function calls.

To access this tree:
From the toolbar menu in the Execution Time view, click the Show Threads Tree icon.


Figure: The Show Threads Tree view.

You can use this information to:

To view quantitative profiling values:
In the annotated source editor, let the pointer hover over a colored bar. The CPU usage appears, and shows as percentage and time values.

Show Table mode

This mode shows a list of functions from the applications in your project.


Note: In Function Instrumentation mode, it doesn't show calls to functions, such as printf(), in the C library.

To access this table:
From the toolbar menu in the Execution Time view, click the Show Table icon.

A list of functions for the selected profile is displayed in the Execution Time view.


The Show Table Mode view.

From this table, select a function and right-click to choose Show Calls, Show Reverse Calls, Show Call Graphs, or Show Source.

Show Calls

The Call Tree mode shows you a list of all of the functions called by the selected function. This call tree view lets you drill into specific call traces to analyze which ones have the greatest performance impact. You can set the starting point of the call tree view by drilling down from a thread entry function to see how the actual time is distributed for each of its function descendants.

To show a table containing a function and its descendants for the selected profile:
In the Execution Time view, right-click on a function and select Show Calls from the menu.

Column Descriptions

Name
The name of the group or function, or self name and decorator, if applicable.
Time
The duration of time that the thread spends from the moment it enters, until it exits, the function (the sum for all occurrences, by context). The Time column can contain time bar and percent values.
Count
The number of times the function was called.
Own Time
The time spent in the function itself, excluding its descendants.
Average
The Time value divided by the Count value.
Own Average
The Own Time results divided by the Count.
Min
The minimum time.
Max
The maximum time.
Timestamp
The timestamp of the last entry to the function.
Location
The file or line location for the function.
Percent
The value of the result of: (Time/Current Total Time*100)
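The derived columns follow directly from Time, Own Time, and Count. As a minimal sketch of that arithmetic (the `FunctionRow` record, the function name, and the numbers are hypothetical and don't come from the IDE):

```python
from dataclasses import dataclass


@dataclass
class FunctionRow:
    name: str
    time_ms: float      # Time: entry-to-exit duration, summed over all calls
    own_time_ms: float  # Own Time: time excluding descendants
    count: int          # Count: number of calls


def derived_columns(row, current_total_ms):
    """Compute Average, Own Average, and Percent as the column list defines them."""
    average = row.time_ms / row.count if row.count else 0.0
    own_average = row.own_time_ms / row.count if row.count else 0.0
    percent = row.time_ms / current_total_ms * 100 if current_total_ms else 0.0
    return {"Average": average, "Own Average": own_average, "Percent": percent}


# Hypothetical row: 4 calls, 120 ms in total, 30 ms of it in the function itself.
row = FunctionRow("parse_input", time_ms=120.0, own_time_ms=30.0, count=4)
cols = derived_columns(row, current_total_ms=480.0)
print(cols)
```

A large gap between Average and Own Average suggests that most of the cost sits in the function's descendants rather than in the function itself.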

Time columns contain the following features, which you can customize using the Preferences menu option:

Time %
The value of Root Ratio for Time-based columns, and the value of Total Ratio for Own Time-based columns.
Timebar
A visual bar occupying a percentage of the column equal to the total amount of time that a thread spends in a function.

Additional columns:

Own Total Ratio
The value of the result of: (Own Time/Total App. Time*100)
Parent Ratio
The percentage of time for a child node compared to the parent node; not the total time.
Root Ratio
The value of the result of: (Time/Root Time*100)
Binary
The name of the binary container.
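The three ratio columns differ only in the denominator. The following sketch, which uses a hypothetical helper and made-up times, shows how one node's values would be computed from the formulas above:

```python
def ratio_columns(time_ms, own_time_ms, parent_time_ms, root_time_ms, total_app_time_ms):
    """Return the ratio columns (in percent) for a single node of the call tree."""
    return {
        "Own Total Ratio": own_time_ms / total_app_time_ms * 100,
        "Parent Ratio": time_ms / parent_time_ms * 100,
        "Root Ratio": time_ms / root_time_ms * 100,
    }


# Hypothetical node: 50 ms in total, inside a parent that took 200 ms,
# under a root (thread entry function) that took 400 ms.
ratios = ratio_columns(
    time_ms=50.0,
    own_time_ms=100.0 / 4,      # 25 ms of the 50 ms is the node's own time
    parent_time_ms=200.0,
    root_time_ms=400.0,
    total_app_time_ms=800.0,
)
print(ratios)
```

The same node therefore reads as 25% of its parent but only 12.5% of its root, which is why Parent Ratio shouldn't be read as a share of the total time.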

Show Reverse Calls

A reverse call tree shows you what is calling a specific function, and how its time was distributed for each of those callers. You can use a reverse call tree to either drill up or down the stack to view the callers and their contribution time, until you encounter a thread entry function.

To show the reverse call tree for a function:
In the Execution Time view, right-click on a function and select Show Reverse Calls from the menu.
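Conceptually, a reverse call tree inverts the caller-to-callee relation and walks it upward until it reaches a thread entry function. The sketch below illustrates the idea with hypothetical call-pair records; it isn't how the IDE stores its data:

```python
# Hypothetical call-pair records: (caller, callee, time in ms attributed to that call edge)
CALL_PAIRS = [
    ("main", "load_config", 40.0),
    ("main", "process", 300.0),
    ("process", "checksum", 120.0),
    ("load_config", "checksum", 30.0),
]


def reverse_call_tree(pairs, function):
    """Return a nested dict mapping each caller to the tree of its own callers."""
    callers = {}
    for caller, callee, _time in pairs:
        callers.setdefault(callee, []).append(caller)

    def walk(name, seen):
        # `seen` guards against infinite recursion on recursive call chains.
        return {c: walk(c, seen | {c}) for c in callers.get(name, []) if c not in seen}

    return walk(function, {function})


def caller_contributions(pairs, function):
    """Return how much time each direct caller contributed to `function`."""
    return {caller: t for caller, callee, t in pairs if callee == function}


tree = reverse_call_tree(CALL_PAIRS, "checksum")
contrib = caller_contributions(CALL_PAIRS, "checksum")
print(tree)
print(contrib)
```

In this made-up data, `checksum` is reached through two paths that both end at `main`, and the `process` path accounts for most of its time.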

Show Call Graphs

A call graph shows a visual representation of how the functions are called within the project.

To create a call graph for the selected profile:
In the Execution Time view, right-click on a function and select Show Call Graphs from the menu.


A simple example of a call graph.

This call graph shows a pictorial representation of the function calls. The selected function appears in the middle, in blue. On the left, in orange, are all of the functions that called this function. On the right, also in orange, are all of the functions that this function called.

To see the calls to and from a function:
Click on a function directly in the call graph.

Note:

You can show the call graph only for functions that were compiled with profiling enabled.

If you position your cursor over a function in the graph, you will see Deep Time, Percent, and Count information for that function, if any.

For descriptions about these fields, see Field descriptions.


Show Source

Occasionally, you'll want to view the source code for a particular function that might require further investigation. You can easily jump to the source code and compare the profiling results against the actual code to determine if the data is acceptable, or if it's a candidate for further optimization.

To show the source code for a function:
In the Execution Time view, right-click on a function and select Show Source from the menu.

Context menu navigation options

An easy-to-use context menu is available for each node of the tree, table, or call graph. The options available from the context menu are:

Execution Time view features

The Execution Time view includes the following features:

Duplicating the view

You can create a second Execution Time view to see data side-by-side in another window using the menu option Duplicate View. The new view is disconnected from the Profiler Sessions view; however, it maintains its own history. You can use this feature to observe a "snapshot" of your program, and then continue to monitor the results as your application runs in another pane.

To duplicate a view:
In the Execution Time view, click the Menu icon from the toolbar and select Duplicate View.

Viewing history

The Execution Time view keeps a record of where you have been. You can use the Go Back and Go Forward icons from the toolbar, or select a particular entry in the navigation history. You can set the navigation history size in the preferences for the view.

Grouping

The grouping feature helps you organize large function tables, which makes navigation and analysis easier. It's the most efficient way to observe aggregated time results for each software component (binary or file).

To access data grouping:
In the Execution Time view, click the Menu icon from the toolbar and select Group By.
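To illustrate what grouping aggregates, the sketch below sums own time per binary for a handful of hypothetical rows. The function and binary names are invented for the example:

```python
from collections import defaultdict

# Hypothetical table rows: (function, binary, own time in ms)
ROWS = [
    ("main", "myapp", 10.0),
    ("process", "myapp", 50.0),
    ("compress", "libcodec.so", 35.0),
    ("crc32", "libcodec.so", 5.0),
]


def group_own_time(rows):
    """Aggregate own time per binary, as a Group By on the binary would."""
    totals = defaultdict(float)
    for _function, binary, own_time_ms in rows:
        totals[binary] += own_time_ms
    return dict(totals)


by_binary = group_own_time(ROWS)
print(by_binary)
```

Summing own time (rather than total time) avoids counting a descendant's time once for itself and again for each of its ancestors.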


Menu options for grouping.

Setting preferences

You can use the Execution Time View Preference Page to customize the number of columns you want to have in the view, their order, and the format of the data they show in the view.

To set preferences:
In the Execution Time view, click the Menu icon from the toolbar and select Preferences.


Setting user preferences.

For example, you might want to select more columns to add more detailed information to your view:


Additional columns selected for the view.

Copying to the clipboard

At any time, if you want to see the table or tree data in textual format, use your development host's method of copying to obtain the text version of the visible data, which will be copied to your clipboard.

Filtering

When grouping doesn't help reduce the amount of profiling data from the results, you can use filters to remove some rows from the table. Component filtering lets you see only those records related to the specified component, or you can use Data filtering to filter based on timing values.

When filtering is applied, the "<filtered>" element remains in the view as a remainder of the filtered elements, and the total number of these elements is visible in the Count column.

To filter results:
In the Execution Time view, click the Menu icon from the toolbar and select Filters.
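The behavior of the "<filtered>" element can be sketched as follows. This is only an illustration of data filtering on a timing threshold, using hypothetical rows and a hypothetical helper; the IDE's filter dialog offers its own criteria:

```python
def apply_time_filter(rows, min_time_ms):
    """Keep rows at or above the threshold and add a "<filtered>" remainder row."""
    kept = [row for row in rows if row["time_ms"] >= min_time_ms]
    hidden = len(rows) - len(kept)
    if hidden:
        # The Count column of the remainder row holds the number of hidden elements.
        kept.append({"name": "<filtered>", "count": hidden})
    return kept


# Hypothetical function rows
ROWS = [
    {"name": "process", "time_ms": 300.0, "count": 12},
    {"name": "checksum", "time_ms": 150.0, "count": 40},
    {"name": "log_debug", "time_ms": 0.4, "count": 900},
    {"name": "get_flag", "time_ms": 0.1, "count": 64},
]

visible = apply_time_filter(ROWS, min_time_ms=1.0)
print([row["name"] for row in visible])
```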


The Filtering dialog.

Searching

You can perform a text search on the data results from the profile. The Find feature includes a Find bar at the bottom of the Execution Time view. The view automatically expands and highlights the nodes in the tree when the search locates results matching the search criteria.

To search results:
In the Execution Time view, click the Menu icon from the toolbar and select Search.

Annotated source editor

The annotated source editor lets you see the amount of time your program spends on each line of code and in each function.

To open the editor:

  1. Launch a profile session for a debuggable (i.e. _g) executable.
  2. In the Profiler Sessions view, select your program by selecting an Application Profiler instance (Icon: QNX Application Profiler tool) or an executable (Icon: Executable).
  3. In the Execution Time view, double-click a function that you have the source for. The IDE opens the corresponding source file in the annotated source editor:

    Annotated source editor


Note: You may receive incorrect profiling information if you change your source after compiling because the annotated source editor relies on the line information provided by the debuggable version of your code.

The annotated source editor shows a solid or graduated color bar graph on the left side, and provides a tooltip with the total number of milliseconds for the function, the total percentage of time in this function, and, for children, the percentage of time in the function as it relates to the parent.

The length of the bar represents the percentage. On the first line of the function declaration, that bar provides the total for all time spent in the function. The totals include:

The colors on the bars represent:

Green-Yellow
The amount of time for the inline sampling or call-pair data.
Blue-Yellow
The time it took to execute the function and all of its descendants. For the function, it includes the period from the time the function starts until it ends, which includes the shallow time of this function, the sum of the children's deep times, and all time in which the thread isn't running while blocked in this function.
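The deep time described above is a recursive sum. As a rough sketch (the tree layout, field names, and numbers are assumptions made for the example), it can be computed like this:

```python
def deep_time(node):
    """Deep time = shallow time + blocked time + the deep times of all children."""
    children_total = sum(deep_time(child) for child in node.get("children", []))
    return node["shallow_ms"] + node.get("blocked_ms", 0.0) + children_total


# Hypothetical call tree with times in milliseconds
TREE = {
    "name": "main",
    "shallow_ms": 5.0,
    "children": [
        {"name": "read_file", "shallow_ms": 10.0, "blocked_ms": 40.0},
        {
            "name": "parse",
            "shallow_ms": 30.0,
            "children": [{"name": "tokenize", "shallow_ms": 15.0}],
        },
    ],
}

total_ms = deep_time(TREE)
print(total_ms)
```

In this example, `read_file` has a small shallow time but a large deep time because the thread spends most of that call blocked.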

To view quantitative profiling values:
In the annotated source editor, let the pointer hover over a colored bar. The CPU usage appears, shown as percentage and time values.


The QNX annotated source editor.

Using the Application Profiler

If you want to profile an application, you can do the following:

Using Function Instrumentation with the Application Profiler

When you profile a project, you can choose Function Instrumentation to obtain detailed information about the functions within your application. Each function entry and exit is instrumented with a call. The purpose of this is to record the entry and exit time of each function and call sequence.

The profiling options available to you are:

Using Sampling and Call Count instrumentation mode

Sampling mode provides you with profiling information for your project at a specific time interval (the Application Profiler takes samples from processes at a given rate). The information is recorded into a sample that you can use for comparison purposes.


Note:

When you use only sampling mode to obtain data, you'll notice the following:

  • To use basic sampling, you're not required to recompile your application.
  • This mode won't provide you with precise function times, but you can use the data for comparison purposes.
  • The profiler needs to run and gather sample data for a long period of time to produce sound data.
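To see why sampling gives estimates rather than precise function times, consider this small simulation. It's a sketch under stated assumptions: the timeline is made up, and the sampler simply records which function is executing at each fixed interval (1 ms here):

```python
from collections import Counter

# Hypothetical execution timeline: (function name, duration in ms), in order
TIMELINE = [("main", 2.5), ("parse", 10.0), ("main", 0.5), ("render", 7.0)]


def sample(timeline, interval_ms=1.0):
    """Count how many periodic samples land in each function."""
    hits = Counter()
    next_sample = 0.0  # time of the next sample
    now = 0.0          # start time of the current timeline entry
    for function, duration in timeline:
        end = now + duration
        while next_sample < end:
            hits[function] += 1
            next_sample += interval_ms
        now = end
    return hits


hits = sample(TIMELINE)
total = sum(hits.values())
for function, count in hits.most_common():
    print(f"{function}: {count} samples ({count / total * 100:.0f}%)")
```

The second, 0.5 ms visit to `main` falls between two samples and is never observed. Over a long run such misses average out, which is why sampled data is sound for comparison but not for exact timing.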

Launching from the IDE

To prepare your binary for Call Count instrumentation:

  1. Optional: Depending on your type of project, do one of the following to prepare your binary:
  2. Create a launch configuration for your application, and click the Tools tab.
  3. Select Application Profiler and click OK.
  4. From the Application Profiler tab, select Sampling and Call Count Instrumentation.
  5. Select the Single Application option.
  6. Select the Switch to this tool's perspective on launch checkbox.
  7. Run the configuration to begin the profiling process.

Now, your application is launched, as well as the Application Profiler tool. The Application Profiler perspective opens and the Execution Time view shows data from the current session; the view is automatically refreshed.

To customize your Execution Time view if you're running in this mode:

Using Function Instrumentation mode for a single application

This method lets you obtain precise function information at runtime. It performs best for one thread because, when there is more than one thread, the measurement overhead from multiple threads can change the application's behavior.

To compile an application with Function Instrumentation:

  1. Depending on your type of project, do one of the following:
  2. To launch a profiling session:

Using Function Instrumentation in the System Profiler

By using the data from the Function Instrumentation mode in System Profiler, you can:


Application Profiler data in the System Profiler timeline.


Note: If you're missing function names in the System Profiler Timeline view, you may want to consider adding this information by instrumenting your binaries with the Function Instrumentation library, and running in Kernel Events mode. For additional information, see "Using Function Instrumentation mode for a single application."

Launching from the command line on the target machine

To launch from the command line:

  1. Set the following environment variable:

    	QPROF_KERNEL_TRACE=1

    Set this environment variable for each process, or export it for all processes; it won't affect uninstrumented binaries.

  2. Launch one or more processes on the target.
  3. In the IDE, open the System Profiler perspective and run Kernel Logging for several seconds.

    Note: You can use tracelogger to capture events generated by programs compiled with Function Instrumentation.

  4. Open the resulting .kev file in System Profiler editor.
  5. Optional: You can import the .kev file into the Application Profiler perspective from the Profiler Sessions view (Import Application Profiler Session icon), or by using File-->Import to open the Import wizard.

Launching from the IDE

To profile a process:

  1. Create a launch configuration for the binary.
  2. On the Tools tab, select Add/Delete Tools, then select Application Profiler.
  3. Select Kernel Logging.
  4. Click OK.
  5. On the Application Profiler tab, select Function Instrumentation.
  6. For the Project Scope, select System Wide.
  7. Disable the option Switch to this tool's perspective on launch if it's currently selected.
  8. Click Apply.
  9. Switch to the Kernel Logging tab.
  10. Select Launch with Kernel Log capturing.
  11. Select one of the existing System Profiler Kernel Log configurations. If you don't have any, click Edit and create one.
  12. Select the option Switch to this tool's perspective on launch.
  13. Click Apply.
  14. Click the Upload tab.
  15. Deselect Use unique name for the uploaded binary.
  16. Click Apply.
  17. Click Run.

Creating an Application Profiler session

After you've created your launch configuration, you can create an Application Profiler session to profile an application and capture performance information.

Before you start:

To profile in this scenario, follow these steps:

  1. Prepare your projects and launch configuration for the Application Profiler to run:
    1. Enable binary instrumentation for profiling (see "Running with profiling enabled").
    2. Recompile the application.
  2. Launch the session (click either Run or Debug, depending on your launch configuration).
  3. The IDE changes focus to the Application Profiler perspective.

Now, the Application Profiler session is ready for you to use.

Creating a profiler session by importing profiler data

You can create a profiler session by importing .gmon, .kev, or .ptrace files using the Import action from the Profiler Sessions view.

Before you start, you must:

To profile in this scenario, follow these steps:

  1. Run the instrumented binary on the target with profiling enabled (see "Running with profiling enabled").
  2. Transfer the output file to the host machine.
  3. Open the Application Profiler perspective.
  4. In the Profiler Sessions view, perform an Import (see "Importing a file").

The IDE creates a new Application Profiler session and populates both the session and the Execution Time view with the imported data. Now, your Application Profiler session is ready for inspection.

Profiling a single-threaded application

For example, you might have a single-threaded application that performs badly for a specific test case, and you want to understand why and, if possible, optimize it.

Before you start:

To profile the application, follow these steps:

  1. Create an Application Profiler session using the IDE launch configuration:
    1. Enable instrumentation for profiling for your project (see "Building a program for profiling").
    2. Open your desired launch configuration.
    3. Click the Tools tab.
    4. Click Add/Delete Tool....
    5. Select Application Profiler and click OK.
    6. In the Application Profiler options, enable Function Instrumentation, and click Apply.
    7. Return to the Application Profiler tab in the Launch configuration dialog and click Run again. There will be no error message this time.

      The IDE changes to the Application Profiler perspective, populates the session view, and shows the Execution Time view, which changes dynamically.

  2. After the application terminates, inspect the Application Profiler results:
    1. In the Execution Time view, click the Menu icon and select Show Caller Tree.

      The active page shows the Tree containing the list of functions being called.

    2. Expand the root node and observe the functions it called with times, percentages, and call times.
    3. Continue expanding until you encounter any suspicious functions that consume the CPU time.

      Now, you can investigate why certain functions consume the CPU time.

  3. Select the function and perform the Show Caller Tree action.
  4. View the changes to show the function that you want to investigate as the root, and its callers as children (Caller Tree mode).

    Now, you might notice that this function is called from other places as well; however, you need to investigate its total contributions versus the amount of CPU it consumes.

  5. Select another function from the list, right-click on the function and select Show Reverse Calls from the menu.
  6. View the changes to show this function as the root in the hierarchy, and its calling functions as children (Show Call Tree mode).
  7. Observe the number of times that this function is called, the percentage of CPU time it consumes, the number of times its children are called, and the total time.
  8. Open the source code for the function to confirm any suspicions, and to perform any necessary edits to the code.

    Next, you can confirm your results by running another profiling session, and then using the Compare feature to compare the results.

    1. Run the launch configuration again.
    2. Wait until the application terminates.
    3. In the Profiler Sessions view, right-click on a session and select Compare.

      The IDE opens a view where you can see the total time compared to the other session time with the percentage of improvements (a green arrow pointing downward).

  9. Return to your normal development cycle by disabling the Application Profiler tool in the launch configuration.

    Note: There's no need to change your compile options.

Profiling a running process for an existing project

You can profile an application to capture performance information for an existing project.

Before you start:

The process must be running on the target with profiling enabled.

To profile a process from an existing QNX C/C++ project that's already running on your target:

  1. While the application is running, open the Launch Configurations dialog by choosing Run-->Profile... from the menu.
  2. Select C/C++ QNX Attach to Remote Process via QConn (IP) from the list on the left.
  3. Click the New button to create a new attach-to-process configuration.
  4. Configure things as you normally would for launching the application with debugging.
  5. On the Tools tab, click Add/Delete Tool.... The Tools Selection dialog is shown.
  6. Select the Application Profiler tool, then click OK. The Application Profiler tab is displayed on the launcher.
  7. Select Switch to this tool's perspective on launch.
  8. Click Apply, and then click Debug. The Select Process dialog shows all of the currently running processes.
  9. Select the running process you want to profile, then click OK. Now, you can begin to analyze the profiler data.

Using postmortem profiling for Call Count and Sampling

You can change the configuration options so that profiling is done by code linked into the process. After the process exits normally (without error), the function information (such as call counts, callers, and statistics) is written to a file that you can then load into the IDE.

To configure postmortem profiling:

  1. In the C/C++ Projects view, right-click your project and select Properties.
  2. In the left pane, select QNX C/C++ project.
  3. In the right pane, select the Options tab.
  4. Select Build for Profiling (Call Count).
  5. Select the Build Variants tab and select the Debug variant for your target(s).
  6. Click OK.
  7. When prompted, click Yes to rebuild your project.
  8. Create a launch configuration for a debuggable executable.
  9. Select the Environment tab.

    Profiling information is written to a file in the location you specify with the PROFDIR environment variable. If you don't set PROFDIR, the information is written to a file called gmon.out in the directory the process was run from.

  10. In the Name field, type PROFDIR.
  11. In the Value field, type a valid path to a directory on your target machine (e.g., /tmp).
  12. Click OK.
  13. Run the program.
  14. When the execution finishes, import a data file, such as gmon.out, by doing the following:
    1. Select Window-->Show View-->Other-->QNX Targets-->Target File System Navigator.
    2. In the Select target folder dialog, select the project related to your program.
    3. Click OK.
  15. In the C/C++ Projects View, right-click the imported file and rename it (e.g., to gmon.out).
  16. To start a postmortem profiling session, do the following:
    1. In the C/C++ Projects View, right-click on the file gmon.out and select the Import/Open action in the QNX Application Profiler.
    2. In the Import from gmon.out file window, browse to set the location of the executable file.
    3. Click Finish.

Now, you can begin to analyze the profiler data.

Postmortem profiling

When it's not possible to run an application from the IDE, but you can recompile the application, run it on a target, and transfer the results back to the host machine, you can use postmortem profiling and load the results using the Import wizard.

To profile the application, follow these steps:

  1. Enable binary instrumentation for profiling (see "Building a program for profiling").
  2. Recompile the application and transfer the binary to a target machine.

    Next, create a profiler session by importing profiler data. Ensure that you compile the binary with instrumentation enabled.

  3. Run the instrumented binary on the target with data collection enabled.
  4. Transfer the output file to the host machine.
  5. Open the Application Profiler perspective.
  6. In the Profiler Sessions view, click the Import Application Profiler Session icon to import the data:

    Icon: Importing an Application Profiler session

    The Application Profiler Import wizard opens.

    Importing Application Profiler data

  7. Select a file to import, and then click Next.
  8. Select the name of a session that you want to import.
  9. Click Finish.

    The IDE creates a new Application Profiler session and populates both the session and the Execution Time view with the imported data.

Now, the Application Profiler session is ready to use.

Running an instrumented binary with profiling from a command prompt (Function Instrumentation mode)

To run an instrumented binary with profiling from the command prompt:

  1. To start the Application Profiler immediately after the application starts, set the environment variable QPROF_START:

    	QPROF_START=1

  2. To redirect the gmon output to a file, set the environment variable QPROF_FILE:

    	QPROF_FILE=/tmp/myapp.ptrace

  3. To change to kernel trace logging, set the environment variable QPROF_KERNEL_TRACE=1.
  4. To include the shared library path used for profiling, set the environment variable LD_LIBRARY_PATH:

    	LD_LIBRARY_PATH=.../profiling_lib:$LD_LIBRARY_PATH

  5. To run the application with these settings, enter the following:

    	QPROF_START=1 QPROF_FILE=/tmp/myapp.ptrace \
    	LD_LIBRARY_PATH=.../profiling_lib:$LD_LIBRARY_PATH  ./myapp

Taking a snapshot of a profiling session

A snapshot of a profiling session provides you with a record of the current state of the session data from the moment you select the capture option. You can then use the snapshot to look for differences in CPU time between the time of the snapshot and the running time of the profiling session that followed.

To take a snapshot of a profiling session, follow these steps:

  1. Prepare your projects and launch configuration for an Application Profiler run. For information, see "Creating an Application Profiler session."
  2. Launch the application.
  3. In the Execution Time view, while the program is being profiled, click the Take Snapshot and Watch Difference button.

The snapshot capture freezes the current state of the Application Profiler data; meanwhile the actual profile session data keeps changing. Now, you can begin to analyze the profiler data to compare the snapshot data against the changing data.

Comparing profiles

When you finish optimizing, it's useful to see what progress you've made. The comparison mode lets you easily see the difference between two profile sessions. You can continue to view data as a Call Tree or a Table, but instead of absolute time values, you see time differences.

For example, you can compare two profiles to evaluate results before and after function optimization. In Compare mode, each column shows the change in values compared to the other session. Time and Count columns show the new value minus the old value. If there's no new value match for an item, its old value is used. If no old value match exists, the item will have a "+" indicator beside the new value.
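The comparison rules above can be expressed as a short sketch. The session contents and function names are hypothetical, and the real view applies the same idea to every Time and Count column:

```python
def compare_sessions(old, new):
    """Return {name: (value, marker)} following the Compare mode rules."""
    result = {}
    for name in sorted(set(old) | set(new)):
        if name in old and name in new:
            result[name] = (new[name] - old[name], "")  # new value minus old value
        elif name in old:
            result[name] = (old[name], "")              # no new match: old value is used
        else:
            result[name] = (new[name], "+")             # no old match: "+" indicator
    return result


# Hypothetical Time values (ms) per function, before and after an optimization
old_session = {"parse": 100.0, "render": 50.0, "legacy_init": 20.0}
new_session = {"parse": 60.0, "render": 55.0, "cache_lookup": 5.0}

diff = compare_sessions(old_session, new_session)
for name, (value, marker) in diff.items():
    print(f"{name}: {value}{marker}")
```

A negative value means the function took less time in the newer session, so in this made-up data `parse` improved by 40 ms while `render` got 5 ms slower.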


Comparing two profiler sessions.

To compare profiles, you must have at least two Application Profiler sessions.

To compare two sessions, follow these steps:

  1. In the Profiler Sessions view, select the two sessions that you want to compare.
  2. Right-click to open the context menu and select the Compare menu item.

    View the changes based on the results of the Comparison mode.

  3. The IDE shows colored arrows to indicate the old and new results for the selected sessions.
  4. Optional: You can use filters to remove insignificant results (<1% of difference), using Filter By:

    Filtering results

    1. From the Execution Time view toolbar menu, select Filters to open the Filter dialog.
    2. Specify any filtering criteria.
    3. Click OK.

After you profile

The Execution Time view shows the difference between two selected sessions, and you can observe these differences by:


Note: In the Profiler Sessions view, you can use the Take Snapshot feature to freeze the current state of the Application Profiler data while the actual session data keeps changing. The snapshot data remains frozen and can later be compared with the final results, or other snapshots of the same session. In the Execution Time view, this action also automatically switches to view a Comparison mode to dynamically show the updated difference between the current state and the snapshot.