Events and the Kernel

This chapter includes:

What generates events

The QNX Neutrino microkernel generates events for more than just system calls. The following are some of the activities that generate events:

In addition, the instrumented kernel also inserts “artificial” events for:

Also, single kernel calls or system activities may actually generate more than one event.

Generating events: a typical scenario

Processes that are running on QNX Neutrino can run multiple threads. Having more than one thread increases the level of complexity—the OS must handle threads of differing priorities competing with each other.

Multithreaded example

In our example we'll use two threads:

Thread Priority
A High
B Low

Now we'll watch them run, assuming both start at the same time:


Note: When logging starts, the instrumented kernel sends information about each thread. Existing processes will appear to be created during this procedure.

Time Thread Action Explanation
t1 A Create Thread is created.
t2 A Block The thread is waiting for, say, I/O; it can't continue without it.
t3 B Create Rather than sit idle, the kernel runs next highest priority thread.
t4 B Kernel Call Thread B is working.
t4.5 N/A N/A I/O completed; Thread A is ready to run.
t5 B Block Thread A is now ready to run—it preempts thread B.
t6 A Run Thread A resumes.
t7 A Dies Its task complete, the thread terminates.
t8 B Runs Thread B continues from where it left off.
t9 ... ... ...

Thread context-switch time

Threads don't switch instantaneously—after one thread blocks or yields to another, the kernel must save the settings before running another thread. The time to save this state and restore another is known as thread context-switch time. This context-switch time between threads is small, but important.


Timing


Thread context switching.

In some cases, two or more threads may switch back and forth without actually accomplishing much. This is akin to two overly polite people each offering to let the other pass through a narrow door first— neither of them gets to where they're going on time (two aggressive people encounter a similar problem). This type of problem is exactly what the SAT can quickly and easily highlight. By showing the context-switch operations in conjunction with thread state transitions, you can quickly see why otherwise fast systems seem to “crawl.”

Restarting threads

In order to achieve maximum responsiveness, much of the QNX Neutrino microkernel is fully preemptible. In some cases, this means that when a thread is interrupted in a kernel call, it won't be able to restart exactly where it began. Instead, the kernel call will be restarted—it “rewinds” itself. The SAT tries to hide the spurious calls but may not succeed in suppressing them all. As a result, it's possible to see several events generated from a specific thread that has been preempted. If this occurs, the last event is the actual one.

Simple and combine events

Most events can be described in a single event buffer slot; we call these simple events. When there's too much information to describe the event in a single buffer slot, the event is described in multiple event buffer slots; we call this a combine event. The event buffer slots all look the same, so there's no need for the data-capture program to distinguish between them.

For more information about simple events and combine events, see the Interpreting Trace Data chapter.

Fast and wide modes

You can gather data for events in the following modes:

Wide mode
The instrumented kernel uses as many buffer slots as are necessary to fully log the event. The amount of space is theoretically unlimited and can span several kilobytes for a single event. Most of the time, it doesn't exceed four 16-byte spaces.
Fast mode
The instrumented kernel uses only one buffer slot per event.

In general, wide mode generates several times more data than fast mode.


Note: Fast mode doesn't simply clip the tail end of the event data that you'd get in wide mode; fast mode summarizes the most important aspects of the event in a single buffer slot. Thus, the first element of an event in wide mode might not be the same as the same event in fast mode.

You can set fast and wide mode for all classes, specific classes, and even specific events in a class; some can be fast while others are wide. We'll describe how to set this in the Capturing Trace Data chapter.

For the specific output differences between fast and wide mode, see the Current Trace Events and Data appendix.

Classes and events

There can be a lot of events in even a small trace, so they're organized into classes to make them easier for you to manage:

(The <sys/trace.h> header file also defines an _NTO_TRACE_EMPTY class, but it's a placeholder and isn't currently used.)

The sections that follow list the events for each class, along with a description of when the events are emitted, as well as the labels that traceprinter and the IDE use to identify the events.

For information about the data for each event, see the Current Trace Events and Data appendix.

Communication class: _NTO_TRACE_COMM

The _NTO_TRACE_COMM class includes events related to communication:

Event traceprinter label IDE label Emitted when:
_NTO_TRACE_COMM_ERROR MSG_ERROR Error A client is unblocked because of a call to MsgError()
_NTO_TRACE_COMM_REPLY REPLY_MESSAGE Reply A reply is sent
_NTO_TRACE_COMM_RMSG REC_MESSAGE Receive Message A message is received
_NTO_TRACE_COMM_RPULSE REC_PULSE Receive Pulse A pulse is received
_NTO_TRACE_COMM_SIGNAL SIGNAL Signal A signal is received
_NTO_TRACE_COMM_SMSG SND_MESSAGE Send Message A message is sent
_NTO_TRACE_COMM_SPULSE SND_PULSE Send Pulse A pulse is sent
_NTO_TRACE_COMM_SPULSE_DEA SND_PULSE_DEA Death Pulse A _PULSE_CODE_COIDDEATH pulse is sent
_NTO_TRACE_COMM_SPULSE_DIS SND_PULSE_DIS Disconnect Pulse A _PULSE_CODE_DISCONNECT pulse is sent
_NTO_TRACE_COMM_SPULSE_EXE SND_PULSE_EXE Sigevent Pulse A SIGEV_PULSE is sent
_NTO_TRACE_COMM_SPULSE_QUN SND_PULSE_QUN QNet Unblock Pulse A _PULSE_CODE_NET_UNBLOCK pulse is sent
_NTO_TRACE_COMM_SPULSE_UN SND_PULSE_UN Unblock Pulse A _PULSE_CODE_UNBLOCK pulse is sent

Control class: _NTO_TRACE_CONTROL

The _NTO_TRACE_CONTROL class includes events related to the control of tracing itself:

Event traceprinter label IDE label Emitted when:
_NTO_TRACE_CONTROLBUFFER BUFFER Buffer The instrumented kernel starts filling a new buffer
_NTO_TRACE_CONTROLTIME TIME Time The 32 Least Significant Bits (LSB) part of the 64-bit clock rolls over, or the kernel emits an _NTO_TRACE_CONTROLBUFFER event

The purpose of emitting _NTO_TRACE_CONTROLBUFFER events is to help tracelogger and the IDE track the buffers and determine if any buffers have been dropped. The instrumented kernel emits an _NTO_TRACE_CONTROLTIME event at the same time to keep the IDE in sync (in case a dropped buffer contained an _NTO_TRACE_CONTROLTIME event for a rollover of the clock).

Interrupt classes: _NTO_TRACE_INTENTER, _NTO_TRACE_INTEXIT, _NTO_TRACE_INT_HANDLER_ENTER, and _NTO_TRACE_INT_HANDLER_EXIT

These classes track interrupts:

Class traceprinter label IDE label Emitted when:
_NTO_TRACE_INTENTER INT_ENTR Entry Overall processing of an interrupt begins
_NTO_TRACE_INTEXIT INT_EXIT Exit Overall processing of an interrupt ends
_NTO_TRACE_INT_HANDLER_ENTER INT_HANDLER_ENTR Handler Entry Entering an interrupt handler
_NTO_TRACE_INT_HANDLER_EXIT INT_HANDLER_EXIT Handler Exit Exiting an interrupt handler

The expected sequence is:

INTR_ENTER
    INTR_HANDLER_ENTER
    INTR_HANDLER_EXIT
    INTR_HANDLER_ENTER
    INTR_HANDLER_EXIT
INT_EXIT

_NTO_TRACE_INT is a pseudo-class that comprises all of the interrupt classes.

The “event” is an interrupt vector number, in the range from _NTO_TRACE_INTFIRST through _NTO_TRACE_INTLAST.

Kernel-call classes: _NTO_TRACE_KERCALLENTER, _NTO_TRACE_KERCALLEXIT, and _NTO_TRACE_KERCALLINT

These classes track kernel calls:

_NTO_TRACE_KERCALL is a pseudo-class that comprises all these classes.

The traceprinter labels for these classes are KER_CALL, KER_EXIT, and INT_CALL, followed by an uppercase version of the kernel call; the IDE labels consist of the kernel call, followed by Enter, Exit, or INT.

Most of the events in these classes correspond in a fairly obvious way to the kernel calls; some correspond to internal functions:

Event Kernel call
__KER_BAD N/A
__KER_CHANCON_ATTR ChannelConnectAttr()
__KER_CHANNEL_CREATE ChannelCreate()
__KER_CHANNEL_DESTROY ChannelDestroy()
__KER_CLOCK_ADJUST ClockAdjust()
__KER_CLOCK_ID ClockId()
__KER_CLOCK_PERIOD ClockPeriod()
__KER_CLOCK_TIME ClockTime()
__KER_CONNECT_ATTACH ConnectAttach()
__KER_CONNECT_CLIENT_INFO ConnectClientInfo()
__KER_CONNECT_DETACH ConnectDetach()
__KER_CONNECT_FLAGS ConnectFlags()
__KER_CONNECT_SERVER_INFO ConnectServerInfo()
__KER_INTERRUPT_ATTACH InterruptAttach()
__KER_INTERRUPT_DETACH InterruptDetach()
__KER_INTERRUPT_DETACH_FUNC N/A
__KER_INTERRUPT_MASK InterruptMask()
__KER_INTERRUPT_UNMASK InterruptUnmask()
__KER_INTERRUPT_WAIT InterruptWait()
__KER_MSG_CURRENT MsgCurrent()
__KER_MSG_DELIVER_EVENT MsgDeliverEvent()
__KER_MSG_ERROR MsgError()
__KER_MSG_INFO MsgInfo()
__KER_MSG_KEYDATA MsgKeyData()
__KER_MSG_READV MsgRead(), MsgReadv()
__KER_MSG_RECEIVEPULSEV MsgReceivePulse(), MsgReceivePulsev()
__KER_MSG_RECEIVEV MsgReceive(), MsgReceivev()
__KER_MSG_REPLYV MsgReply(), MsgReplyv()
__KER_MSG_SENDV MsgSend(), MsgSendv(), and MsgSendvs()
__KER_MSG_SENDVNC MsgSendnc(), MsgSendvnc(), and MsgSendvsnc()
__KER_MSG_SEND_PULSE MsgSendPulse()
__KER_MSG_VERIFY_EVENT MsgVerifyEvent()
__KER_MSG_WRITEV MsgWrite(), MsgWritev()
__KER_NET_CRED NetCred()
__KER_NET_INFOSCOID NetInfoScoid()
__KER_NET_SIGNAL_KILL NetSignalKill()
__KER_NET_UNBLOCK NetUnblock()
__KER_NET_VTID NetVtid()
__KER_NOP N/A; forces a thread into the kernel so that scheduling can take place
__KER_RING0 (not generated in QNX Neutrino 6.3.0 or later) __Ring0()
__KER_SCHED_CTL SchedCtl()
__KER_SCHED_GET SchedGet()
__KER_SCHED_INFO SchedInfo()
__KER_SCHED_SET SchedSet()
__KER_SCHED_YIELD SchedYield()
__KER_SIGNAL_ACTION SignalAction()
__KER_SIGNAL_FAULT N/A
__KER_SIGNAL_KILL SignalKill()
__KER_SIGNAL_PROCMASK SignalProcmask()
__KER_SIGNAL_RETURN SignalReturn()
__KER_SIGNAL_SUSPEND SignalSuspend()
__KER_SIGNAL_WAITINFO SignalWaitInfo()
__KER_SYNC_CONDVAR_SIGNAL SyncCondvarSignal()
__KER_SYNC_CONDVAR_WAIT SyncCondvarWait()
__KER_SYNC_CREATE SyncCreate(), SyncTypeCreate()
__KER_SYNC_CTL SyncCtl()
__KER_SYNC_DESTROY SyncDestroy()
__KER_SYNC_MUTEX_LOCK SyncMutexLock()
__KER_SYNC_MUTEX_REVIVE SyncMutexRevive()
__KER_SYNC_MUTEX_UNLOCK SyncMutexUnlock()
__KER_SYNC_SEM_POST SyncSemPost()
__KER_SYNC_SEM_WAIT SyncSemWait()
__KER_SYS_CPUPAGE_GET N/A
__KER_SYS_CPUPAGE_SET N/A
__KER_THREAD_CANCEL ThreadCancel()
__KER_THREAD_CREATE ThreadCreate()
__KER_THREAD_CTL ThreadCtl()
__KER_THREAD_DESTROY ThreadDestroy()
__KER_THREAD_DESTROYALL N/A
__KER_THREAD_DETACH ThreadDetach()
__KER_THREAD_JOIN ThreadJoin()
__KER_TIMER_ALARM TimerAlarm()
__KER_TIMER_CREATE TimerCreate()
__KER_TIMER_DESTROY TimerDestroy()
__KER_TIMER_INFO TimerInfo()
__KER_TIMER_SETTIME TimerSettime()
__KER_TIMER_TIMEOUT TimerTimeout()
__KER_TRACE_EVENT TraceEvent()

Process class: _NTO_TRACE_PROCESS

The _NTO_TRACE_PROCESS class includes events related to the creation and destruction of processes:

Event traceprinter label IDE label Emitted when:
_NTO_TRACE_PROCCREATE PROCCREATE Create Process A process is created
_NTO_TRACE_PROCCREATE_NAME PROCCREATE_NAME Create Process Name A newly created process is given a name.
_NTO_TRACE_PROCDESTROY PROCDESTROY Destroy Process A process is destroyed
_NTO_TRACE_PROCDESTROY_NAME (Not currently used)
_NTO_TRACE_PROCTHREAD_NAME PROCTHREAD_NAME Thread Name A name is assigned to a thread

System class: _NTO_TRACE_SYSTEM

The _NTO_TRACE_SYSTEM class includes events related to the system as a whole:

Event traceprinter label IDE label Emitted when:
_NTO_TRACE_SYS_ADDRESS ADDRESS Address A breakpoint is hit
_NTO_TRACE_SYS_APS_BNKR APS_BANKRUPTCY APS Bankruptcy An adaptive partition exceeded its critical budget
_NTO_TRACE_SYS_APS_BUDGETS APS_NEW_BUDGET APS Budgets SchedCtl() is called with a command of SCHED_APS_CREATE_PARTITION or SCHED_APS_MODIFY_PARTITION. Also emitted automatically when the adaptive partitioning scheduler clears a critical budget as part of handling a bankruptcy.
_NTO_TRACE_SYS_APS_NAME APS_NAME APS Name SchedCtl() is called with a command of SCHED_APS_CREATE_PARTITION
_NTO_TRACE_SYS_COMPACTION COMPACTION Compaction The memory defragmentation compaction_minimal algorithm is triggered (see Defragmenting physical memory in the Process Manager chapter of the System Architecture guide)
_NTO_TRACE_SYS_FUNC_ENTER FUNC_ENTER Function Enter A function that's instrumented for profiling is entered
_NTO_TRACE_SYS_FUNC_EXIT FUNC_EXIT Function Exit A function that's instrumented for profiling is exited
_NTO_TRACE_SYS_MAPNAME MAPNAME MMap Name dlopen() is called
_NTO_TRACE_SYS_MMAP MMAP MMap mmap() or mmap64() is called
_NTO_TRACE_SYS_MUNMAP MUNMAP MMUnmap munmap() is called
_NTO_TRACE_SYS_PATHMGR PATHMGR_OPEN Path Manager An operation involving a path name — such as open() — that's routed via the libc connect function occurs. The connect function sends a message to procnto to resolve the path and find the set of resource managers that could potentially match the path. It's upon receiving this message that procnto emits this event.
_NTO_TRACE_SYS_SLOG SLOG System Log A message is written to the system log

You can use the following convenience functions to insert certain System events into the trace data:

trace_func_enter()
Insert an _NTO_TRACE_SYS_FUNC_ENTER event for a function
trace_func_exit()
Insert an _NTO_TRACE_SYS_FUNC_EXIT event for a function
trace_here()
Insert an _NTO_TRACE_SYS_ADDRESS event for the current address

Thread class: _NTO_TRACE_THREAD

The _NTO_TRACE_THREAD class includes events related to state changes for threads:

Event traceprinter label IDE label Emitted when a thread:
_NTO_TRACE_THCONDVAR THCONDVAR Condvar Enters the CONDVAR state
_NTO_TRACE_THCREATE THCREATE Create Thread Is created
_NTO_TRACE_THDEAD THDEAD Dead Enters the DEAD state
_NTO_TRACE_THDESTROY THDESTROY Destroy Thread Is destroyed
_NTO_TRACE_THINTR THINTR Interrupt Enters the INTERRUPT state
_NTO_TRACE_THJOIN THJOIN Join Enters the JOIN state
_NTO_TRACE_THMUTEX THMUTEX Mutex Enters the MUTEX state
_NTO_TRACE_THNANOSLEEP THNANOSLEEP NanoSleep Enters the NANOSLEEP state
_NTO_TRACE_THNET_REPLY THNET_REPLY NetReply Enters the NET_REPLY state
_NTO_TRACE_THNET_SEND THNET_SEND NetSend Enters the NET_SEND state
_NTO_TRACE_THREADY THREADY Ready Enters the READY state
_NTO_TRACE_THRECEIVE THRECEIVE Receive Enters the RECEIVE state
_NTO_TRACE_THREPLY THREPLY Reply Enters the REPLY state
_NTO_TRACE_THRUNNING THRUNNING Running Enters the RUNNING state
_NTO_TRACE_THSEM THSEM Semaphore Enters the SEM state
_NTO_TRACE_THSEND THSEND Send Enters the SEND state
_NTO_TRACE_THSIGSUSPEND THSIGSUSPEND SigSuspend Enters the SIGSUSPEND state
_NTO_TRACE_THSIGWAITINFO THSIGWAITINFO SigWaitInfo Enters the SIGWAITINFO state
_NTO_TRACE_THSTACK THSTACK Stack Enters the STACK state
_NTO_TRACE_THSTOPPED THSTOPPED Stopped Enters the STOPPED state
_NTO_TRACE_THWAITCTX THWAITCTX WaitCtx Enters the WAITCTX state
_NTO_TRACE_THWAITPAGE THWAITPAGE WaitPage Enters the WAITPAGE state
_NTO_TRACE_THWAITTHREAD THWAITTHREAD WaitThread Enters the WAITTHREAD state

If your system includes the adaptive partitioning scheduler module, the data for these events includes the partition ID and scheduling flags. For more information about adaptive partitioning, see the Adaptive Partitioning User's Guide.

For more information about thread states, see Thread life cycle in the QNX Neutrino Microkernel chapter of the System Architecture guide.

User class: _NTO_TRACE_USER

The _NTO_TRACE_USER class includes custom events that your program creates by calling one of the following convenience functions:

trace_logb()
Insert a user combine trace event
trace_logf()
Insert a user string trace event
trace_logi()
Insert a user simple trace event
trace_nlogf()
Insert a user string trace event, specifying a maximum string length
trace_vnlogf()
Insert a user string trace event, using a variable argument list

or by calling TraceEvent() directly, with one of the following commands:

The event must be in the range from _NTO_TRACE_USERFIRST through _NTO_TRACE_USERLAST, but you can decide what each event means.

The traceprinter label for these events is USREVENT; the IDE label is User Event. In both cases, this label is followed by the event type, expressed as an integer.

Virtual thread class: _NTO_TRACE_VTHREAD

The _NTO_TRACE_VTHREAD class includes events related to state changes for virtual threads, special objects related to Transparent Distributed Processing (TDP) over Qnet. The kernel often keeps pointers from different data structures to relevant threads. When those threads are off-node via Qnet, there isn't a local thread object to represent them, so the kernel creates a virtual thread object.

The events for virtual threads are similar to those for normal threads, but virtual threads don't go through the same set of state transitions that normal threads do:

Event traceprinter label IDE label Emitted when a virtual thread:
_NTO_TRACE_VTHCONDVAR VTHCONDVAR VCondvar Enters the CONDVAR state
_NTO_TRACE_VTHCREATE VTHCREATE Create VThread Is created
_NTO_TRACE_VTHDEAD VTHDEAD VDead Enters the DEAD state
_NTO_TRACE_VTHDESTROY VTHDESTROY Destroy VThread Is destroyed
_NTO_TRACE_VTHINTR VTHINTR VInterrupt Enters the INTERRUPT state
_NTO_TRACE_VTHJOIN VTHJOIN VJoin Enters the JOIN state
_NTO_TRACE_VTHMUTEX VTHMUTEX VMutex Enters the MUTEX state
_NTO_TRACE_VTHNANOSLEEP VTHNANOSLEEP VNanosleep Enters the NANOSLEEP state
_NTO_TRACE_VTHNET_REPLY VTHNET_REPLY VNetReply Enters the NET_REPLY state
_NTO_TRACE_VTHNET_SEND VTHNET_SEND VNetSend Enters the NET_SEND state
_NTO_TRACE_VTHREADY VTHREADY VReady Enters the READY state
_NTO_TRACE_VTHRECEIVE VTHRECEIVE VReceive Enters the RECEIVE state
_NTO_TRACE_VTHREPLY VTHREPLY VReply Enters the REPLY state
_NTO_TRACE_VTHRUNNING VTHRUNNING VRunning Enters the RUNNING state
_NTO_TRACE_VTHSEM VTHSEM VSemaphore Enters the SEM state
_NTO_TRACE_VTHSEND VTHSEND VSend Enters the SEND state
_NTO_TRACE_VTHSIGSUSPEND VTHSIGSUSPEND VSigSuspend Enters the SIGSUSPEND state
_NTO_TRACE_VTHSIGWAITINFO VTHSIGWAITINFO VSigWaitInfo Enters the SIGWAITINFO state
_NTO_TRACE_VTHSTACK VTHSTACK VStack Enters the STACK state
_NTO_TRACE_VTHSTOPPED VTHSTOPPED VStopped Enters the STOPPED state
_NTO_TRACE_VTHWAITCTX VTHWAITCTX VWaitCtx Enters the WAITCTX state
_NTO_TRACE_VTHWAITPAGE VTHWAITPAGE VWaitPage Enters the WAITPAGE state
_NTO_TRACE_VTHWAITTHREAD VTHWAITTHREAD VWaitThread Enters the WAITTHREAD state