Kernel Buffer Management

This chapter includes:

Instrumented kernel and kernel buffer management
Buffer specifications
Ring of buffers

Instrumented kernel and kernel buffer management

Figure: The kernel buffers, arranged as a ring of buffers.

As the instrumented kernel intercepts events, it stores them in a ring of buffers. As each buffer fills, the instrumented kernel raises an _NTO_HOOK_TRACE synthetic interrupt to notify the data-capture program that the buffer is ready to be read.

Buffer specifications

Each buffer has a fixed size and is divided into a fixed number of slots:

Parameter	Value
Event buffer slots per buffer	1024
Event buffer slot size	16 bytes
Buffer size	16 KB

Some events are single buffer slot events (“simple events”) while others are multiple buffer slot events (“combine events”). In either case there is only one event, but the number of event buffer slots required to describe it may vary.

For details, see the Interpreting Trace Data chapter.

Ring of buffers

Ring buffer size

Although the size of each buffer is fixed, the maximum number of buffers used by a system is limited only by the amount of memory. (The tracelogger utility uses a default setting of 32 buffers, or 512 KB of memory.)
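The memory figure above follows directly from the buffer specifications. A minimal sketch of the arithmetic, where the constant and function names are illustrative and not taken from any kernel header:

```python
# Values from the buffer specifications in this chapter.
SLOTS_PER_BUFFER = 1024
SLOT_SIZE_BYTES = 16
BUFFER_SIZE_BYTES = SLOTS_PER_BUFFER * SLOT_SIZE_BYTES  # 16384 bytes = 16 KB


def ring_size_bytes(num_buffers):
    """Return the memory needed for a ring of num_buffers fixed-size buffers."""
    if num_buffers < 1:
        raise ValueError("a ring needs at least one buffer")
    return num_buffers * BUFFER_SIZE_BYTES


# The tracelogger default of 32 buffers:
print(ring_size_bytes(32) // 1024, "KB")  # prints "512 KB"
```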

The buffers share kernel memory with the application(s), and the kernel automatically allocates memory at the request of the data-capture utility. The kernel allocates the buffers in contiguous physical memory space. If the data-capture program requests a larger block than is available contiguously, the instrumented kernel returns an error message.
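The contiguity requirement means that total free memory isn't the deciding factor; the largest single free region is. The following toy model illustrates the idea. It isn't how the kernel's allocator works, and the list of free-region sizes is a hypothetical input:

```python
BUFFER_SIZE_BYTES = 16 * 1024  # fixed buffer size from the specifications


def can_allocate_ring(free_regions, num_buffers):
    """Return True if one free region can hold the whole ring contiguously.

    free_regions is a hypothetical list of free-region sizes in bytes.
    """
    needed = num_buffers * BUFFER_SIZE_BYTES
    return any(region >= needed for region in free_regions)


# 640 KB free in total, but split into two regions (hypothetical values).
free = [256 * 1024, 384 * 1024]
print(can_allocate_ring(free, 32))  # needs 512 KB in one piece: False
print(can_allocate_ring(free, 16))  # needs 256 KB in one piece: True
```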

For practical purposes, the stream of events the instrumented kernel generates is unbounded. Unless you filter severely or log for only a few seconds, the instrumented kernel will probably exhaust the ring of buffers, no matter how large it is. To allow the instrumented kernel to continue logging indefinitely, the data-capture program must continuously empty the buffers.
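To get a feel for how quickly a ring can be exhausted, consider a rough estimate. This sketch assumes constant fill and drain rates, which a real system won't have; the event rates in the example are made-up numbers, not measurements:

```python
import math

SLOTS_PER_BUFFER = 1024  # from the buffer specifications


def seconds_until_ring_full(num_buffers, slots_per_second,
                            drained_buffers_per_second=0.0):
    """Estimate how long the ring lasts at constant (assumed) rates.

    Returns math.inf when the capture program drains buffers at least
    as fast as the kernel fills them.
    """
    filled_buffers_per_second = slots_per_second / SLOTS_PER_BUFFER
    net_rate = filled_buffers_per_second - drained_buffers_per_second
    if net_rate <= 0:
        return math.inf
    return num_buffers / net_rate


# Hypothetical load: 102400 event buffer slots per second (100 buffers/s).
print(seconds_until_ring_full(32, 102400))        # no draining: 0.32
print(seconds_until_ring_full(32, 102400, 90.0))  # slow draining: 3.2
print(seconds_until_ring_full(32, 102400, 100.0)) # keeps up: inf
```

Under these assumptions, adding buffers only delays exhaustion; the ring lasts indefinitely only when the drain rate matches or exceeds the fill rate.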

Full buffers and the high-water mark

As each buffer becomes full, the instrumented kernel raises an _NTO_HOOK_TRACE synthetic interrupt to notify the data-capture program to save the buffer. Because the buffer size is fixed, the kernel sends only the buffer address; the length is constant.

The instrumented kernel can't flush a buffer or change buffers within an interrupt. If a buffer becomes 100% full before the interrupt has been handled, some of the events may be lost. To guard against this, the instrumented kernel requests a buffer flush at the high-water mark, before the buffer is completely full.

The high-water mark is set at an efficient, yet conservative, level:

around 70% (_TRACEBUF_MAX_EVENTS) for linear mode
around 90% (_TRACEBUF_MAX_EVENTS_RING) for ring mode

Most interrupt routines require fewer than 300 event buffer slots (approximately 30% of 1024 event buffer slots), so there's virtually no chance that any events will be lost. (The few interrupt routines that run for an extremely long time should include a manual buffer-flush request in their code.)

Therefore, in a normal system in linear mode, the kernel fills about 715 of the 1024 event buffer slots in a buffer (roughly 70%) before notifying the capture program.
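The headroom argument can be checked with simple arithmetic. In this sketch, 715 is the approximate linear-mode figure quoted in the text; it isn't read from a header, and the actual value of _TRACEBUF_MAX_EVENTS may differ:

```python
SLOTS_PER_BUFFER = 1024
HIGH_WATER_SLOTS_LINEAR = 715   # approximate figure from the text (assumption)
TYPICAL_INTERRUPT_SLOTS = 300   # upper bound for most interrupt routines


def headroom_slots(high_water_slots):
    """Slots still free in a buffer when the flush request is raised."""
    return SLOTS_PER_BUFFER - high_water_slots


fraction = HIGH_WATER_SLOTS_LINEAR / SLOTS_PER_BUFFER
print(f"{fraction:.1%}")                        # prints "69.8%"
print(headroom_slots(HIGH_WATER_SLOTS_LINEAR))  # prints 309

# 309 free slots cover a routine that needs fewer than 300.
print(headroom_slots(HIGH_WATER_SLOTS_LINEAR) >= TYPICAL_INTERRUPT_SLOTS)  # True
```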

Buffer overruns

The instrumented kernel is both the very core of the system and the controller of the event buffers.

When the instrumented kernel is busy, it logs more events. The buffers fill more quickly, and the instrumented kernel requests that the buffers be flushed more often. The data-capture program handles each flush request; the instrumented kernel switches to the next buffer and continues logging events. In an extremely busy system, the data-capture program may not be able to flush the buffers as quickly as the instrumented kernel fills them.

In a three-buffer scenario, the instrumented kernel fills buffer 1 and raises an _NTO_HOOK_TRACE synthetic interrupt to notify the data-capture program that the buffer is full. The data-capture program takes “ownership” of buffer 1 and the instrumented kernel marks the buffer as “busy/in use.” If, say, the file is being saved to a hard drive that happens to be busy, then the instrumented kernel may fill buffer 2 and buffer 3 before the data-capture program can release buffer 1. In this case, the instrumented kernel skips buffer 1, which is still in use, and writes to buffer 2. The previous contents of buffer 2 are overwritten, and the timestamps on the event buffer slots will show a discontinuity.
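The three-buffer scenario can be replayed with a small simulation. This is a toy model of the behavior described above, not the kernel's implementation: it assumes the capture program takes the first filled buffer and never releases it during the run, and all names are illustrative.

```python
def simulate_overrun(num_buffers=3, total_fills=4):
    """Replay the overrun scenario with one stalled capture program.

    Returns (contents, lost): the fill number held by each buffer at the
    end, and the fill numbers overwritten before they could be saved.
    Requires num_buffers >= 2 so the kernel always has a free buffer.
    """
    contents = [None] * num_buffers  # fill number stored in each buffer
    busy = set()                     # buffers owned by the capture program
    lost = []
    index = 0
    for fill in range(1, total_fills + 1):
        # The kernel skips any buffer the capture program still owns.
        while index in busy:
            index = (index + 1) % num_buffers
        if contents[index] is not None:
            lost.append(contents[index])  # unsaved data is overwritten
        contents[index] = fill
        if fill == 1:
            busy.add(index)  # capture program takes buffer 1 and stalls
        index = (index + 1) % num_buffers
    return contents, lost


contents, lost = simulate_overrun()
print(contents)  # [1, 4, 3]: buffer 2 now holds the fourth fill
print(lost)      # [2]: the second fill was overwritten
```

The missing second fill is what appears in the saved trace as a discontinuity in the timestamps.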

For more on buffer overruns, see the Tutorials chapter.