Memory leaks

The ability of the librcheck library to keep full allocation chains of all the heap memory allocated by the program—as opposed to just accounting for some heap buffers—allows heap memory leaks to be detected by the library in response to requests by the program. Leaks can be detected in the program by performing tracing on the entire heap.

Tracing is an operation that attempts to determine whether a heap object is reachable by the program. In order to be reachable, a heap buffer must be available either directly or indirectly from a pointer in a global variable or on the stack of one of the threads. If this isn't the case, then the heap buffer is no longer visible to the program and can't be accessed without constructing a pointer that refers to the heap buffer—presumably by obtaining it from a persistent store such as a file or a shared memory object.

The set of global variables and stack for all threads is called the root set. Because the root set must be stable for tracing to yield valid results, tracing requires that all threads other than the one performing the trace be suspended while the trace is performed.

Tracing operates by constructing a reachability graph of the entire heap. It begins with a root set scan that determines the root set comprising the initial state of the reachability graph. The roots that can be found by tracing are:

data of the program
uninitialized data of the program
initialized and uninitialized data of any shared objects dynamically linked into the program
used portion of the stacks of all active threads in the program

Once the root set scan is complete, tracing initiates a mark operation for each element of the root set. The mark operation looks at a node of the reachability graph, scanning the memory space represented by the node, looking for pointers into the heap. Since the program may not actually have a pointer directly to the start of the buffer—but to some interior location—and it isn't possible to know which part of the root set or a heap object actually contains a pointer, tracing uses specialized techniques for coping with ambiguous roots. The approach taken is described as a conservative pointer estimation since it assumes that any word-sized object on a word-aligned memory cell that could point to a heap buffer or the interior of that heap buffer actually points to the heap buffer itself.

Using conservative pointer estimation for dealing with ambiguous roots, the mark operation finds all children of a node of the reachability graph. For each child in the heap that's found, it checks to see whether the heap buffer has been marked as referenced. If the buffer has been marked, the operation moves on to the next child. Otherwise, the trace marks the buffer, and then recursively initiates a mark operation on that heap buffer.

The tracing operation is complete when the reachability graph has been fully traversed. At this time every heap buffer that's reachable will have been marked, as could some buffers that aren't actually reachable, due to the conservative pointer estimation. Any heap buffer that hasn't been marked is definitely unreachable, constituting a memory leak. At the end of the tracing operation, all unmarked nodes can be reported as leaks.

Starting a trace and giving results

A program can cause a trace to be performed and memory leaks to be reported by calling the librcheck version of mallopt() with a command of MALLOC_DUMP_LEAKS:

mallopt( MALLOC_DUMP_LEAKS, 1);

The second argument is ignored.

Analyzing dumps

The dump of unreferenced buffers prints out one line of information for each unreferenced buffer. The information provided for a buffer includes:

the address of the buffer
the function that was used to allocate it (malloc(), calloc(), realloc())
the file that contained the allocation request, if available
the line number or return address of the call to the allocation function
the size of the allocated buffer

File and line information is available if the call to allocate the buffer was made using one of the library's debug interfaces. Otherwise, the return address of the call is reported in place of the line number. In some circumstances, no return address information is available. This usually indicates that the call was made from a function with no frame information, such as the system libraries. In such cases, the entry can usually be ignored and probably isn't a leak.

From the way tracing is performed, we can see that some leaks may escape detection and may not be reported in the output. This happens if the root set or a reachable buffer in the heap has something that looks like a pointer to the buffer.

Likewise, you should check each reported leak against the suspected code identified by the line or call return address information. If the code in question keeps interior pointers—pointers to a location inside the buffer, rather than the start of the buffer—the trace operation will likely fail to find a reference to the buffer. In this case, the buffer may well not be a leak. In other cases, there is almost certainly a memory leak.