Dynamic memory management

Let's examine how the heap is used for dynamically allocated memory.

You request memory buffers or blocks of a particular size from the runtime environment by using malloc(), realloc(), or calloc(), and you release them back to the runtime environment when you no longer need them by using free(). The C++ new and delete operators are built on top of malloc() and free(), so this discussion applies to them as well.
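
As a minimal sketch of this request and release cycle, the following standard C code allocates a buffer, grows it, and frees it. The function name alloc_cycle_demo() is hypothetical and exists only for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate, grow, and release buffers.
 * Returns 0 on success, -1 if an allocation fails or a check doesn't hold. */
int alloc_cycle_demo(void)
{
    /* calloc() returns zero-initialized memory, here for 4 ints. */
    int *values = calloc(4, sizeof *values);
    if (values == NULL) {
        return -1;
    }
    values[0] = 42;

    /* realloc() may move the block, so keep the old pointer until it succeeds. */
    int *grown = realloc(values, 8 * sizeof *grown);
    if (grown == NULL) {
        free(values);
        return -1;
    }
    values = grown;

    /* The original contents are preserved across realloc(). */
    int ok = (values[0] == 42 && values[1] == 0);

    /* malloc() returns uninitialized memory, so fill it before reading it. */
    char *text = malloc(16);
    if (text == NULL) {
        free(values);
        return -1;
    }
    strcpy(text, "heap");
    ok = ok && (strlen(text) == 4);

    /* Release every block once it's no longer needed. */
    free(text);
    free(values);
    return ok ? 0 : -1;
}
```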

The memory allocator satisfies your requests by managing a region of the program's memory known as the heap. The allocator tracks information about every block that it has allocated to your program, such as the size of the original block, so that it can make the memory available again for subsequent allocation requests. It usually keeps this information in a header that precedes the block itself in memory. When a block is released, the allocator places it on a list of available blocks called a free list.
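
To make the idea of a header and a free list concrete, here's a toy sketch built on top of malloc(). The toy_header layout and all of the toy_*() function names are assumptions made for illustration; the real allocator's header isn't described here and will differ.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header; the real allocator's layout is not shown in this text. */
typedef struct toy_header {
    size_t size;               /* size of the user block, in bytes */
    struct toy_header *next;   /* next entry on the free list once released */
} toy_header;

static toy_header *toy_free_list = NULL;

/* Allocate a user block with its header stored immediately before it. */
void *toy_alloc(size_t size)
{
    toy_header *hdr = malloc(sizeof *hdr + size);
    if (hdr == NULL) {
        return NULL;
    }
    hdr->size = size;
    hdr->next = NULL;
    return hdr + 1;            /* user memory starts right after the header */
}

/* Step back from the user pointer to the header to recover the size. */
size_t toy_block_size(const void *ptr)
{
    return ((const toy_header *)ptr - 1)->size;
}

/* "Release" a block by pushing it onto the free list for later reuse. */
void toy_release(void *ptr)
{
    toy_header *hdr = (toy_header *)ptr - 1;
    hdr->next = toy_free_list;
    toy_free_list = hdr;
}

/* Count the blocks currently on the free list. */
size_t toy_free_list_length(void)
{
    size_t count = 0;
    for (const toy_header *hdr = toy_free_list; hdr != NULL; hdr = hdr->next) {
        count++;
    }
    return count;
}

/* Hand every block on the free list back to the underlying allocator. */
void toy_drain(void)
{
    while (toy_free_list != NULL) {
        toy_header *hdr = toy_free_list;
        toy_free_list = hdr->next;
        free(hdr);
    }
}
```

Because the size lives in the header, a release operation needs only the user pointer; that's the same reason free() doesn't take a size argument.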

The runtime environment grows the size of the heap when it no longer has enough memory available to satisfy allocation requests, and it may return memory from the heap to the OS when the program releases memory.

The basic heap allocation mechanism is split into two pieces: a chunk-based small block allocator and a list-based large block allocator. By configuring specific parameters, you can select the sizes for the chunks in the small block allocator and also the boundary between the small and large allocators.

Arena allocations

Both the small and large block allocators allocate and deallocate memory from the OS in the form of chunks known as arenas, by calling mmap() and munmap(). By default, the arena size is 32 KB. It must be a multiple of 4 KB and must currently be less than 256 KB. If your program requests a block that's larger than an arena, the allocator gets a block whose size is a multiple of the arena size from the process manager, gives your program a block of the requested size, and puts any remaining memory on a free list.
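
The arithmetic behind these rules can be sketched as follows. The function and macro names are hypothetical, and the sketch ignores any per-block header overhead, so the real allocator's numbers may differ slightly.

```c
#include <stddef.h>

#define ARENA_GRANULARITY (4u * 1024u)    /* arena size must be a multiple of 4 KB */
#define ARENA_LIMIT       (256u * 1024u)  /* arena size must be less than 256 KB */
#define ARENA_DEFAULT     (32u * 1024u)   /* default arena size */

/* Returns 1 if arena_size meets the constraints described above, else 0. */
int arena_size_is_valid(size_t arena_size)
{
    return arena_size != 0
        && arena_size % ARENA_GRANULARITY == 0
        && arena_size < ARENA_LIMIT;
}

/* Smallest multiple of arena_size that can hold request bytes
 * (assumption: allocator overhead is ignored). */
size_t arena_mapping_size(size_t request, size_t arena_size)
{
    size_t arenas = (request + arena_size - 1) / arena_size;
    return arenas * arena_size;
}

/* Bytes left over for the free list after carving out the request. */
size_t arena_leftover(size_t request, size_t arena_size)
{
    return arena_mapping_size(request, arena_size) - request;
}
```

For example, with the default 32 KB arena, a 100000-byte request needs four arenas (131072 bytes), which leaves 31072 bytes for the free list under these simplified assumptions.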

You can configure this parameter by doing one of the following:

setting the _amblksiz global variable (e.g., _amblksiz = 16384;)
calling mallopt() with MALLOC_ARENA_SIZE as the command (e.g., mallopt(MALLOC_ARENA_SIZE, 16384);)
setting the MALLOC_ARENA_SIZE environment variable (e.g., export MALLOC_ARENA_SIZE=16384)
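
The mallopt() option above could be wrapped as in the following sketch. The helper name set_arena_size() is hypothetical, and the validation simply mirrors the constraints stated earlier (a multiple of 4 KB and less than 256 KB). The MALLOC_ARENA_SIZE command is QNX-specific, so the call is guarded; on other systems the helper just reports failure.

```c
#if defined(__QNXNTO__)
#include <malloc.h>   /* mallopt() and the QNX MALLOC_* commands */
#endif

/* Hypothetical helper: validate a new arena size and apply it.
 * On QNX Neutrino it returns whatever mallopt() returns; elsewhere,
 * or if the size breaks the documented constraints, it returns -1. */
int set_arena_size(int bytes)
{
    /* Reject sizes that aren't a multiple of 4 KB or aren't below 256 KB. */
    if (bytes <= 0 || bytes % 4096 != 0 || bytes >= 256 * 1024) {
        return -1;
    }
#if defined(__QNXNTO__)
    return mallopt(MALLOC_ARENA_SIZE, bytes);
#else
    return -1;   /* MALLOC_ARENA_SIZE isn't available on this platform */
#endif
}
```

Calling such a helper early in main() affects only later allocations; anything allocated before main() has already used the earlier setting.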

Note: The MALLOC_* environment variables are checked only at program startup, but changing them is the easiest way to configure the allocator so that these parameters are used for allocations that occur before main().

The allocator also attempts to cache recently freed blocks. In QNX Neutrino 6.6 or later, this cache is used only for blocks that are the current arena size or smaller. You can configure the arena cache by setting the following environment variables:

MALLOC_ARENA_CACHE_MAXBLK: The number of cached blocks.
MALLOC_ARENA_CACHE_MAXSZ: The total size of the cached blocks, in bytes.

Alternatively, you can call:

mallopt(MALLOC_ARENA_CACHE_MAXSZ, size);
mallopt(MALLOC_ARENA_CACHE_MAXBLK, number);

Note: There's a difference between setting these environment variables and using the corresponding mallopt() commands:

If you don't want the allocator to cache any memory at all, call mallopt() with a command of MALLOC_ARENA_CACHE_MAXBLK and a value of 0.
If you set the MALLOC_ARENA_CACHE_MAXSZ or MALLOC_ARENA_CACHE_MAXBLK environment variable to 0, the allocator ignores the setting.

To tell the allocator to never release memory back to the OS, you can set the MALLOC_MEMORY_HOLD environment variable to 1:

export MALLOC_MEMORY_HOLD=1

or call:

mallopt(MALLOC_MEMORY_HOLD, 1);

Once you've used mallopt() to change the values of MALLOC_ARENA_CACHE_MAXSZ and MALLOC_ARENA_CACHE_MAXBLK, you can call mallopt() again with a command of MALLOC_ARENA_CACHE_FREE_NOW to immediately adjust the arena cache. The behavior depends on the value argument:

1: The arena cache is adjusted immediately, and all cached memory that can be freed to the OS is released. Exactly what can be freed depends on how the allocations up to that point have been laid out in memory.
0: The arena cache is adjusted immediately to correspond to the current settings. Enough cache blocks are freed to match the adjusted MALLOC_ARENA_CACHE_MAXBLK value.

If you don't use the MALLOC_ARENA_CACHE_FREE_NOW command, the changes made to the cache parameters take effect whenever memory is subsequently released to the cache.
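
The sequence described above (change the limits, then trim the cache immediately) might look like the following sketch. The helper name shrink_arena_cache() is hypothetical, the code assumes that mallopt() reports failure as -1, and the calls are guarded because these commands exist only on QNX Neutrino.

```c
#if defined(__QNXNTO__)
#include <malloc.h>   /* mallopt() and the QNX MALLOC_* commands */
#endif

/* Hypothetical helper: apply new arena cache limits and adjust the cache now.
 * release_all is passed as the MALLOC_ARENA_CACHE_FREE_NOW value:
 *   1 = release all cached memory that can be freed to the OS
 *   0 = free just enough blocks to match the new settings
 * Returns 0 on success, -1 on failure or when not built for QNX Neutrino. */
int shrink_arena_cache(int max_bytes, int max_blocks, int release_all)
{
    if (max_bytes < 0 || max_blocks < 0) {
        return -1;
    }
#if defined(__QNXNTO__)
    /* Assumption: mallopt() returns -1 when a command fails. */
    if (mallopt(MALLOC_ARENA_CACHE_MAXSZ, max_bytes) == -1) {
        return -1;
    }
    if (mallopt(MALLOC_ARENA_CACHE_MAXBLK, max_blocks) == -1) {
        return -1;
    }
    if (mallopt(MALLOC_ARENA_CACHE_FREE_NOW, release_all ? 1 : 0) == -1) {
        return -1;
    }
    return 0;
#else
    (void)release_all;
    return -1;   /* the arena cache commands aren't available here */
#endif
}
```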

You can preallocate and populate the arena cache by setting the MALLOC_MEMORY_PREALLOCATE environment variable to the total size of the arena cache that you want. The cache is populated by multiple arena allocation calls, in chunks whose size is given by MALLOC_ARENA_SIZE.

The preallocation option doesn't alter the MALLOC_ARENA_CACHE_MAXBLK and MALLOC_ARENA_CACHE_MAXSZ options. So if you preallocate 10 MB of memory in cache blocks, and you want to ensure that this memory stays in the application throughout the lifetime of the application, you should also set the values of MALLOC_ARENA_CACHE_MAXBLK and MALLOC_ARENA_CACHE_MAXSZ to something appropriate.
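
One way to pick "something appropriate" is to size the cache limits so that they cover the whole preallocation. The following sketch shows that arithmetic; the type and function names are hypothetical, and it assumes that the preallocated cache consists of arena-sized blocks and that limits at least this large keep them cached.

```c
#include <stddef.h>

/* Hypothetical result type: candidate values for the two cache limits. */
typedef struct {
    size_t max_blocks;   /* candidate for MALLOC_ARENA_CACHE_MAXBLK */
    size_t max_bytes;    /* candidate for MALLOC_ARENA_CACHE_MAXSZ */
} cache_limits;

/* Compute limits large enough to hold prealloc_bytes of arena-sized blocks
 * (assumption: the cache is filled one arena at a time, rounding up). */
cache_limits limits_for_preallocation(size_t prealloc_bytes, size_t arena_size)
{
    cache_limits limits;
    limits.max_blocks = (prealloc_bytes + arena_size - 1) / arena_size;
    limits.max_bytes = limits.max_blocks * arena_size;
    return limits;
}
```

With 10 MB preallocated and the default 32 KB arena size, this gives 320 blocks and 10485760 bytes.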

Large block allocator

The large block allocator uses a free list to keep track of any available blocks. To minimize fragmentation, the allocator uses a first-fit algorithm to determine which block to use to service a request. If the allocator doesn't have a block that's large enough, it uses mmap() to get memory from the OS in multiples of the arena size, and then carves out the appropriate user pieces from this, putting the remaining memory onto the free list.
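
A first-fit search over a free list can be sketched as follows. The free_block type and first_fit_take() are hypothetical names, and this simplified version hands back the whole block rather than carving out a piece and returning the remainder to the list as the real allocator does.

```c
#include <stddef.h>

/* Hypothetical free-list node for illustration. */
typedef struct free_block {
    size_t size;               /* usable bytes in this block */
    struct free_block *next;   /* next available block */
} free_block;

/* Unlink and return the first block that is large enough, or NULL if
 * no block on the list can satisfy the request. */
free_block *first_fit_take(free_block **head, size_t request)
{
    for (free_block **link = head; *link != NULL; link = &(*link)->next) {
        if ((*link)->size >= request) {
            free_block *found = *link;
            *link = found->next;   /* remove it from the list */
            found->next = NULL;
            return found;
        }
    }
    return NULL;   /* caller would now get more memory from the OS */
}
```

First fit stops at the first block that's big enough, even if a closer match sits farther down the list, so the result depends on the list order.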

If all the memory that makes up an arena is eventually freed, the arena is returned to the OS. In QNX Neutrino 6.6 or later, when you free a block of memory that's larger than the arena size, the allocator uses munmap() to immediately return the block to the OS (unless you've set MALLOC_MEMORY_HOLD to 1 to prevent the allocator from releasing freed memory back to the OS). Freed blocks that are the arena size or smaller are cached according to the cache settings.
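
The decision made when a large block is freed can be summarized in a small model. This isn't the allocator's code; the enum and function names are hypothetical, and the model covers only the QNX Neutrino 6.6 or later behavior described above.

```c
#include <stddef.h>

/* Hypothetical outcomes for a freed block. */
typedef enum {
    FREE_UNMAP_NOW,   /* returned to the OS immediately with munmap() */
    FREE_TO_CACHE,    /* kept or released according to the cache settings */
    FREE_HELD         /* kept because MALLOC_MEMORY_HOLD is 1 */
} free_action;

/* Model of what happens to a freed block of block_size bytes. */
free_action large_free_action(size_t block_size, size_t arena_size,
                              int memory_hold)
{
    if (block_size > arena_size) {
        /* Oversized blocks bypass the cache entirely. */
        return memory_hold ? FREE_HELD : FREE_UNMAP_NOW;
    }
    /* Blocks of the arena size or smaller go through the arena cache. */
    return FREE_TO_CACHE;
}
```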

Small block allocator

The small block allocator manages a pool of memory blocks of different sizes. These blocks are arranged into linked lists called bands; each band contains blocks that are the same size. When your program allocates a small amount of memory, the small block allocator returns a block from the band that best fits your request. Allocations larger than the largest band size are serviced by the large allocator. If there are no more blocks available in the band, the allocator uses mmap() to get an arena from the OS and then divides it into blocks of the required size.

The allocator initially adjusts all band sizes to be multiples of _MALLOC_ALIGN (which is 8). The allocator normalizes the size of each pool so that each band has as many blocks as can be carved from a 4 KB piece of memory, taking into account alignment restrictions and overhead needed by the allocator to manage the blocks. The default band sizes and pool sizes are as follows:

Band size	Number of blocks
_MALLOC_ALIGN × 2 = 16	167
_MALLOC_ALIGN × 3 = 24	125
_MALLOC_ALIGN × 4 = 32	100
_MALLOC_ALIGN × 6 = 48	71
_MALLOC_ALIGN × 8 = 64	55
_MALLOC_ALIGN × 10 = 80	45
_MALLOC_ALIGN × 12 = 96	38
_MALLOC_ALIGN × 16 = 128	28
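
Using the default band sizes from the table above, band selection can be sketched like this. The names are hypothetical, and the sketch assumes that "best fits" means the smallest band whose block size is at least the requested size, ignoring any per-block overhead.

```c
#include <stddef.h>

/* Default band sizes, in bytes, taken from the table above. */
static const size_t default_bands[] = { 16, 24, 32, 48, 64, 80, 96, 128 };

#define BAND_COUNT (sizeof default_bands / sizeof default_bands[0])

/* Returns the index of the smallest band that can hold request bytes,
 * or -1 if the request must go to the large block allocator. */
int band_for_request(size_t request)
{
    for (size_t i = 0; i < BAND_COUNT; i++) {
        if (request <= default_bands[i]) {
            return (int)i;
        }
    }
    return -1;   /* larger than the largest band */
}
```

Under these assumptions, a 17-byte request lands in the 24-byte band, and a 129-byte request is passed to the large block allocator.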

Note: You might also see references to bins, which are allocation ranges that you want to collect statistics for. For example, you can check how many allocations are done for 40-, 80-, and 120-byte bins. The default bins are 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, and ULONG_MAX (the last bin catches all allocations larger than 4096 bytes). The bins are completely independent of the bands.
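
As an illustration of how allocation sizes could be tallied into the default bins, consider the following sketch. The function names are hypothetical, and it assumes that each bin value is an inclusive upper bound, which is consistent with the last bin catching everything larger than 4096 bytes.

```c
#include <limits.h>
#include <stddef.h>

/* Default bin limits listed in the note above. */
static const unsigned long default_bins[] = {
    2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, ULONG_MAX
};

#define BIN_COUNT (sizeof default_bins / sizeof default_bins[0])

/* Index of the first bin whose limit is at least size
 * (assumption: bin limits are inclusive upper bounds). */
size_t bin_index_for(unsigned long size)
{
    size_t i = 0;
    /* The final bin is ULONG_MAX, so this loop always terminates. */
    while (size > default_bins[i]) {
        i++;
    }
    return i;
}

/* Add one allocation of the given size to the per-bin counters. */
void record_allocation(unsigned long counts[BIN_COUNT], unsigned long size)
{
    counts[bin_index_for(size)]++;
}
```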

When used in conjunction with the MALLOC_MEMORY_PREALLOCATE option for the arena cache, the preallocation of blocks in bands is performed by initially populating the arena cache, and then allocating bands from this arena cache.

You can configure the bands by setting the MALLOC_BAND_CONFIG_STR environment variable to a string in this format:

N:s1,n1,p1:s2,n2,p2:s3,n3,p3: ... :sN,nN,pN

where the components are:

N: The number of bands.
s: The band size.
n: The number of blocks in the band.
p: The number of blocks to preallocate, which can be zero.

The parsing is simple and strict:

The sizes must all be distinct and be provided in ascending order (i.e., s1 < s2 < s3, and so on).
You must specify s, n, and p for each band.
The string can't include any spaces; the only valid characters are digits, colons (:), and commas (,).

If the string breaks any of these rules, the allocator ignores it completely.
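
If you want to check a configuration string before exporting it, a strict parser along the following lines applies the same rules. This is only a sketch, not the allocator's own parser: the band_spec type and the function names are hypothetical, and it doesn't guard against numeric overflow.

```c
#include <stddef.h>

/* Hypothetical record for one s,n,p triple. */
typedef struct {
    unsigned long size;       /* s: band size */
    unsigned long blocks;     /* n: number of blocks in the band */
    unsigned long prealloc;   /* p: number of blocks to preallocate */
} band_spec;

/* Parse an unsigned decimal number at *cursor and advance past it.
 * Returns 0 on success, -1 if no digit is present. */
static int parse_number(const char **cursor, unsigned long *out)
{
    const char *p = *cursor;
    if (*p < '0' || *p > '9') {
        return -1;
    }
    unsigned long value = 0;
    while (*p >= '0' && *p <= '9') {
        value = value * 10 + (unsigned long)(*p - '0');
        p++;
    }
    *out = value;
    *cursor = p;
    return 0;
}

/* Parse "N:s1,n1,p1:...:sN,nN,pN" into bands[0..N-1].
 * Returns the number of bands, or -1 if the string breaks any rule. */
int parse_band_config(const char *text, band_spec *bands, int capacity)
{
    const char *p = text;
    unsigned long count;

    if (parse_number(&p, &count) != 0 || count == 0
            || capacity < 0 || count > (unsigned long)capacity) {
        return -1;
    }
    for (unsigned long i = 0; i < count; i++) {
        band_spec *band = &bands[i];
        /* Each band is ":s,n,p" with all three values present. */
        if (*p++ != ':' || parse_number(&p, &band->size) != 0) return -1;
        if (*p++ != ',' || parse_number(&p, &band->blocks) != 0) return -1;
        if (*p++ != ',' || parse_number(&p, &band->prealloc) != 0) return -1;
        /* Sizes must be distinct and strictly ascending. */
        if (i > 0 && band->size <= bands[i - 1].size) return -1;
    }
    /* Anything left over (spaces, extra bands, stray characters) is an error. */
    return (*p == '\0') ? (int)count : -1;
}
```

Because the parser accepts only digits, colons, and commas in the expected positions, a string with a space, a missing p value, or a band count that doesn't match N is rejected as a whole.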

For example, setting MALLOC_BAND_CONFIG_STR to:

8:2,32,60:15,32,60:29,32,60:55,24,60:100,24,60:130,24,60:260,8,60:600,4,60

specifies these bands, with 60 blocks preallocated for each band:

Band size	Number of blocks
2	32
15	32
29	32
55	24
100	24
130	24
260	8
600	4

The allocator normalizes this configuration to:

Band size	Number of blocks
8	251
16	167
32	100
56	62
104	35
136	27
264	13
600	5

For the above configuration, allocations larger than 600 bytes are serviced by the large block allocator.
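
The size column of the normalized table follows from rounding each requested band size up to a multiple of _MALLOC_ALIGN, which can be sketched as below. TOY_MALLOC_ALIGN and normalize_band_size() are hypothetical names standing in for the allocator's internals. The sketch doesn't try to reproduce the block counts, which also depend on allocator overhead that isn't specified here.

```c
#include <stddef.h>

/* Stand-in for _MALLOC_ALIGN, which the text above gives as 8. */
#define TOY_MALLOC_ALIGN 8u

/* Round a requested band size up to the next multiple of the alignment. */
size_t normalize_band_size(size_t size)
{
    return (size + TOY_MALLOC_ALIGN - 1) / TOY_MALLOC_ALIGN * TOY_MALLOC_ALIGN;
}
```

Applied to the example configuration, the sizes 2, 15, 29, 55, 100, 130, 260, and 600 become 8, 16, 32, 56, 104, 136, 264, and 600, which matches the table.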