Throughput

Another key point is the performance of sequential access to a file, or raw throughput, where a large amount of data is written to a file (or an entire file is read). The filesystem itself can detect this type of sequential access and attempts to optimize its use of the disk accordingly.

The most efficient way of accessing the disk for high performance is through the standard POSIX routines that work with file descriptors—open(), read(), and write()—because these allow direct access to the filesystem with no interference from libc.

If you're concerned about performance, we don't recommend that you use the standard I/O (<stdio.h>) routines that work with FILE variables, because they introduce another layer of code and another layer of buffering. In particular, the default buffer size is BUFSIZ, or 1 KB, so all access to the disk is carved up into chunks of that size, causing a large amount of overhead for passing messages and switching contexts.

There are some cases when the standard I/O facilities are useful, such as when processing a text file one line or character at a time, in which case the 1 KB of buffering provided by standard I/O greatly reduces the number of messages to the filesystem. You can improve performance further by using setvbuf() to install a larger stream buffer.
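For example, here's a minimal sketch of using setvbuf() to replace the default 1 KB buffer with a larger one (the helper name and buffer size are illustrative):

#include <stdio.h>

#define IOBUF_SIZE (32 * 1024)    /* illustrative; matches the chunk size discussed below */

FILE *open_buffered(const char *path)
{
    static char iobuf[IOBUF_SIZE];
    FILE *fp = fopen(path, "r");

    if (fp != NULL) {
        /* Install a larger, fully buffered stream buffer in place of
           the default BUFSIZ one; this must be done before any other
           I/O on the stream. */
        setvbuf(fp, iobuf, _IOFBF, sizeof(iobuf));
    }
    return fp;
}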

You can also optimize performance by accessing the disk in suitably sized chunks: large enough to minimize the overheads of QNX Neutrino's context switching and message passing, but not so large that you exceed the disk driver's limits on blocks per operation or incur the overhead of very large message passes. An optimal size is 32 KB.
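As a rough sketch of this pattern (the helper is hypothetical), a sequential read loop using 32 KB transfers might look like this:

#include <fcntl.h>
#include <unistd.h>

#define CHUNK_SIZE (32 * 1024)    /* the suggested 32 KB transfer size */

/* Hypothetical helper: read an entire file sequentially in 32 KB
   chunks; returns the number of bytes read, or -1 on error. */
long read_whole_file(const char *path)
{
    static char buf[CHUNK_SIZE];
    long total = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return -1;

    while ((n = read(fd, buf, sizeof(buf))) > 0)
        total += n;               /* process buf[0..n-1] here */

    close(fd);
    return (n == -1) ? -1 : total;
}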

You should also access the file on block boundaries, in whole multiples of a disk sector (since the smallest unit of access to a disk/block device is a single sector, a partial write requires a read/modify/write cycle). You can get the optimal I/O size by calling statvfs(), although most disks have 512-byte sectors.
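As a sketch, you might query the preferred I/O size like this (we're assuming here that the f_bsize member carries that value):

#include <sys/statvfs.h>
#include <stdio.h>

/* Print the filesystem's preferred I/O size for the given path. */
int print_io_size(const char *path)
{
    struct statvfs vfs;

    if (statvfs(path, &vfs) == -1)
        return -1;

    printf("preferred I/O size for %s: %lu bytes\n",
           path, (unsigned long)vfs.f_bsize);
    return 0;
}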

Finally, for very high-performance situations (video streaming, etc.), it's possible to bypass all buffering in the filesystem and perform DMA directly between the user data areas and the disk. But note these caveats:

We don't currently recommend that you use DMA unless absolutely necessary: not all disk drivers correctly support it, there's no facility to query a disk driver for the DMA-safe requirements of its interface, and naive users can get themselves into trouble!

In some situations, where you know the total size of the final data file, it can be advantageous to pregrow it to this size, rather than allowing the filesystem to extend it piecemeal as it's written to. This lets the filesystem see a single explicit request for allocation instead of many implicit incremental updates; some filesystems may be able to exploit this and allocate the file in a more optimal/contiguous fashion. It also reduces the number of metadata updates needed during the write phase, and so improves the data write performance by not disrupting sequential streaming.

The POSIX function to extend a file is ftruncate(); the standard requires this function to zero-fill the new data space, meaning that the file is effectively written twice, so this technique is suitable only when you can prepare the file during an initial phase where performance isn't critical. There's also a non-POSIX devctl() to extend a file without zero-filling it, which provides the above benefits without the cost of erasing the contents. The DCMD_FSYS_PREGROW_FILE command, which is defined in <sys/dcmd_blk.h> and described in the Devctl and Ioctl Commands reference, takes the file size as its argument, as an off64_t. For example:

int fd;
off64_t sz;

fd = open( ... );    /* open the file for writing */
sz = ... ;           /* the desired final size of the file */

devctl(fd, DCMD_FSYS_PREGROW_FILE, &sz, sizeof(sz), NULL);
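
For comparison, here's a minimal sketch of the portable (zero-filling) POSIX approach using ftruncate(); the helper name and open() flags are illustrative:

#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: pregrow a file to final_size bytes using the
   portable POSIX call; unlike the devctl() above, this zero-fills
   the new data space. Returns the open descriptor, or -1 on error. */
int pregrow_posix(const char *path, off_t final_size)
{
    int fd = open(path, O_RDWR | O_CREAT, 0666);

    if (fd == -1)
        return -1;

    if (ftruncate(fd, final_size) == -1) {
        close(fd);
        return -1;
    }
    return fd;    /* ready for the performance-critical write phase */
}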