Returning directory entries from _IO_READ

Updated: April 19, 2023

When the _IO_READ handler is called, it may need to return data for either a file (if S_ISDIR(ocb->attr->mode) is false) or a directory (if S_ISDIR(ocb->attr->mode) is true). The readdir() function sets _IO_XTYPE_READDIR in the xtype member of the _IO_READ message. We've already seen the algorithm for returning file data, in particular how to limit the size of the returned data to the smaller of the data available and the client's buffer size.

A similar constraint applies when returning directory data to a client, with the added requirement that the data be block-integral. Instead of returning a stream of bytes that we can package arbitrarily, we return a number of struct dirent structures. (In other words, we can't return 1.5 of those structures; we always have to return a whole number of them.) The dirent structures must be properly aligned in the reply; aligning each one on an 8-byte boundary ensures this.
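The arithmetic behind "whole, 8-byte-aligned records" can be sketched in portable C. This is only an illustration: struct demo_dirent is a hypothetical, simplified stand-in for the directory entry structure (it is not the real system header), and the helper names round_up_8(), demo_record_size(), and demo_whole_records() are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a directory entry; NOT the real system header. */
struct demo_dirent {
    uint64_t d_ino;
    int64_t  d_offset;
    int16_t  d_reclen;
    int16_t  d_namelen;
    char     d_name[1];
};

/* Round n up to the next multiple of 8. */
static size_t round_up_8(size_t n)
{
    return (n + 7u) & ~(size_t)7u;
}

/* Padded size of one record whose name has namelen characters
 * (namelen does not count the terminating '\0'). */
static size_t demo_record_size(size_t namelen)
{
    return round_up_8(offsetof(struct demo_dirent, d_name) + namelen + 1);
}

/* How many whole records of size reclen fit in a buffer of nbytes. */
static size_t demo_whole_records(size_t nbytes, size_t reclen)
{
    return reclen ? nbytes / reclen : 0;
}
```

Because every record size is a multiple of 8, a sequence of records that starts on an 8-byte boundary keeps each following record on an 8-byte boundary as well.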

A struct dirent looks like this:

struct dirent {
#if _FILE_OFFSET_BITS - 0 == 64
    ino_t           d_ino;          /* File serial number. */
    off_t           d_offset;
#elif !defined(_FILE_OFFSET_BITS) || _FILE_OFFSET_BITS == 32
#if defined(__LITTLEENDIAN__)
    ino_t           d_ino;          /* File serial number. */
    ino_t           d_ino_hi;
    off_t           d_offset;
    off_t           d_offset_hi;
#elif defined(__BIGENDIAN__)
    ino_t           d_ino_hi;
    ino_t           d_ino;          /* File serial number. */
    off_t           d_offset_hi;
    off_t           d_offset;
#else
 #error endian not configured for system
#endif
#else
 #error _FILE_OFFSET_BITS value is unsupported
#endif
    int16_t             d_reclen;
    int16_t             d_namelen;
    char                d_name[1];
};

The d_ino member contains a mountpoint-unique file serial number. Disk-checking utilities often use this serial number for operations such as detecting directory links that form an infinite loop. (Note that the inode value should not be zero, because zero is commonly used to indicate an unused entry.)

The d_offset member represents the offset to this entry within the directory. The exact meaning and implementation of d_offset is up to the resource manager, but it is expected that if a client reads a directory entry, issues a seek message with the returned d_offset, then reads again, the client will get back the same directory entry.

The d_reclen member contains the size of this directory entry including the variable length name member, any padding to maintain proper alignment of elements, and any other associated information (such as an optional struct stat structure appended to the struct dirent entry; see below).

The d_namelen member indicates the length of the name stored in the d_name member. The \0 string terminator must be present but isn't counted in this length. The length is usually calculated with a call to strlen() if it isn't already stored.
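To see how d_reclen and d_namelen work together, here is a hedged sketch of how a reader of a packed reply might step from one entry to the next. It uses a hypothetical struct demo_dirent that mimics the members described above (not the real header), and demo_count_entries() is an invented helper, not a library function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a directory entry; NOT the real system header. */
struct demo_dirent {
    uint64_t d_ino;
    int64_t  d_offset;
    int16_t  d_reclen;
    int16_t  d_namelen;
    char     d_name[1];
};

/* Walk buf[0..nbytes) by following d_reclen from entry to entry.
 * Returns the number of entries, or -1 if the buffer is malformed. */
static int demo_count_entries(const char *buf, size_t nbytes)
{
    const size_t header = offsetof(struct demo_dirent, d_name);
    size_t pos = 0;
    int count = 0;

    while (pos < nbytes) {
        struct demo_dirent hdr;
        size_t reclen, namelen;

        if (nbytes - pos < header)
            return -1;                    /* truncated header */
        memcpy(&hdr, buf + pos, header);  /* copy out; no alignment assumptions */

        if (hdr.d_reclen <= 0 || hdr.d_namelen < 0)
            return -1;
        reclen = (size_t)hdr.d_reclen;
        namelen = (size_t)hdr.d_namelen;

        if (reclen < header + namelen + 1)
            return -1;                    /* record too small for name plus '\0' */
        if (reclen > nbytes - pos)
            return -1;                    /* partial record: not block-integral */
        if (buf[pos + header + namelen] != '\0')
            return -1;                    /* terminator is required but not counted */

        count++;
        pos += reclen;                    /* d_reclen leads to the next entry */
    }
    return count;
}
```

The walker never inspects the padding; it relies only on d_reclen to find the next record, which is why that member has to include the name, the terminator, and any alignment padding.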

So in our io_read handler, we need to generate a number of struct dirent entries and return them to the client. If we have a cache of directory entries that we maintain in our resource manager, it's a simple matter to construct a set of IOVs to point to those entries. But more often we don't have a cache, and we must manually assemble the directory entries into a buffer and then return that to the client.

To build a directory entry:

/* Round n up to a multiple of s; s must be a power of two */
#define ROUNDUP(s,n) (((n) & ((s)-1)) ? ((n) | ((s)-1)) + 1 : (n))

struct dirent *dir_p = (struct dirent *)ctp->msg; // use receive buffer as scratch buffer
size_t file_len = strlen(attr->my_name);          // name length, not counting the \0
size_t dirent_len = offsetof(struct dirent, d_name) + file_len + 1; // minimum length, including the \0
dirent_len = ROUNDUP(8, dirent_len); // must pad to 8-byte alignment
memset(dir_p, 0, dirent_len);        // zero the entry, including its padding
dir_p->d_ino = attr->inode;
dir_p->d_offset = ocb->offset;
dir_p->d_reclen = dirent_len;
dir_p->d_namelen = file_len;
strcpy(dir_p->d_name, attr->my_name); // fits: dirent_len covers the name and its \0
Note: If you cache dirent structures in your resource manager, you must also be careful to provide adequate storage for the full name string associated with the struct dirent.
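The single-entry steps above extend naturally to filling a reply with several entries. The following is a minimal sketch under stated assumptions, not the resource manager's actual implementation: struct demo_dirent is a hypothetical stand-in for the real structure, demo_pack_entries() is an invented helper, the entry index doubles as d_offset, the serial numbers are placeholders, and names are assumed short enough for the 16-bit length members.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a directory entry; NOT the real system header. */
struct demo_dirent {
    uint64_t d_ino;
    int64_t  d_offset;
    int16_t  d_reclen;
    int16_t  d_namelen;
    char     d_name[1];
};

/* Pack names[first..count) into buf (capacity nbytes), whole entries only.
 * Returns the number of bytes used; *packed receives the number of entries. */
static size_t demo_pack_entries(char *buf, size_t nbytes,
                                const char *const *names, size_t count,
                                size_t first, size_t *packed)
{
    const size_t header = offsetof(struct demo_dirent, d_name);
    size_t used = 0;
    size_t i;

    *packed = 0;
    for (i = first; i < count; i++) {
        size_t namelen = strlen(names[i]);
        /* Header + name + '\0', rounded up to a multiple of 8. */
        size_t reclen = (header + namelen + 1 + 7u) & ~(size_t)7u;
        struct demo_dirent hdr;

        if (reclen > nbytes - used)
            break;                        /* next entry would not fit whole: stop */

        memset(buf + used, 0, reclen);    /* zero the record, including padding */
        memset(&hdr, 0, sizeof hdr);
        hdr.d_ino = (uint64_t)i + 1;      /* placeholder serial number, never zero */
        hdr.d_offset = (int64_t)i;        /* assumption: offset is the entry index */
        hdr.d_reclen = (int16_t)reclen;
        hdr.d_namelen = (int16_t)namelen;
        memcpy(buf + used, &hdr, header);
        memcpy(buf + used + header, names[i], namelen + 1);

        used += reclen;
        (*packed)++;
    }
    return used;
}
```

In this sketch, a later read would resume by passing the index of the first entry that was not packed as first, which mirrors the expectation that seeking to a returned d_offset and reading again yields the same entry.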