Combine messages

As it turns out, combine messages aren't useful just for saving bandwidth (as in the chown() case, above). They're also critical for ensuring that a sequence of operations completes atomically.

Suppose the client process has two or more threads and one file descriptor. One of the threads does an lseek() followed by a read(), and everything works as we expect. But if another thread does the same pair of operations on the same file descriptor, we run into problems. Since the lseek() and read() functions don't know about each other, it's possible for the first thread to do its lseek() and then get preempted by the second thread. The second thread does its lseek(), and then its read(), before giving up the CPU. Because the two threads share the same file descriptor, the offset that the first thread set with lseek() is gone: the file position is now wherever the second thread's read() left it, so the first thread's read() returns data from the wrong place. The same problem affects file descriptors that are dup()'d across processes, let alone across the network.
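The interleaving described above can be replayed deterministically in a single thread, which makes the failure easy to see. The following is a minimal sketch, assuming a POSIX system; the function name replay_interleaving() and the 4-byte "blocks" are made up for illustration. It issues the calls in the order the scheduler could have produced them: the first thread's lseek(), then the second thread's lseek() and read(), then the first thread's read().

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any library): replays, in one thread,
 * the order of calls that two preempting threads could produce on a
 * shared file descriptor. Returns the first byte that "thread A" ends
 * up reading, or -1 on error.
 */
int
replay_interleaving (void)
{
    FILE *fp = tmpfile ();               // scratch file, removed on close
    if (fp == NULL) {
        return -1;
    }
    int fd = fileno (fp);

    // Three 4-byte blocks: block 0 = AAAA, block 1 = BBBB, block 2 = CCCC
    if (write (fd, "AAAABBBBCCCC", 12) != 12) {
        fclose (fp);
        return -1;
    }

    char a_buf [4] = { 0 };
    char b_buf [4] = { 0 };

    lseek (fd, 0, SEEK_SET);             // thread A: seek to block 0
    // ... thread A is preempted here ...
    lseek (fd, 4, SEEK_SET);             // thread B: seek to block 1
    ssize_t nb = read (fd, b_buf, 4);    // thread B: gets block 1, offset is now 8
    // ... thread A resumes ...
    ssize_t na = read (fd, a_buf, 4);    // thread A: wanted block 0, gets block 2

    fclose (fp);
    if (na != 4 || nb != 4) {
        return -1;
    }
    assert (b_buf [0] == 'B');           // thread B got what it asked for
    return a_buf [0];                    // thread A did not
}
```

With real threads the outcome depends on timing, which is why this kind of bug tends to show up only occasionally.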

An obvious solution is to protect the lseek() and read() calls with a mutex. When the first thread obtains the mutex, we know that it has exclusive access to the file descriptor. The second thread has to wait until it can acquire the mutex before it can change the position of the file descriptor.

Unfortunately, this works only if each and every file descriptor operation obtains the mutex. If someone forgets even once, that "unprotected" access could cause a thread to read or write data at the wrong location.

Let's look at the C library call readblock() (from <unistd.h>):

int
readblock (int fd,
           size_t blksize,
           unsigned block,
           int numblks,
           void *buff);

(The writeblock() function is similar.)

You can imagine a fairly "simplistic" implementation for readblock():

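int
readblock (int fd, size_t blksize, unsigned block,
           int numblks, void *buff)
{
    ssize_t nbytes;

    if (lseek (fd, (off_t) blksize * block, SEEK_SET) == -1) // get to the block
        return -1;
    nbytes = read (fd, buff, blksize * numblks);
    return (nbytes == -1) ? -1 : (int) ((size_t) nbytes / blksize); // blocks read
}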

Obviously, this implementation isn't useful in a multi-threaded environment. We'd have to at least put a mutex around the calls:

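int
readblock (int fd, size_t blksize, unsigned block,
           int numblks, void *buff)
{
    ssize_t nbytes = -1;

    pthread_mutex_lock (&block_mutex);
    if (lseek (fd, (off_t) blksize * block, SEEK_SET) != -1) // get to the block
        nbytes = read (fd, buff, blksize * numblks);
    pthread_mutex_unlock (&block_mutex);                     // released on every path
    return (nbytes == -1) ? -1 : (int) ((size_t) nbytes / blksize); // blocks read
}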

(We're assuming the mutex is already initialized.)
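For completeness, here are two common ways to satisfy that assumption with POSIX threads. This is only a sketch: the helper name init_block_mutex() is hypothetical, and which option fits depends on where the mutex lives.

```c
#include <pthread.h>

// Option 1: static initialization. The mutex is ready to use before
// any code runs, so no explicit init call is needed.
pthread_mutex_t block_mutex = PTHREAD_MUTEX_INITIALIZER;

// Option 2: run-time initialization with default attributes, for
// example when the mutex is part of a dynamically allocated structure.
// Hypothetical helper; returns 0 on success or an error number.
int
init_block_mutex (pthread_mutex_t *mutex)
{
    return pthread_mutex_init (mutex, NULL);
}
```

A mutex set up with pthread_mutex_init() should eventually be released with pthread_mutex_destroy(); a statically initialized one can simply be left alone.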

This code is still vulnerable to "unprotected" access; if some other thread in the process does a simple non-mutexed lseek() on the file descriptor, we've got a bug.

The solution to this is to use a combine message, as we discussed above for the chown() function. In this case, the C library implementation of readblock() puts both the lseek() and the read() operations into a single message and sends that off to the resource manager:

Figure 1. The readblock() function's combine message.

This works because message passing is atomic. From the client's point of view, either the entire message has gone to the resource manager, or none of it has. Therefore, an intervening "unprotected" lseek() is irrelevant: when the readblock() operation is received by the resource manager, it's done in one shot. (Obviously, it's the unprotected lseek() that suffers, because after the readblock() the file descriptor's offset is no longer where that lseek() put it.)
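To make the atomicity argument concrete, here is a toy model in portable C. It is not the QNX message layout or API: the types toy_file_t and toy_combine_msg_t and the function toy_handle_combine() are hypothetical stand-ins invented for this sketch. The point is only that the seek and the read travel in one request and are applied in one step on the receiving side, so no other request can move the offset in between.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical stand-in for the per-descriptor state kept by the receiver.
typedef struct {
    const char *data;    // file contents
    size_t      size;    // length of data, in bytes
    size_t      offset;  // current position, shared by every user of the fd
} toy_file_t;

// Hypothetical combined request: the seek part and the read part together.
typedef struct {
    size_t seek_to;      // absolute offset (SEEK_SET semantics)
    size_t nbytes;       // how many bytes to read after seeking
} toy_combine_msg_t;

/*
 * Applies both parts of the request in a single call.
 * Returns the number of bytes copied into buff.
 */
size_t
toy_handle_combine (toy_file_t *f, const toy_combine_msg_t *msg, void *buff)
{
    // Part 1: the seek. Clamp to end of file to keep the model simple.
    f->offset = (msg->seek_to < f->size) ? msg->seek_to : f->size;

    // Part 2: the read, starting from the offset that was just set.
    size_t avail = f->size - f->offset;
    size_t n = (msg->nbytes < avail) ? msg->nbytes : avail;
    memcpy (buff, f->data + f->offset, n);
    f->offset += n;

    assert (f->offset <= f->size);       // invariant of the model
    return n;
}
```

In this model, a stray seek that left the offset at 2 has no effect on a later request of { seek_to = 4, nbytes = 4 }: the request still returns the four bytes starting at offset 4, and it's the stray seek's position that is lost.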

But what about the resource manager? How does it ensure that it processes the entire readblock() operation in one shot? We'll see this shortly, when we discuss the operations performed for each message component.