A simple io_write() example

The io_read() example was fairly simple; let's take a look at io_write(). The major hurdle with io_write() is getting access to the data. Since the resource manager library reads in only a small portion of the message from the client, the data content that the client sent (immediately after the _IO_WRITE header) may have only partially arrived at the io_write() function. To illustrate this, consider a client writing one megabyte: only the header and a few bytes of the data will get read by the resource manager library. The rest of the megabyte is still available on the client side, and the resource manager can access it at will.

There are really two cases to consider:

the entire contents of the client's write() message were read by the resource manager library, or
they were not.

The real design decision, however, is, "how much trouble is it worth to try to save the kernel copy of the data already present?" The answer is that it's not worth it. There are a number of reasons for this:

Message passing (the kernel copy operation) is extremely fast.
There is overhead required to see if the data all fits or not.
There is additional overhead in trying to "save" the first dribble of data that arrived, in light of the fact that more data is waiting.

I think the first two points are self-explanatory. The third point deserves clarification. Let's say the client sent us a large chunk of data, and we did decide that it would be a good idea to try to save the part of the data that had already arrived. Unfortunately, that part is very small. This means that instead of being able to deal with the large chunk all as one contiguous array of bytes, we have to deal with it as one small part plus the rest. Effectively, we have to "special case" the small part, which may have an impact on the overall efficiency of the code that deals with the data. This can lead to headaches, so don't do this!
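To make the "special case" cost concrete, here's a minimal sketch in portable C (no QNX calls). It counts "\r\n" pairs, first in one contiguous buffer and then in data that arrived as a small head plus the rest. The function names are made up for illustration; the point is only that the split version needs extra boundary logic that the contiguous version doesn't.

```c
#include <assert.h>
#include <stddef.h>

/* Count "\r\n" pairs in one contiguous buffer. */
size_t count_crlf(const char *buf, size_t n)
{
    size_t count = 0;

    assert(buf != NULL || n == 0);
    for (size_t i = 0; i + 1 < n; i++) {
        if (buf[i] == '\r' && buf[i + 1] == '\n') {
            count++;
        }
    }
    return count;
}

/* Same job when the data is split into a small head and the rest.
 * A pair that straddles the boundary has to be handled separately. */
size_t count_crlf_split(const char *head, size_t head_len,
                        const char *rest, size_t rest_len)
{
    size_t count = count_crlf(head, head_len) + count_crlf(rest, rest_len);

    /* the special case: '\r' ends the head, '\n' starts the rest */
    if (head_len > 0 && rest_len > 0 &&
        head[head_len - 1] == '\r' && rest[0] == '\n') {
        count++;
    }
    return count;
}
```

Every piece of processing code that touches the data would need a boundary check like this one, which is exactly the kind of headache that re-reading into a single buffer avoids.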

The real answer, then, is to simply re-read the data into buffers that you've prepared. In our simple io_write() example, I'm just going to malloc() the buffer each time, read the data into the buffer, and then release the buffer via free(). Granted, there are certainly far more efficient ways of allocating and managing buffers!
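As one example of a cheaper approach than malloc()/free() per message, here's a sketch of a grow-only scratch buffer that's reused across calls. The struct and function names are hypothetical, and the sketch assumes either a single-threaded resource manager or one scratch buffer per thread; it is not something the resource manager library provides.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reusable buffer; start it out as { NULL, 0 }. */
struct scratch {
    char   *data;
    size_t  cap;
};

/* Make sure the buffer can hold nbytes and return it.
 * Returns NULL on allocation failure; the old buffer stays valid. */
char *scratch_reserve(struct scratch *s, size_t nbytes)
{
    size_t want = nbytes ? nbytes : 1;   /* avoid a zero-size request */

    assert(s != NULL);
    if (want > s->cap) {
        char *p = realloc(s->data, want);
        if (p == NULL) {
            return NULL;
        }
        s->data = p;
        s->cap = want;
    }
    return s->data;
}
```

A handler would call scratch_reserve() where the example calls malloc(), and would skip the free() at the end; the buffer is released once, when the resource manager shuts down.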

One further wrinkle introduced in the io_write() example is the handling of the _IO_XTYPE_OFFSET modifier and its associated data, which is done slightly differently than in the io_read() example.

Here's the code:

/*
 * io_write1.c
*/

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/neutrino.h>
#include <sys/iofunc.h>

void
process_data (int offset, void *buffer, int nbytes)
{
    // do something with the data
}

int
io_write (resmgr_context_t *ctp, io_write_t *msg,
          iofunc_ocb_t *ocb)
{
    int     sts;
    int     nbytes;
    int     off;
    int     start_data_offset;
    int     xtype;
    char    *buffer;
    struct _xtype_offset *xoffset;

    // verify that the device is opened for write
    if ((sts = iofunc_write_verify (ctp, msg, ocb, NULL)) != EOK)
    {
        return (sts);
    }

    // 1) check for and handle an XTYPE override
    xtype = msg -> i.xtype & _IO_XTYPE_MASK;
    if (xtype == _IO_XTYPE_OFFSET) {
        xoffset = (struct _xtype_offset *) (&msg -> i + 1);
        start_data_offset = sizeof (msg -> i) + sizeof (*xoffset);
        off = xoffset -> offset;
    } else if (xtype == _IO_XTYPE_NONE) {
        off = ocb -> offset;
        start_data_offset = sizeof (msg -> i);
    } else {   // unknown, fail it
        return (ENOSYS);
    }

    // 2) allocate a buffer big enough for the data
    nbytes = msg -> i.nbytes;
    if ((buffer = malloc (nbytes)) == NULL) {
        return (ENOMEM);
    }

    // 3) (re-)read the data from the client
    if (resmgr_msgread (ctp, buffer, nbytes,
                        start_data_offset) == -1)
    {
        sts = errno;     // save errno before free() can change it
        free (buffer);
        return (sts);
    }

    // 4) do something with the data
    process_data (off, buffer, nbytes);

    // 5) free the buffer
    free (buffer);

    // 6) set up the number of bytes for the client's "write"
    // function to return
    _IO_SET_WRITE_NBYTES (ctp, nbytes);

    // 7) if any data written, update POSIX structures and OCB offset
    if (nbytes) {
        ocb -> attr -> flags |= IOFUNC_ATTR_MTIME | IOFUNC_ATTR_DIRTY_TIME;
        if (xtype == _IO_XTYPE_NONE) {
            ocb -> offset += nbytes;
        }
    }

    // 8) tell the resource manager library to do the reply, and that it
    // was okay
    return (EOK);
}

As you can see, a few of the initial operations are nearly identical to those in the io_read() example: iofunc_write_verify() is analogous to iofunc_read_verify(), and the xtype override check follows the same pattern.

Step 1

Here we performed much the same processing for the "xtype override" as we did in the io_read() example, except that the offset is not stored as part of the incoming message structure. It isn't stored there because a common practice is to use the size of the incoming message structure to determine the starting point of the actual data being transferred from the client. We take special pains to ensure that the offset of the start of the data (start_data_offset) is correct in the xtype handling code.
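The layout being walked here is header, then an optional offset structure, then the data. The following portable sketch mimics that walk with mock structures so it can be compiled anywhere. Everything prefixed with mock_ or MOCK_ is a stand-in I've invented for illustration; the field layout and constant values are assumptions and do not match the real QNX definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; NOT the real QNX values or layouts. */
enum {
    MOCK_XTYPE_NONE   = 0,
    MOCK_XTYPE_OFFSET = 1,
    MOCK_XTYPE_MASK   = 0xff
};

struct mock_write_hdr {
    uint16_t type;
    uint16_t combine_len;
    uint32_t nbytes;
    uint32_t xtype;
    uint32_t zero;
};

struct mock_xtype_offset {
    int64_t offset;
};

/* Return the byte offset at which the payload begins, or -1 for an
 * unknown xtype. On success, *off receives the position to write at. */
long mock_data_start(const unsigned char *msg, int64_t ocb_offset,
                     int64_t *off)
{
    struct mock_write_hdr hdr;
    uint32_t xtype;

    assert(msg != NULL && off != NULL);
    memcpy(&hdr, msg, sizeof hdr);
    xtype = hdr.xtype & MOCK_XTYPE_MASK;

    if (xtype == MOCK_XTYPE_OFFSET) {
        struct mock_xtype_offset xo;

        /* the offset structure sits right after the header */
        memcpy(&xo, msg + sizeof hdr, sizeof xo);
        *off = xo.offset;
        return (long) (sizeof hdr + sizeof xo);
    }
    if (xtype == MOCK_XTYPE_NONE) {
        *off = ocb_offset;          /* fall back to the OCB's position */
        return (long) sizeof hdr;
    }
    return -1;                      /* unknown, fail it */
}
```

Note how the data start moves by sizeof the offset structure in one branch only; that's the adjustment the real handler makes to start_data_offset.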

Step 2

Here we allocate a buffer that's big enough for the data. The number of bytes that the client is writing is presented to us in the nbytes member of the msg union. This is stuffed automatically by the client's C library in the write() routine. Note that if we don't have sufficient memory to handle the malloc() request, we return the error number ENOMEM to the client—effectively, we're passing on the return code to the client to let it know why its request wasn't completed.

Step 3

Here we use the helper function resmgr_msgread() to read the entire data content from the client directly into the newly allocated buffer. In most cases we could have just used MsgRead(), but in the case where this message is part of a "combine message," resmgr_msgread() performs the appropriate "magic" for us (see the "Combine message" section for more information on why we need to do this).

The parameters to resmgr_msgread() are fairly straightforward; we give it the internal context pointer (ctp), the buffer into which we want the data placed (buffer), and the number of bytes that we wish to read (the nbytes member of the message msg union). The last parameter is the offset into the current message, which we calculated above, in step 1. The offset effectively skips the header information that the client's C library implementation of write() put there, and proceeds directly to the data. This actually brings about two interesting points:

We could use an arbitrary offset value to read chunks of the client's data in any order and size we want.
We could use resmgr_msgreadv() (note the "v") to read data from the client into an IOV, perhaps describing various buffers, similar to what we did with the cache buffers in the filesystem discussion in the Message Passing chapter.
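Both points can be tried out without a QNX target. The sketch below is a portable mock: mock_msgreadv() copies from an in-memory "client message" into an IOV, starting at whatever offset you give it. The mock_ names and the parameter list are assumptions made for illustration; the real resmgr_msgreadv() takes the context pointer rather than a buffer, so check the library reference before relying on any detail here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical stand-in for the message still held on the client side. */
struct mock_client_msg {
    const char *bytes;
    size_t      len;
};

/* Copy from the client message, starting at offset, into the IOV parts
 * in order. Returns the number of bytes copied (0 if offset is past the
 * end of the message). */
size_t mock_msgreadv(const struct mock_client_msg *cm,
                     const struct iovec *iov, int parts, size_t offset)
{
    size_t copied = 0;

    assert(cm != NULL && iov != NULL);
    for (int i = 0; i < parts && offset + copied < cm->len; i++) {
        size_t avail = cm->len - (offset + copied);
        size_t n = iov[i].iov_len < avail ? iov[i].iov_len : avail;

        memcpy(iov[i].iov_base, cm->bytes + offset + copied, n);
        copied += n;
    }
    return copied;
}
```

Calling it with an offset just past a 4-byte header and a two-part IOV scatters the payload across two buffers; calling it again with a larger offset re-reads just the tail, in whatever order suits the code.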

Step 4

Here you'd do whatever you want with the data. I've just called a made-up function, process_data(), and passed it the offset, the buffer, and the size.

Step 5

This step is crucial! Forgetting to do it is easy, and will lead to "memory leaks." Notice how we also took care to free the memory in the case of a failure in step 3.
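One way to make the free() hard to forget is to funnel every exit through a single cleanup point. Here's a portable sketch of that pattern; the reader callback and the fake_ functions are hypothetical stand-ins for the call that fetches the client's data, and the error value is captured before free() runs so nothing can overwrite it.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical reader: fills dst with n bytes, or returns -1 and
 * sets errno. It stands in for the call that fetches client data. */
typedef int (*reader_fn)(void *dst, size_t n);

int fake_read_ok(void *dst, size_t n)
{
    char *p = dst;

    for (size_t i = 0; i < n; i++) {
        p[i] = 'x';
    }
    return 0;
}

int fake_read_fail(void *dst, size_t n)
{
    (void) dst;
    (void) n;
    errno = EIO;
    return -1;
}

/* Returns 0 on success or an errno value. The buffer is released on
 * every path because all exits after malloc() go through "done". */
int handle_payload(size_t nbytes, reader_fn reader)
{
    int   status = 0;
    char *buffer;

    assert(reader != NULL);
    buffer = malloc(nbytes ? nbytes : 1);   /* avoid a zero-size request */
    if (buffer == NULL) {
        return ENOMEM;
    }

    if (reader(buffer, nbytes) == -1) {
        status = errno;     /* capture before cleanup */
        goto done;
    }

    /* ... do something with buffer here ... */

done:
    free(buffer);
    return status;
}
```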

Step 6

We're using the macro _IO_SET_WRITE_NBYTES() (see the entry for iofunc_write_verify() in the QNX Neutrino C Library Reference) to store the number of bytes we've written, which will then be passed back to the client as the return value from the client's write(). It's important to note that you should return the actual number of bytes! The client is depending on this.
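To see why the client depends on that count, consider the usual client-side idiom for writing a whole buffer. This is a plain POSIX sketch, not QNX-specific, and write_all() is a name I've made up: the loop advances by whatever write() reports, so a resource manager that reports the wrong number would make the client skip or resend data.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Keep calling write() until len bytes are accepted.
 * Returns the number of bytes written, or -1 on error. */
ssize_t write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;

    assert(buf != NULL || len == 0);
    while (left > 0) {
        ssize_t n = write(fd, p, left);

        if (n == -1) {
            if (errno == EINTR) {
                continue;           /* interrupted; try again */
            }
            return -1;
        }
        if (n == 0) {
            break;                  /* no progress; report what went out */
        }
        p += n;                     /* trust the reported count */
        left -= (size_t) n;
    }
    return (ssize_t) (len - left);
}
```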

Step 7

Now we do similar housekeeping for stat(), lseek(), and further write() functions as we did for the io_read() routine (and again, we modify the offset in the ocb only in the case of this not being a _IO_XTYPE_OFFSET type of message). Since we're writing to the device, however, we use the IOFUNC_ATTR_MTIME constant instead of the IOFUNC_ATTR_ATIME constant. The MTIME flag means "modification" time, and a write() to a resource certainly "modifies" it.
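From the client's point of view, the offset rule looks like the difference between write() and pwrite(); on QNX Neutrino, pwrite() is the call that normally produces an _IO_XTYPE_OFFSET write. The sketch below is ordinary POSIX code run against a temporary file (the path template and function name are arbitrary choices of mine), and it shows the behavior the handler has to preserve: a positioned write leaves the current file position alone.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Returns the file position after a 4-byte write() followed by a
 * 2-byte pwrite() at offset 0, or -1 if anything fails. */
long position_after_write_then_pwrite(void)
{
    char path[] = "/tmp/io_write_demoXXXXXX";
    long pos = -1;
    int  fd = mkstemp(path);

    if (fd == -1) {
        return -1;
    }
    assert(lseek(fd, 0, SEEK_CUR) == 0);    /* a fresh file starts at 0 */

    if (write(fd, "abcd", 4) == 4 &&        /* advances the position to 4 */
        pwrite(fd, "XY", 2, 0) == 2) {      /* explicit offset; position untouched */
        pos = (long) lseek(fd, 0, SEEK_CUR);
    }

    close(fd);
    unlink(path);
    return pos;
}
```

The position comes back as 4, not 2: only the plain write() moved it. That's why the handler bumps ocb->offset only in the _IO_XTYPE_NONE case.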

Step 8

The last step is simple: we return the constant EOK, which tells the resource manager library that it should reply to the client. This ends our processing. The resource manager will use the number of bytes that we stashed away with the _IO_SET_WRITE_NBYTES() macro in the reply and the client will unblock; the client's C library write() function will return the number of bytes that were written by our device.