This chapter contains the following topics:
This chapter assumes that you're familiar with message passing. If you're not, see the Neutrino Microkernel chapter in the System Architecture book as well as the MsgSend(), MsgReceivev(), and MsgReply() series of calls in the Library Reference. The code samples used in this chapter are not always POSIX-compliant.
This section contains the following:
A resource manager is a user-level server program that accepts messages from other programs and, optionally, communicates with hardware. It's a process that registers a pathname prefix in the pathname space (e.g. /dev/ser1), and when registered, other processes can open that name using the standard C library open() function, and then read() from, and write() to, the resulting file descriptor. When this happens, the resource manager receives an open request, followed by read and write requests.
A resource manager isn't restricted to handling just open(), read(), and write() calls -- it can support any functions that are based on a file descriptor or file pointer, as well as other forms of IPC.
In Neutrino, resource managers are responsible for presenting an interface to various types of devices. In other operating systems, the managing of actual hardware devices (e.g. serial ports, parallel ports, network cards, and disk drives) or virtual devices (e.g. /dev/null, a network filesystem, and pseudo-ttys), is associated with device drivers. But unlike device drivers, the Neutrino resource managers execute as processes separate from the kernel.
A resource manager looks just like any other user-level program.
Adding resource managers in Neutrino won't affect any other part of the OS -- the drivers are developed and debugged like any other application. And since the resource managers are in their own protected address space, a bug in a device driver won't cause the entire OS to shut down.
If you've written device drivers in most UNIX variants, you're used to being restricted in what you can do within a device driver; but since a device driver in Neutrino is just a regular process, you aren't restricted in what you can do (except for the restrictions that exist inside an ISR).
In order to register a prefix in the pathname space, a resource manager must be run as root.
A serial port may be managed by a resource manager called devc-ser8250, although the actual resource may be called /dev/ser1 in the pathname space. When a process requests serial port services, it does so by opening a serial port (in this case /dev/ser1).
fd = open("/dev/ser1", O_RDWR);
for (packet = 0; packet < npackets; packet++)
    write(fd, packets[packet], PACKET_SIZE);
close(fd);
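The snippet above leaves out error handling. Here's a minimal sketch of the same client with checks added. It uses only the standard POSIX open(), write(), and close() calls; the function name send_packets() and the value of PACKET_SIZE are assumptions made for illustration, and a name such as /dev/ser1 exists only if a serial resource manager has registered it.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define PACKET_SIZE 64   /* illustrative value, not taken from the text */

/*
 * Open the named device, write npackets fixed-size packets to it, and
 * close it. Returns the number of packets fully written, or -1 if the
 * device couldn't be opened.
 */
int send_packets(const char *path, char packets[][PACKET_SIZE], int npackets)
{
    int fd, packet;

    fd = open(path, O_RDWR);
    if (fd == -1) {
        perror("open");
        return -1;
    }

    for (packet = 0; packet < npackets; packet++) {
        ssize_t n = write(fd, packets[packet], PACKET_SIZE);

        if (n != PACKET_SIZE) {
            if (n == -1)
                perror("write");
            break;    /* stop at the first error or short write */
        }
    }

    close(fd);
    return packet;
}
```

A caller would pass the registered name, for example send_packets("/dev/ser1", packets, npackets), and compare the result with npackets.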
Because resource managers execute as processes, their use isn't restricted to device drivers -- any server can be written as a resource manager. For example, a server that's given DVD files to display in a GUI wouldn't be classified as a driver, yet it could be written as a resource manager. It can register the name /dev/dvd and as a result, clients can do the following:
fd = open("/dev/dvd", O_WRONLY);
while ((data = get_dvd_data(handle, &nbytes))) {
    bytes_written = write(fd, data, nbytes);
    if (bytes_written != nbytes) {
        perror("Error writing the DVD data");
    }
}
close(fd);
Here are a few reasons why you'd want to write a resource manager:
The API for communicating with the resource manager is, for the most part, POSIX. All C programmers are familiar with the open(), read(), and write() functions. Training costs are minimized, and so is the need to document the interface to your server.
If you have many server processes, writing each server as a resource manager keeps the number of different interfaces that clients need to use to a minimum.
An example of this is if you have a team of programmers building your overall application, and each programmer is writing one or more servers for that application. These programmers may work directly for your company, or they may belong to partner companies who are developing add-on hardware for your modular platform.
If the servers are resource managers, then the interface to all of those servers is the POSIX functions: open(), read(), write(), and whatever else makes sense. For control-type messages that don't fit into a read/write model, there's devctl() (although devctl() isn't POSIX).
Since the API for communicating with a resource manager is the POSIX set of functions, and since standard POSIX utilities use this API, the utilities can be used for communicating with the resource managers.
For instance, the tiny TCP/IP protocol module contains resource-manager code that registers the name /proc/ipstats. If you open this name and read from it, the resource manager code responds with a body of text that describes the statistics for IP.
The cat utility takes the name of a file and opens the file, reads from it, and displays whatever it reads to standard output (typically the screen). As a result, you can type:
cat /proc/ipstats
The resource manager code in the TCP/IP protocol module responds with text such as:
Ttcpip Sep 5 2000 08:56:16
verbosity level 0
ip checksum errors: 0
udp checksum errors: 0
tcp checksum errors: 0
packets sent: 82
packets received: 82
lo0 : addr 127.0.0.1 netmask 255.0.0.0 up
DST: 127.0.0.0 NETMASK: 255.0.0.0 GATEWAY: lo0
TCP 127.0.0.1.1227 > 127.0.0.1.6000 ESTABLISHED snd 0 rcv 0
TCP 127.0.0.1.6000 > 127.0.0.1.1227 ESTABLISHED snd 0 rcv 0
TCP 0.0.0.0.6000 LISTEN
You could also use command-line utilities for a robot-arm driver. The driver could register the name, /dev/robot/arm/angle, and any writes to this device are interpreted as the angle to set the robot arm to. To test the driver from the command line, you'd type:
echo 87 >/dev/robot/arm/angle
The echo utility opens /dev/robot/arm/angle and writes the string ("87") to it. The driver handles the write by setting the robot arm to 87 degrees. Note that this was accomplished without writing a special tester program.
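To make this concrete, here's a sketch of how such a driver might turn the bytes of a write request into an angle. It's plain portable C rather than resource manager code; the function name parse_angle() and the accepted range of 0 to 360 degrees are assumptions, since the text doesn't say how the driver validates its input.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Convert the bytes of a write request (e.g. "87\n" from echo) to an angle.
 * Returns 0 and stores the angle on success, or EINVAL if the text isn't
 * a whole number in the range 0..360 (the range is an assumption).
 */
int parse_angle(const char *data, size_t nbytes, int *angle)
{
    char buf[16];
    char *end;
    long value;

    if (nbytes == 0 || nbytes >= sizeof buf)
        return EINVAL;

    /* The payload isn't NUL-terminated, so copy it before parsing. */
    memcpy(buf, data, nbytes);
    buf[nbytes] = '\0';

    errno = 0;
    value = strtol(buf, &end, 10);
    if (end == buf || errno != 0)
        return EINVAL;          /* no digits, or out of range for long */
    if (*end == '\n')
        end++;                  /* allow the newline that echo appends */
    if (*end != '\0' || value < 0 || value > 360)
        return EINVAL;

    *angle = (int)value;
    return 0;
}
```

Rejecting malformed input with an error code means that a mistyped echo command fails visibly instead of moving the arm to an unintended position.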
Another example would be names such as /dev/robot/registers/r1, r2, ... Reading from these names returns the contents of the corresponding registers; writing to these names sets the corresponding registers to the given values.
Even if all of your other IPC is done via some non-POSIX API, it's still worth having one thread written as a resource manager for responding to reads and writes for doing things as shown above.
Despite the fact that you'll be using a resource manager API that hides many details from you, it's still important to understand what's going on under the covers. For example, your resource manager is a server that contains a MsgReceive() loop, and clients send you messages using MsgSend*(). This means that you must either reply to your clients in a timely fashion, or leave your clients blocked but save the rcvid for use in a later reply.
To help you understand, we'll discuss the events that occur under the covers for both the client and the resource manager.
When a client calls a function that requires pathname resolution (e.g. open(), rename(), stat(), or unlink()), the function sends messages to both the process manager and the resource manager to obtain a file descriptor. Once the file descriptor is obtained, the client can use it to send messages directly to the device associated with the pathname.
In the following, the file descriptor is obtained and then the client writes directly to the device:
/*
* In this stage, the client talks
* to the process manager and the resource manager.
*/
fd = open("/dev/ser1", O_RDWR);
/*
* In this stage, the client talks directly to the
* resource manager.
*/
for (packet = 0; packet < npackets; packet++)
    write(fd, packets[packet], PACKET_SIZE);
close(fd);
For the above example, here's the description of what happened behind the scenes. We'll assume that a serial port is managed by a resource manager called devc-ser8250, which has been registered with the pathname prefix /dev/ser1:

Under-the-cover communication between the client, the process manager, and the resource manager.
Here's what went on behind the scenes...
When the devc-ser8250 resource manager registered its name (/dev/ser1) in the namespace, it called the process manager. The process manager is responsible for maintaining information about pathname prefixes. During registration, it adds an entry to its table that looks similar to this:
0, 47167, 1, 0, 0, /dev/ser1
The table entries represent:
A resource manager is uniquely identified by a node descriptor, process ID, and a channel ID. The process manager's table entry associates the resource manager with a name, a handle (to distinguish multiple names when a resource manager registers more than one name), and an open type.
When the client's library issued the query call in step 1, the process manager looked through all of its tables for any registered pathname prefixes that match the name. If another resource manager had previously registered the name /, more than one match would be found; in that case, both / and /dev/ser1 match. The process manager replies to the open() with the list of matched servers or resource managers. The servers are queried in turn about their handling of the path, with the longest match being asked first.
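The matching rule can be modeled in a few lines of portable C. This is a toy sketch, not the process manager's actual implementation: the structure prefix_entry, the helper names, and the rule that a prefix must end on a pathname component boundary are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

struct prefix_entry {
    int nd, pid, chid, handle;   /* identify the server and the name */
    const char *prefix;          /* registered pathname prefix */
};

/* Return the prefix length if prefix matches path on a component
 * boundary, or 0 if it doesn't match. */
static size_t match_len(const char *prefix, const char *path)
{
    size_t n = strlen(prefix);

    if (n == 0 || strncmp(prefix, path, n) != 0)
        return 0;
    /* A prefix ending in '/' (such as "/") matches as is; otherwise the
     * match must stop at a '/' or at the end of the path, so that
     * /dev/ser1 doesn't match /dev/ser10. */
    if (prefix[n - 1] == '/' || path[n] == '\0' || path[n] == '/')
        return n;
    return 0;
}

/* qsort() comparator: longer prefixes sort first. */
static int longer_first(const void *a, const void *b)
{
    const struct prefix_entry *const *pa = a;
    const struct prefix_entry *const *pb = b;
    size_t la = strlen((*pa)->prefix);
    size_t lb = strlen((*pb)->prefix);

    return (la < lb) - (la > lb);
}

/* Collect every entry whose prefix matches path, longest prefix first.
 * Returns the number of matches stored in out (at most max). */
size_t find_matches(const struct prefix_entry *table, size_t ntable,
                    const char *path,
                    const struct prefix_entry **out, size_t max)
{
    size_t i, n = 0;

    for (i = 0; i < ntable && n < max; i++)
        if (match_len(table[i].prefix, path) > 0)
            out[n++] = &table[i];

    qsort(out, n, sizeof out[0], longer_first);
    return n;
}
```

With entries for / and /dev/ser1 in the table, a lookup of /dev/ser1 yields both, with /dev/ser1 first, which is the order in which the servers would be asked.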
fd = ConnectAttach(nd, pid, chid, 0, 0);
The file descriptor that's returned by ConnectAttach() is also a connection ID and is used for sending messages directly to the resource manager. In this case, it's used to send a connect message (_IO_CONNECT defined in <sys/iomsg.h>) containing the handle to the resource manager requesting that it open /dev/ser1.
Typically, only functions such as open() call ConnectAttach() with an index argument of 0. Most of the time, you should OR _NTO_SIDE_CHANNEL into this argument, so that the connection is made via a side channel, resulting in a connection ID that's greater than any valid file descriptor.
When the resource manager gets the connect message, it performs validation using the access modes specified in the open() call (e.g. are you trying to write to a read-only device?).
In the sample code, it looks as if the client opens and writes directly to the device. In fact, the write() call sends an _IO_WRITE message to the resource manager requesting that the given data be written, and the resource manager responds that it either wrote some or all of the data, or that the write failed.
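Because the resource manager may report that it wrote only part of the data, a careful client checks the value returned by write() and retries with the remainder. Here's a minimal sketch that uses only the POSIX write() call; write_all() is a hypothetical helper, and treating a zero-byte result as an error is a choice made for this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Write all nbytes of buf to fd, retrying after partial writes.
 * Returns 0 on success, or -1 with errno set on failure.
 */
int write_all(int fd, const void *buf, size_t nbytes)
{
    const char *p = buf;

    while (nbytes > 0) {
        ssize_t n = write(fd, p, nbytes);

        if (n == -1) {
            if (errno == EINTR)
                continue;     /* interrupted before any data was written */
            return -1;
        }
        if (n == 0) {
            errno = EIO;      /* no progress: give up instead of spinning */
            return -1;
        }
        p += n;               /* skip what the server accepted */
        nbytes -= (size_t)n;
    }
    return 0;
}
```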
Eventually, the client calls close(), which sends an _IO_CLOSE_DUP message to the resource manager. The resource manager handles this by doing some cleanup.
The resource manager is a server that uses the Neutrino send/receive/reply messaging protocol to receive and reply to messages. The following is pseudo-code for a resource manager:
initialize the resource manager
register the name with the process manager
DO forever
receive a message
SWITCH on the type of message
CASE _IO_CONNECT:
call io_open handler
ENDCASE
CASE _IO_READ:
call io_read handler
ENDCASE
CASE _IO_WRITE:
call io_write handler
ENDCASE
. /* etc. handle all other messages */
. /* that may occur, performing */
. /* processing as appropriate */
ENDSWITCH
ENDDO
Many of the details in the above pseudo-code are hidden from you by a resource manager library that you'll use. For example, you won't actually call a MsgReceive*() function -- you'll call a library function, such as resmgr_block() or dispatch_block(), that does it for you. If you're writing a single-threaded resource manager, you might provide a message handling loop, but if you're writing a multi-threaded resource manager, the loop is hidden from you.
You don't need to know the format of all the possible messages, and you don't have to handle them all. Instead, you register "handler functions," and when a message of the appropriate type arrives, the library calls your handler. For example, suppose you want a client to get data from you using read() -- you'll write a handler that's called whenever an _IO_READ message is received. Since your handler handles _IO_READ messages, we'll call it an "io_read handler."
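The idea of registering handler functions can be modeled with a table of function pointers. The following is a toy sketch in portable C, not the resource manager library's API: every name that begins with toy_ or TOY_ is hypothetical, and the default handler here simply returns ENOSYS, whereas the library's real defaults do useful work.

```c
#include <errno.h>
#include <stddef.h>

enum toy_msg_type { TOY_IO_READ, TOY_IO_WRITE, TOY_IO_STAT, TOY_NTYPES };

struct toy_msg {
    enum toy_msg_type type;
    int nbytes;                    /* size of the request */
};

typedef int (*toy_handler_t)(const struct toy_msg *msg);

/* Stand-in for the library's default handling of a message. */
static int toy_default(const struct toy_msg *msg)
{
    (void)msg;
    return ENOSYS;
}

/* A handler that the resource manager's author supplies. */
int my_io_read(const struct toy_msg *msg)
{
    (void)msg;
    return 0;    /* 0 means that the request was handled */
}

static toy_handler_t handlers[TOY_NTYPES];

/* Fill every slot with the default, as a library initializer might. */
void toy_func_init(void)
{
    int i;

    for (i = 0; i < TOY_NTYPES; i++)
        handlers[i] = toy_default;
}

/* Register a handler for one message type. */
void toy_set_handler(enum toy_msg_type type, toy_handler_t fn)
{
    handlers[type] = fn;
}

/* What happens for each received message: look up the handler, call it. */
int toy_dispatch(const struct toy_msg *msg)
{
    if ((int)msg->type < 0 || msg->type >= TOY_NTYPES)
        return ENOSYS;
    return handlers[msg->type](msg);
}
```

After toy_func_init(), a call to toy_set_handler(TOY_IO_READ, my_io_read) replaces only the read slot; every other message type still reaches the default.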
The resource manager library:
However, it's still your responsibility to reply to the _IO_READ message. You can do that from within your io_read handler, or later on when data arrives (possibly as the result of an interrupt from some data-generating hardware).
The library does default handling for any messages that you don't want to handle. After all, most resource managers don't care about presenting proper POSIX filesystems to the clients. When writing them, you want to concentrate on the code for talking to the device you're controlling. You don't want to spend a lot of time worrying about the code for presenting a proper POSIX filesystem to the client.
In considering how much work you want to do yourself in order to present a proper POSIX filesystem to the client, you can break resource managers into two types:
Device resource managers create only single-file entries in the filesystem, each of which is registered with the process manager. Each name usually represents a single device. These resource managers typically rely on the resource-manager library to do most of the work in presenting a POSIX device to the user.
For example, a serial port driver registers names such as /dev/ser1 and /dev/ser2. When the user does ls -l /dev, the library does the necessary handling to respond to the resulting _IO_STAT messages with the proper information. The person who writes the serial port driver is able to concentrate instead on the details of managing the serial port hardware.
Filesystem resource managers register a mountpoint with the process manager. A mountpoint is the portion of the path that's registered with the process manager. The remaining parts of the path are managed by the filesystem resource manager. For example, when a filesystem resource manager attaches a mountpoint at /mount, and the path /mount/home/thomasf is examined:
Examples of using filesystem resource managers are:
A resource manager is composed of some of the following layers:
This top layer consists of a set of functions that take care of most of the POSIX filesystem details for you -- they provide a POSIX-personality. If you're writing a device resource manager, you'll want to use this layer so that you don't have to worry too much about the details involved in presenting a POSIX filesystem to the world.
This layer consists of default handlers that the resource manager library uses if you don't provide a handler. For example, if you don't provide an io_open handler, iofunc_open_default() is called.
It also contains helper functions that the default handlers call. If you override the default handlers with your own, you can still call these helper functions. For example, if you provide your own io_read handler, you can call iofunc_read_verify() at the start of it to make sure that the client has access to the resource.
The names of the functions and structures for this layer have the form iofunc_*. The header file is <sys/iofunc.h>. For more information, see the Library Reference.
This layer manages most of the resource manager library details. It:
If you don't use this layer, then you'll have to parse the messages yourself. Most resource managers use this layer.
The names of the functions and structures for this layer have the form resmgr_*. The header file is <sys/resmgr.h>. For more information, see the Library Reference.

You can use the resmgr layer to handle _IO_* messages.
This layer acts as a single blocking point for a number of different types of things. With this layer, you can handle:

You can use the dispatch layer to handle _IO_* messages, select, pulses, and other messages.
The following describes the manner in which messages are handled via the dispatch layer (or more precisely, through dispatch_handler()). Depending on the blocking type, the handler may call the message_*() subsystem. A search is made, based on the message type or pulse code, for a matching function that was attached using message_attach() or pulse_attach(). If a match is found, the attached function is called.
If the message type is in the range handled by the resource manager (I/O messages) and pathnames were attached using resmgr_attach(), the resource manager subsystem is called and handles the resource manager message.
If a pulse is received, it may be dispatched to the resource manager subsystem if it's one of the codes handled by a resource manager (UNBLOCK and DISCONNECT pulses). If a select_attach() is done and the pulse matches the one used by select, then the select subsystem is called and dispatches that event.
If a message is received and no matching handler is found for that message type, MsgError(ENOSYS) is returned to unblock the sender.
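The search for a matching function can be pictured as a lookup over attached ranges of message types. The sketch below is a toy model in portable C: toy_message_attach() and toy_dispatch_handler() are hypothetical stand-ins, not the library's message_attach() and dispatch_handler(), and returning ENOSYS from the lookup stands in for the error that's sent back to unblock the sender.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_ATTACHED 8

typedef int (*msg_func_t)(int type);

struct attached {
    int low, high;        /* inclusive range of message types */
    msg_func_t func;
};

static struct attached table[MAX_ATTACHED];
static size_t nattached;

/* Attach func to the message types low..high.
 * Returns 0 on success, or -1 if the arguments are bad or the table is full. */
int toy_message_attach(int low, int high, msg_func_t func)
{
    if (nattached == MAX_ATTACHED || low > high || func == NULL)
        return -1;
    table[nattached].low = low;
    table[nattached].high = high;
    table[nattached].func = func;
    nattached++;
    return 0;
}

/* Find the first attached function whose range covers type and call it.
 * If nothing matches, report ENOSYS. */
int toy_dispatch_handler(int type)
{
    size_t i;

    for (i = 0; i < nattached; i++)
        if (type >= table[i].low && type <= table[i].high)
            return table[i].func(type);
    return ENOSYS;
}

/* Example handler for a privately defined range of message types. */
int on_private_msg(int type)
{
    (void)type;
    return 0;
}
```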
This layer allows you to have a single- or multi-threaded resource manager. This means that one thread can be handling a write() while another thread handles a read().
You provide the blocking function for the threads to use as well as the handler function that's to be called when the blocking function returns. Most often, you give it the dispatch layer's functions. However, you can also give it the resmgr layer's functions or your own.
You can use this layer independently of the resource manager layer.
The following are two complete but simple examples of a device resource manager:
As you read through this chapter, you'll encounter many code snippets. Most of these code snippets have been written so that they can be combined with either of these simple resource managers.
Both of these simple device resource managers model their functionality after that provided by /dev/null:
Here's the complete code for a simple single-threaded device resource manager:
#include <errno.h>
#include <stdio.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>    /* for memset() */
#include <unistd.h>
#include <sys/iofunc.h>
#include <sys/dispatch.h>

static resmgr_connect_funcs_t connect_funcs;
static resmgr_io_funcs_t io_funcs;
static iofunc_attr_t attr;

int main(int argc, char **argv)
{
    /* declare variables we'll be using */
    resmgr_attr_t resmgr_attr;
    dispatch_t *dpp;
    dispatch_context_t *ctp;
    int id;

    /* initialize dispatch interface */
    if((dpp = dispatch_create()) == NULL) {
        fprintf(stderr,
                "%s: Unable to allocate dispatch handle.\n",
                argv[0]);
        return EXIT_FAILURE;
    }

    /* initialize resource manager attributes */
    memset(&resmgr_attr, 0, sizeof resmgr_attr);
    resmgr_attr.nparts_max = 1;
    resmgr_attr.msg_max_size = 2048;

    /* initialize functions for handling messages */
    iofunc_func_init(_RESMGR_CONNECT_NFUNCS, &connect_funcs,
                     _RESMGR_IO_NFUNCS, &io_funcs);

    /* initialize attribute structure used by the device */
    iofunc_attr_init(&attr, S_IFNAM | 0666, 0, 0);

    /* attach our device name */
    id = resmgr_attach(
            dpp,            /* dispatch handle        */
            &resmgr_attr,   /* resource manager attrs */
            "/dev/sample",  /* device name            */
            _FTYPE_ANY,     /* open type              */
            0,              /* flags                  */
            &connect_funcs, /* connect routines       */
            &io_funcs,      /* I/O routines           */
            &attr);         /* handle                 */
    if(id == -1) {
        fprintf(stderr, "%s: Unable to attach name.\n", argv[0]);
        return EXIT_FAILURE;
    }

    /* allocate a context structure */
    ctp = dispatch_context_alloc(dpp);

    /* start the resource manager message loop */
    while(1) {
        if((ctp = dispatch_block(ctp)) == NULL) {
            fprintf(stderr, "block error\n");
            return EXIT_FAILURE;
        }
        dispatch_handler(ctp);
    }
}
Include <sys/dispatch.h> after <sys/iofunc.h> to avoid warnings about redefining the members of some functions.
Let's examine the sample code step-by-step.
Here's an outline of the steps we followed:
/* initialize dispatch interface */
if((dpp = dispatch_create()) == NULL) {
    fprintf(stderr, "%s: Unable to allocate dispatch handle.\n",
            argv[0]);
    return EXIT_FAILURE;
}
We need to set up a mechanism so that clients can send messages to the resource manager. This is done via the dispatch_create() function which creates and returns the dispatch structure. This structure contains the channel ID. Note that the channel ID isn't actually created until you attach something, as in resmgr_attach(), message_attach(), and pulse_attach().
The dispatch structure (of type dispatch_t) is opaque; you can't access its contents directly. Use message_connect() to create a connection using this hidden channel ID.
/* initialize resource manager attributes */
memset(&resmgr_attr, 0, sizeof resmgr_attr);
resmgr_attr.nparts_max = 1;
resmgr_attr.msg_max_size = 2048;
The resource manager attribute structure is used to configure:
For more information, see resmgr_attach() in the Library Reference.
/* initialize functions for handling messages */
iofunc_func_init(_RESMGR_CONNECT_NFUNCS, &connect_funcs,
                 _RESMGR_IO_NFUNCS, &io_funcs);
Here we supply two tables that specify which function to call when a particular message arrives:
Instead of filling in these tables manually, we call iofunc_func_init() to place the iofunc_*_default() handler functions into the appropriate spots.
/* initialize attribute structure used by the device */
iofunc_attr_init(&attr, S_IFNAM | 0666, 0, 0);
The attribute structure contains information about our particular device associated with the name /dev/sample. It contains at least the following information:
Effectively, this is a per-name data structure. Later on, we'll see how you could extend the structure to include your own per-device information.
/* attach our device name */
id = resmgr_attach(dpp,            /* dispatch handle        */
                   &resmgr_attr,   /* resource manager attrs */
                   "/dev/sample",  /* device name            */
                   _FTYPE_ANY,     /* open type              */
                   0,              /* flags                  */
                   &connect_funcs, /* connect routines       */
                   &io_funcs,      /* I/O routines           */
                   &attr);         /* handle                 */
if(id == -1) {
    fprintf(stderr, "%s: Unable to attach name.\n", argv[0]);
    return EXIT_FAILURE;
}
Before a resource manager can receive messages from other programs, it needs to inform the other programs (via the process manager) that it's the one responsible for a particular pathname prefix. This is done via pathname registration. When registered, other processes can find and connect to this process using the registered name.
In this example, a serial port may be managed by a resource manager called devc-xxx, but the actual resource is registered as /dev/sample in the pathname space. Therefore, when a program requests serial port services, it opens the /dev/sample serial port.
We'll look at the parameters in turn, skipping the ones we've already discussed.
Some resource managers legitimately limit the types of open requests they handle. For instance, the POSIX message queue resource manager accepts only open messages of type _FTYPE_MQUEUE.
/* allocate a context structure */
ctp = dispatch_context_alloc(dpp);
The context structure contains a buffer where messages will be received. The size of the buffer was set when we initialized the resource manager attribute structure. The context structure also contains a buffer of IOVs that the library can use for replying to messages. The number of IOVs was set when we initialized the resource manager attribute structure.
For more information, see dispatch_context_alloc() in the Library Reference.
/* start the resource manager message loop */
while(1) {
    if((ctp = dispatch_block(ctp)) == NULL) {
        fprintf(stderr, "block error\n");
        return EXIT_FAILURE;
    }
    dispatch_handler(ctp);
}
Once the resource manager establishes its name, it receives messages when any client program tries to perform an operation (e.g. open(), read(), write()) on that name. In our example, once /dev/sample is registered, and a client program executes:
fd = open ("/dev/sample", O_RDONLY);
the client's C library constructs an _IO_CONNECT message which it sends to our resource manager. Our resource manager receives the message within the dispatch_block() function. We then call dispatch_handler() which decodes the message and calls the appropriate handler function based on the connect and I/O function tables that we passed in previously. After dispatch_handler() returns, we go back to the dispatch_block() function to wait for another message.
At some later time, when the client program executes:
read (fd, buf, BUFSIZ);
the client's C library constructs an _IO_READ message, which is then sent directly to our resource manager, and the decoding cycle repeats.
Here's the complete code for a simple multi-threaded device resource manager:
#include <errno.h>
#include <stdio.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>    /* for memset() */
#include <unistd.h>

/*
 * define THREAD_POOL_PARAM_T such that we can avoid a compiler
 * warning when we use the dispatch_*() functions below
 */
#define THREAD_POOL_PARAM_T dispatch_context_t

#include <sys/iofunc.h>
#include <sys/dispatch.h>

static resmgr_connect_funcs_t connect_funcs;
static resmgr_io_funcs_t io_funcs;
static iofunc_attr_t attr;

int main(int argc, char **argv)
{
    /* declare variables we'll be using */
    thread_pool_attr_t pool_attr;
    resmgr_attr_t resmgr_attr;
    dispatch_t *dpp;
    thread_pool_t *tpp;
    dispatch_context_t *ctp;
    int id;

    /* initialize dispatch interface */
    if((dpp = dispatch_create()) == NULL) {
        fprintf(stderr, "%s: Unable to allocate dispatch handle.\n",
                argv[0]);
        return EXIT_FAILURE;
    }

    /* initialize resource manager attributes */
    memset(&resmgr_attr, 0, sizeof resmgr_attr);
    resmgr_attr.nparts_max = 1;
    resmgr_attr.msg_max_size = 2048;

    /* initialize functions for handling messages */
    iofunc_func_init(_RESMGR_CONNECT_NFUNCS, &connect_funcs,
                     _RESMGR_IO_NFUNCS, &io_funcs);

    /* initialize attribute structure used by the device */
    iofunc_attr_init(&attr, S_IFNAM | 0666, 0, 0);

    /* attach our device name */
    id = resmgr_attach(dpp,            /* dispatch handle        */
                       &resmgr_attr,   /* resource manager attrs */
                       "/dev/sample",  /* device name            */
                       _FTYPE_ANY,     /* open type              */
                       0,              /* flags                  */
                       &connect_funcs, /* connect routines       */
                       &io_funcs,      /* I/O routines           */
                       &attr);         /* handle                 */
    if(id == -1) {
        fprintf(stderr, "%s: Unable to attach name.\n", argv[0]);
        return EXIT_FAILURE;
    }

    /* initialize thread pool attributes */
    memset(&pool_attr, 0, sizeof pool_attr);
    pool_attr.handle = dpp;
    pool_attr.context_alloc = dispatch_context_alloc;
    pool_attr.block_func = dispatch_block;
    pool_attr.unblock_func = dispatch_unblock;
    pool_attr.handler_func = dispatch_handler;
    pool_attr.context_free = dispatch_context_free;
    pool_attr.lo_water = 2;
    pool_attr.hi_water = 4;
    pool_attr.increment = 1;
    pool_attr.maximum = 50;

    /* allocate a thread pool handle */
    if((tpp = thread_pool_create(&pool_attr,
                                 POOL_FLAG_EXIT_SELF)) == NULL) {
        fprintf(stderr, "%s: Unable to initialize thread pool.\n",
                argv[0]);
        return EXIT_FAILURE;
    }

    /* start the threads, will not return */
    thread_pool_start(tpp);
}
Most of the code is the same as in the single-threaded example, so we'll cover only those parts that aren't described above. Also, we'll go into more detail on multi-threaded resource managers later in this chapter, so we'll keep the details here to a minimum.
Here's an outline of the steps we'll cover:
For this code sample, the threads are using the dispatch_*() functions (i.e. the dispatch layer) for their blocking loops.
/*
 * define THREAD_POOL_PARAM_T such that we can avoid a compiler
 * warning when we use the dispatch_*() functions below
 */
#define THREAD_POOL_PARAM_T dispatch_context_t
#include <sys/iofunc.h>
#include <sys/dispatch.h>
The THREAD_POOL_PARAM_T manifest tells the compiler what type of parameter is passed between the various blocking/handling functions that the threads will be using. This parameter should be the context structure used for passing context information between the functions. By default it is defined as a resmgr_context_t but since this sample is using the dispatch layer, we need it to be a dispatch_context_t. We define it prior to doing the includes above since the header files refer to it.
/* initialize thread pool attributes */
memset(&pool_attr, 0, sizeof pool_attr);
pool_attr.handle = dpp;
pool_attr.context_alloc = dispatch_context_alloc;
pool_attr.block_func = dispatch_block;
pool_attr.unblock_func = dispatch_unblock;
pool_attr.handler_func = dispatch_handler;
pool_attr.context_free = dispatch_context_free;
pool_attr.lo_water = 2;
pool_attr.hi_water = 4;
pool_attr.increment = 1;
pool_attr.maximum = 50;
The thread pool attributes tell the threads which functions to use for their blocking loop and control how many threads should be in existence at any time. We go into more detail on these attributes when we talk about multi-threaded resource managers later in this chapter.
/* allocate a thread pool handle */
if((tpp = thread_pool_create(&pool_attr,
                             POOL_FLAG_EXIT_SELF)) == NULL) {
    fprintf(stderr, "%s: Unable to initialize thread pool.\n",
            argv[0]);
    return EXIT_FAILURE;
}
The thread pool handle is used to control the thread pool. Amongst other things, it contains the given attributes and flags. The thread_pool_create() function allocates and fills in this handle.
/* start the threads, will not return */
thread_pool_start(tpp);
The thread_pool_start() function starts up the thread pool. Each newly created thread allocates a context structure of the type defined by THREAD_POOL_PARAM_T using the context_alloc function we gave above in the attribute structure. It then blocks on the block_func and, when the block_func returns, calls the handler_func; both of these were also given through the attributes structure. Each thread essentially does the same thing that the single-threaded resource manager above does for its message loop.
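The loop that each pool thread runs can be sketched in portable C. This is a toy model, not the thread pool library: the toy_ structures and functions are hypothetical, no threads are created, and the block function gives up after a fixed number of simulated messages so that the loop ends, whereas a real pool thread keeps blocking for more work.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins for the context type and the pool attributes. */
struct toy_context {
    int remaining;    /* simulated messages still to arrive */
    int handled;      /* messages processed so far */
};

struct toy_pool_attr {
    void *handle;
    struct toy_context *(*context_alloc)(void *handle);
    struct toy_context *(*block_func)(struct toy_context *ctp);
    int (*handler_func)(struct toy_context *ctp);
    void (*context_free)(struct toy_context *ctp);
};

/* What each pool thread does: allocate a context, then alternate between
 * blocking for a message and handling it. Returns the number of messages
 * handled, or -1 if no context could be allocated. */
int toy_worker(const struct toy_pool_attr *attr)
{
    struct toy_context *ctp = attr->context_alloc(attr->handle);
    struct toy_context *next;
    int handled = 0;

    if (ctp == NULL)
        return -1;
    while ((next = attr->block_func(ctp)) != NULL) {
        ctp = next;
        attr->handler_func(ctp);
        handled++;
    }
    attr->context_free(ctp);
    return handled;
}

/* Toy callbacks; handle points to the number of messages to simulate. */
struct toy_context *toy_alloc(void *handle)
{
    struct toy_context *ctp = malloc(sizeof *ctp);

    if (ctp != NULL) {
        ctp->remaining = *(int *)handle;
        ctp->handled = 0;
    }
    return ctp;
}

struct toy_context *toy_block(struct toy_context *ctp)
{
    if (ctp->remaining == 0)
        return NULL;          /* nothing more will arrive */
    ctp->remaining--;
    return ctp;
}

int toy_handler(struct toy_context *ctp)
{
    ctp->handled++;
    return 0;
}

void toy_free(struct toy_context *ctp)
{
    free(ctp);
}
```

Passing the callbacks through a structure, as the pool attributes do, is what lets the same loop run on top of the dispatch layer, the resmgr layer, or your own functions.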
From this point on, your resource manager is ready to handle messages. Since we gave the POOL_FLAG_EXIT_SELF flag to thread_pool_create(), once the threads have been started up, pthread_exit() will be called and this calling thread will exit.
The resource manager library defines several key structures for carrying data:
This picture may help explain their interrelationships:

Multiple clients with multiple OCBs, all linked to one mount structure.
The Open Control Block (OCB) maintains the state information about a particular session involving a client and a resource manager. It's created during open handling and exists until a close is performed.
This structure is used by the iofunc layer helper functions. (Later on, we'll show you how to extend this to include your own data).
The OCB structure contains at least the following:
typedef struct _iofunc_ocb {
    IOFUNC_ATTR_T   *attr;
    int32_t         ioflag;
    off_t           offset;
    uint16_t        sflag;
    uint16_t        flags;
} iofunc_ocb_t;
where the values represent:
The iofunc_attr_t structure defines the characteristics of the device that you're supplying the resource manager for. This is used in conjunction with the OCB structure.
The attribute structure contains at least the following:
typedef struct _iofunc_attr {
    IOFUNC_MOUNT_T              *mount;
    uint32_t                    flags;
    int32_t                     lock_tid;
    uint16_t                    lock_count;
    uint16_t                    count;
    uint16_t                    rcount;
    uint16_t                    wcount;
    uint16_t                    rlocks;
    uint16_t                    wlocks;
    struct _iofunc_mmap_list    *mmap_list;
    struct _iofunc_lock_list    *lock_list;
    void                        *list;
    uint32_t                    list_size;
    off_t                       nbytes;
    ino_t                       inode;
    uid_t                       uid;
    gid_t                       gid;
    time_t                      mtime;
    time_t                      atime;
    time_t                      ctime;
    mode_t                      mode;
    nlink_t                     nlink;
    dev_t                       rdev;
} iofunc_attr_t;
where the values represent:
Since your resource manager uses these flags, you can tell right away which fields of the attribute structure have been modified by the various iofunc-layer helper routines. That way, if you need to write the entries to some medium, you can write just those that have changed. The user-defined area for flags is IOFUNC_ATTR_PRIVATE (see <sys/iofunc.h>).
For details on updating your attribute structure, see the section on "Updating the time for reads and writes" below.
| This counter: | tracks the number of: |
|---|---|
| count | OCBs using this attribute in any manner. When this count goes to zero, it means that no one is using this attribute. |
| rcount | OCBs using this attribute for reading. |
| wcount | OCBs using this attribute for writing. |
| rlocks | read locks currently registered on the attribute. |
| wlocks | write locks currently registered on the attribute. |
These counts aren't exclusive. For example, if an OCB has specified that the resource is opened for reading and writing, then count, rcount, and wcount will all be incremented. (See the iofunc_attr_init(), iofunc_lock_default(), iofunc_lock(), iofunc_ocb_attach(), and iofunc_ocb_detach() functions.)
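As a rough illustration of how these counters move together, here's a minimal, portable sketch. It doesn't use the real iofunc layer: demo_attr_t, demo_attach(), demo_detach(), and the DEMO_READ/DEMO_WRITE bits are hypothetical stand-ins that model only the count, rcount, and wcount members.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical access bits; these are NOT the real ioflag values. */
#define DEMO_READ  0x1u
#define DEMO_WRITE 0x2u

/* Only the usage counters of the attribute structure are modeled. */
typedef struct {
    uint16_t count;   /* OCBs using the attribute in any manner */
    uint16_t rcount;  /* OCBs open for reading */
    uint16_t wcount;  /* OCBs open for writing */
} demo_attr_t;

/* Account for one new OCB opened with the given access bits. */
void demo_attach(demo_attr_t *attr, unsigned access)
{
    assert(attr != NULL);
    attr->count++;
    if (access & DEMO_READ)
        attr->rcount++;
    if (access & DEMO_WRITE)
        attr->wcount++;
}

/* Undo demo_attach() when the OCB goes away. */
void demo_detach(demo_attr_t *attr, unsigned access)
{
    assert(attr != NULL && attr->count > 0);
    attr->count--;
    if (access & DEMO_READ)
        attr->rcount--;
    if (access & DEMO_WRITE)
        attr->wcount--;
}
```

An OCB opened for both reading and writing bumps all three counters in this model, while a read-only OCB leaves wcount alone.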
One or more of the three time members may be invalidated as a result of calling an iofunc-layer function. This is to avoid having each and every I/O message handler go to the kernel and request the current time of day, just to fill in the attribute structure's time member(s).
POSIX states that these times must be valid when the fstat() is performed, but they don't have to reflect the actual time that the associated change occurred. Also, the times must change between fstat() invocations if the associated change occurred between fstat() invocations. If the associated change never occurred between fstat() invocations, then the time returned should be the same as returned last time. Furthermore, if the associated change occurred multiple times between fstat() invocations, then the time need only be different from the previously returned time.
There's a helper function that fills the members with the correct time; you may wish to call it in the appropriate handlers to keep the time up-to-date on the device -- see the iofunc_time_update() function.
The members of the mount structure, specifically the conf and flags members, modify the behavior of some of the iofunc layer functions. This optional structure contains at least the following:
typedef struct _iofunc_mount {
uint32_t flags;
uint32_t conf;
dev_t dev;
int32_t blocksize;
iofunc_funcs_t *funcs;
} iofunc_mount_t;
The variables are:
Note that the options mentioned above for the conf member are returned by the iofunc layer _IO_PATHCONF default handler.
struct _iofunc_funcs {
unsigned nfuncs;
IOFUNC_OCB_T *(*ocb_calloc) (resmgr_context_t *ctp,
IOFUNC_ATTR_T *attr);
void (*ocb_free) (IOFUNC_OCB_T *ocb);
};
where:
The io_read handler is responsible for returning data bytes to the client after receiving an _IO_READ message. Examples of functions that send this message are read(), readdir(), fread(), and fgetc(). Let's start by looking at the format of the message itself:
struct _io_read {
uint16_t type;
uint16_t combine_len;
int32_t nbytes;
uint32_t xtype;
};
typedef union {
struct _io_read i;
/* unsigned char data[nbytes]; */
/* nbytes is returned with MsgReply */
} io_read_t;
As with all resource manager messages, we've defined a union that contains the input (coming into the resource manager) structure and a reply or output (going back to the client) structure. The io_read() function is prototyped with an argument of io_read_t *msg -- that's the pointer to the union containing the message.
Since this is a read(), the type member has the value _IO_READ. The items of interest in the input structure are:
We'll create an io_read() function that will serve as our handler that actually returns some data (the fixed string "Hello, world\n"). We'll use the OCB to keep track of our position within the buffer that we're returning to the client.
When we get the _IO_READ message, the nbytes member tells us exactly how many bytes the client wants to read. Suppose that the client issues:
read (fd, buf, 4096);
In this case, it's a simple matter to return our entire "Hello, world\n" string in the output buffer and tell the client that we're returning 13 bytes, i.e. the size of the string.
However, consider the case where the client is performing the following:
while (read (fd, &character, 1) == 1) {
printf ("Got a character \"%c\"\n", character);
}
Granted, this isn't a terribly efficient way for the client to perform reads! In this case, we would get msg->i.nbytes set to 1 (the size of the buffer that the client wants to get). We can't simply return the entire string all at once to the client -- we have to hand it out one character at a time. This is where the OCB's offset member comes into play.
Here's a complete io_read() function that correctly handles these cases:
#include <errno.h>
#include <stdio.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/iofunc.h>
#include <sys/dispatch.h>
int io_read (resmgr_context_t *ctp, io_read_t *msg, RESMGR_OCB_T *ocb);
static char *buffer = "Hello, world\n";
static resmgr_connect_funcs_t connect_funcs;
static resmgr_io_funcs_t io_funcs;
static iofunc_attr_t attr;
int main(int argc, char **argv)
{
/* declare variables we'll be using */
resmgr_attr_t resmgr_attr;
dispatch_t *dpp;
dispatch_context_t *ctp;
int id;
/* initialize dispatch interface */
if((dpp = dispatch_create()) == NULL) {
fprintf(stderr, "%s: Unable to allocate dispatch handle.\n",
argv[0]);
return EXIT_FAILURE;
}
/* initialize resource manager attributes */
memset(&resmgr_attr, 0, sizeof resmgr_attr);
resmgr_attr.nparts_max = 1;
resmgr_attr.msg_max_size = 2048;
/* initialize functions for handling messages */
iofunc_func_init(_RESMGR_CONNECT_NFUNCS, &connect_funcs,
_RESMGR_IO_NFUNCS, &io_funcs);
io_funcs.read = io_read;
/* initialize attribute structure used by the device */
iofunc_attr_init(&attr, S_IFNAM | 0666, 0, 0);
attr.nbytes = strlen(buffer)+1;
/* attach our device name */
if((id = resmgr_attach(dpp, &resmgr_attr, "/dev/sample", _FTYPE_ANY, 0,
&connect_funcs, &io_funcs, &attr)) == -1) {
fprintf(stderr, "%s: Unable to attach name.\n", argv[0]);
return EXIT_FAILURE;
}
/* allocate a context structure */
ctp = dispatch_context_alloc(dpp);
/* start the resource manager message loop */
while(1) {
if((ctp = dispatch_block(ctp)) == NULL) {
fprintf(stderr, "block error\n");
return EXIT_FAILURE;
}
dispatch_handler(ctp);
}
}
int
io_read (resmgr_context_t *ctp, io_read_t *msg, RESMGR_OCB_T *ocb)
{
int nleft;
int nbytes;
int nparts;
int status;
if ((status = iofunc_read_verify (ctp, msg, ocb, NULL)) != EOK)
return (status);
if ((msg->i.xtype & _IO_XTYPE_MASK) != _IO_XTYPE_NONE)
return (ENOSYS);
/*
* On all reads (first and subsequent), calculate
* how many bytes we can return to the client,
* based upon the number of bytes available (nleft)
* and the client's buffer size
*/
nleft = ocb->attr->nbytes - ocb->offset;
nbytes = min (msg->i.nbytes, nleft);
if (nbytes > 0) {
/* set up the return data IOV */
SETIOV (ctp->iov, buffer + ocb->offset, nbytes);
/* set up the number of bytes (returned by client's read()) */
_IO_SET_READ_NBYTES (ctp, nbytes);
/*
* advance the offset by the number of bytes
* returned to the client.
*/
ocb->offset += nbytes;
nparts = 1;
} else {
/*
* they've asked for zero bytes or they've already previously
* read everything
*/
_IO_SET_READ_NBYTES (ctp, 0);
nparts = 0;
}
/* mark the access time as invalid (we just accessed it) */
if (msg->i.nbytes > 0)
ocb->attr->flags |= IOFUNC_ATTR_ATIME;
return (_RESMGR_NPARTS (nparts));
}
The ocb maintains our context for us by storing the offset field, which gives us the position within the buffer, and by having a pointer to the attribute structure attr, which tells us how big the buffer actually is via its nbytes member.
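To see the offset bookkeeping in isolation, here's a minimal sketch that runs without the resource manager library. The demo_ocb_t type and demo_read() function are hypothetical; they mimic only the nleft/nbytes arithmetic from the handler above and use memcpy() in place of the reply.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-open state: just the read position. */
typedef struct {
    size_t offset;
} demo_ocb_t;

/*
 * Copy up to `want` bytes of `data` (which holds `size` bytes) into `dst`,
 * starting at the position remembered in the OCB, then advance that
 * position. Returns the number of bytes copied; 0 means end-of-file.
 */
size_t demo_read(demo_ocb_t *ocb, const char *data, size_t size,
                 char *dst, size_t want)
{
    size_t nleft;
    size_t nbytes;

    assert(ocb != NULL && ocb->offset <= size);
    nleft = size - ocb->offset;              /* bytes not yet handed out */
    nbytes = want < nleft ? want : nleft;    /* never exceed either limit */
    if (nbytes > 0) {
        memcpy(dst, data + ocb->offset, nbytes);
        ocb->offset += nbytes;
    }
    return nbytes;
}
```

With a 13-byte string, a caller asking for one byte at a time gets 13 successful reads followed by a return of 0, and a caller with a large buffer gets all 13 bytes in a single call.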
Of course, we had to give the resource manager library the address of our io_read() handler function so that it knew to call it. So the code in main() where we had called iofunc_func_init() became:
/* initialize functions for handling messages */
iofunc_func_init(_RESMGR_CONNECT_NFUNCS, &connect_funcs,
_RESMGR_IO_NFUNCS, &io_funcs);
io_funcs.read = io_read;
We also needed to add the following to the area above main():
#include <errno.h>
#include <unistd.h>
int io_read (resmgr_context_t *ctp, io_read_t *msg, RESMGR_OCB_T *ocb);
static char *buffer = "Hello, world\n";
Where did the attribute structure's nbytes member get filled in? In main(), just after we did the iofunc_attr_init(). We modified main() slightly:
After this line:
iofunc_attr_init (&attr, S_IFNAM | 0666, 0, 0);
We added this one:
attr.nbytes = strlen (buffer)+1;
At this point, if you were to run the resource manager (our simple resource manager used the name /dev/sample), you could do:
# cat /dev/sample
Hello, world
The return line (_RESMGR_NPARTS(nparts)) tells the resource manager library to reply to the client for us, and to use nparts elements of an IOV array for that reply.
Where does it get the IOV array? It's using ctp->iov. That's why we first used the SETIOV() macro to make ctp->iov point to the data to reply with.
If we had no data, as would be the case of a read of zero bytes, then we'd do a return (_RESMGR_NPARTS(0)). But read() returns with the number of bytes successfully read. Where did we give it this information? That's what the _IO_SET_READ_NBYTES() macro was for. It takes the nbytes that we give it and stores it in the context structure (ctp). Then when we return to the library, the library takes this nbytes and passes it as the second parameter to the MsgReplyv(). The second parameter tells the kernel what the MsgSend() should return. And since the read() function is calling MsgSend(), that's where it finds out how many bytes were read.
We also update the access time for this device in the read handler. For details on updating the access time, see the section on "Updating the time for reads and writes" below.
You can add functionality to the resource manager you're writing in these fundamental ways:
The first two are almost identical, because the default functions really don't do that much by themselves -- they rely on the POSIX helper functions. The third approach has advantages and disadvantages.
Since the default functions (e.g. iofunc_open_default()) can be installed in the jump table directly, there's no reason you couldn't embed them within your own functions.
Here's an example of how you would do that with your own io_open() handler:
int main (int argc, char **argv)
{
...
/* install all of the default functions */
iofunc_func_init (_RESMGR_CONNECT_NFUNCS, &connect_funcs,
_RESMGR_IO_NFUNCS, &io_funcs);
/* take over the open function */
connect_funcs.open = io_open;
...
}
int
io_open (resmgr_context_t *ctp, io_open_t *msg,
RESMGR_HANDLE_T *handle, void *extra)
{
return (iofunc_open_default (ctp, msg, handle, extra));
}
Obviously, this is just an incremental step that lets you gain control in your io_open() when the message arrives from the client. You may wish to do something before or after the default function does its thing:
/* example of doing something before */
extern int accepting_opens_now;
int
io_open (resmgr_context_t *ctp, io_open_t *msg,
RESMGR_HANDLE_T *handle, void *extra)
{
if (!accepting_opens_now) {
return (EBUSY);
}
/*
* at this point, we're okay to let the open happen,
* so let the default function do the "work".
*/
return (iofunc_open_default (ctp, msg, handle, extra));
}
Or:
/* example of doing something after */
int
io_open (resmgr_context_t *ctp, io_open_t *msg,
RESMGR_HANDLE_T *handle, void *extra)
{
int sts;
/*
* have the default function do the checking
* and the work for us
*/
sts = iofunc_open_default (ctp, msg, handle, extra);
/*
* if the default function says it's okay to let the open
* happen, we want to log the request
*/
if (sts == EOK) {
log_open_request (ctp, msg);
}
return (sts);
}
It goes without saying that you can do something before and after the standard default POSIX handler.
The principal advantage of this approach is that you can add to the functionality of the standard default POSIX handlers with very little effort.
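The before-and-after pattern can be tried out without the resource manager library. The following is only a sketch: the one-slot jump table, the handler signature, and every demo_ name are hypothetical, and 0 plays the role of EOK.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical handler type and jump table; the real connect table has
 * more members and a different argument list. */
typedef int (*demo_open_fn)(const char *path);
typedef struct {
    demo_open_fn open;
} demo_connect_funcs_t;

int demo_accepting_opens = 1;   /* gate checked before the default runs */
int demo_open_log_count = 0;    /* bumped after each successful open */

/* Stand-in for a library-supplied default: succeeds for any path. */
int demo_open_default(const char *path)
{
    assert(path != NULL);
    return 0;
}

/* Our handler: check first, delegate the work, then log on success. */
int demo_open(const char *path)
{
    int sts;

    if (!demo_accepting_opens)
        return EBUSY;
    sts = demo_open_default(path);
    if (sts == 0)
        demo_open_log_count++;
    return sts;
}

/* Fill every slot (only one here) with its default handler. */
void demo_func_init(demo_connect_funcs_t *funcs)
{
    assert(funcs != NULL);
    funcs->open = demo_open_default;
}
```

As in the listings above, the caller first installs the defaults and then overwrites just the slot it wants to take over (here, funcs.open = demo_open).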
The default functions make use of helper functions -- these functions can't be placed directly into the connect or I/O jump tables, but they do perform the bulk of the work.
Here's the source for the two functions iofunc_chmod_default() and iofunc_stat_default():
int
iofunc_chmod_default (resmgr_context_t *ctp, io_chmod_t *msg,
iofunc_ocb_t *ocb)
{
return (iofunc_chmod (ctp, msg, ocb, ocb -> attr));
}
int
iofunc_stat_default (resmgr_context_t *ctp, io_stat_t *msg,
iofunc_ocb_t *ocb)
{
iofunc_time_update (ocb -> attr);
iofunc_stat (ocb -> attr, &msg -> o);
return (_RESMGR_PTR (ctp, &msg -> o,
sizeof (msg -> o)));
}
Notice how the iofunc_chmod() helper performs all the work for the iofunc_chmod_default() default handler. This is typical for the simple functions.
The more interesting case is the iofunc_stat_default() default handler, which calls two helper routines. First it calls iofunc_time_update() to ensure that all of the time fields (atime, ctime and mtime) are up to date. Then it calls iofunc_stat(), which builds the reply. Finally, the default function builds a pointer in the ctp structure and returns it.
The most complicated handling is done by the iofunc_open_default() handler:
int
iofunc_open_default (resmgr_context_t *ctp, io_open_t *msg,
iofunc_attr_t *attr, void *extra)
{
int status;
iofunc_attr_lock (attr);
if ((status = iofunc_open (ctp, msg, attr, 0, 0)) != EOK) {
iofunc_attr_unlock (attr);
return (status);
}
if ((status = iofunc_ocb_attach (ctp, msg, 0, attr, 0))
!= EOK) {
iofunc_attr_unlock (attr);
return (status);
}
iofunc_attr_unlock (attr);
return (EOK);
}
This handler calls four helper functions: iofunc_attr_lock(), iofunc_open(), iofunc_ocb_attach(), and iofunc_attr_unlock().
Sometimes a default function will be of no help for your particular resource manager. For example, the iofunc_read_default() and iofunc_write_default() functions implement /dev/null -- they do all the work of returning 0 bytes (EOF) or swallowing all the message bytes (respectively).
You'll want to do something in those handlers (unless your resource manager doesn't support the _IO_READ or _IO_WRITE messages).
Note that even in such cases, there are still helper functions you can use: iofunc_read_verify() and iofunc_write_verify().
The io_write handler is responsible for writing data bytes to the media after receiving a client's _IO_WRITE message. Examples of functions that send this message are write() and fflush(). Here's the message:
struct _io_write {
uint16_t type;
uint16_t combine_len;
int32_t nbytes;
uint32_t xtype;
/* unsigned char data[nbytes]; */
};
typedef union {
struct _io_write i;
/* nbytes is returned with MsgReply */
} io_write_t;
As with the io_read_t, we have a union of an input and an output message, with the output message being empty (the number of bytes actually written is returned by the resource manager library directly to the client's MsgSend()).
The data being written by the client almost always follows the header message stored in struct _io_write. The exception is if the write was done using pwrite() or pwrite64(). More on this when we discuss the xtype member.
To access the data, we recommend that you reread it into your own buffer. Let's say you had a buffer called inbuf that was "big enough" to hold all the data you expected to read from the client (if it isn't big enough, you'll have to read the data piecemeal).
The following is a code snippet that can be added to one of the simple resource manager examples. It prints out whatever it's given (making the assumption that it's given only character text):
int
io_write (resmgr_context_t *ctp, io_write_t *msg, RESMGR_OCB_T *ocb)
{
int status;
char *buf;
if ((status = iofunc_write_verify(ctp, msg, ocb, NULL)) != EOK)
return (status);
if ((msg->i.xtype & _IO_XTYPE_MASK) != _IO_XTYPE_NONE)
return(ENOSYS);
/* set up the number of bytes (returned by client's write()) */
_IO_SET_WRITE_NBYTES (ctp, msg->i.nbytes);
buf = (char *) malloc(msg->i.nbytes + 1);
if (buf == NULL)
return(ENOMEM);
/*
* Reread the data from the sender's message buffer.
* We're not assuming that all of the data fit into the
* resource manager library's receive buffer.
*/
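if (resmgr_msgread(ctp, buf, msg->i.nbytes, sizeof(msg->i)) == -1) {
status = errno; /* save errno before releasing the buffer */
free(buf);
return (status);
}
buf [msg->i.nbytes] = '\0'; /* just in case the text is not NUL-terminated */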
printf ("Received %d bytes = '%s'\n", msg -> i.nbytes, buf);
free(buf);
if (msg->i.nbytes > 0)
ocb->attr->flags |= IOFUNC_ATTR_MTIME | IOFUNC_ATTR_CTIME;
return (_RESMGR_NPARTS (0));
}
Of course, we'll have to give the resource manager library the address of our io_write handler so that it'll know to call it. In the code for main() where we called iofunc_func_init(), we'll add a line to register our io_write handler:
/* initialize functions for handling messages */
iofunc_func_init(_RESMGR_CONNECT_NFUNCS, &connect_funcs,
_RESMGR_IO_NFUNCS, &io_funcs);
io_funcs.write = io_write;
You may also need to add the following prototype:
int io_write (resmgr_context_t *ctp, io_write_t *msg,
RESMGR_OCB_T *ocb);
At this point, if you were to run the resource manager (our simple resource manager used the name /dev/sample), you could write to it by doing echo Hello > /dev/sample as follows:
# echo Hello > /dev/sample
Received 6 bytes = 'Hello'
Notice how we passed the last argument to resmgr_msgread() (the offset argument) as the size of the input message buffer. This effectively skips over the header and gets to the data component.
If the buffer you supplied wasn't big enough to contain the entire message from the client (e.g. you had a 4 KB buffer and the client wanted to write 1 megabyte), you'd have to read the buffer in stages, using a for loop, advancing the offset passed to resmgr_msgread() by the amount read each time.
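Here's a minimal sketch of that staged loop, written so that it runs without the resource manager library. The demo_msg_t type and demo_msgread() are hypothetical stand-ins for the sender's message buffer and for a read-at-offset call; demo_drain() shows the loop itself, with a byte checksum standing in for whatever you'd really do with each chunk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the sender's message: header plus payload,
 * all in one flat buffer of total_len bytes. */
typedef struct {
    const unsigned char *bytes;
    size_t total_len;
} demo_msg_t;

/* Stand-in for a "read from the sender's buffer at this offset" call.
 * Returns the number of bytes copied (possibly fewer than requested). */
size_t demo_msgread(const demo_msg_t *m, void *buf, size_t size,
                    size_t offset)
{
    size_t avail;

    assert(m != NULL && buf != NULL);
    if (offset >= m->total_len)
        return 0;
    avail = m->total_len - offset;
    if (size > avail)
        size = avail;
    memcpy(buf, m->bytes + offset, size);
    return size;
}

/* Drain nbytes of payload in stages of at most chunk_size bytes,
 * starting just past the header. Returns the number of stages used;
 * *sum receives a checksum of every payload byte visited. */
size_t demo_drain(const demo_msg_t *m, size_t hdr_len, size_t nbytes,
                  unsigned char *chunk, size_t chunk_size,
                  unsigned long *sum)
{
    size_t done;
    size_t got = 0;
    size_t i;
    size_t stages = 0;

    assert(chunk_size > 0 && sum != NULL);
    *sum = 0;
    for (done = 0; done < nbytes; done += got) {
        size_t want = nbytes - done;
        if (want > chunk_size)
            want = chunk_size;
        /* advance the offset by what has been consumed so far */
        got = demo_msgread(m, chunk, want, hdr_len + done);
        if (got == 0)
            break;                  /* sender's buffer ended early */
        for (i = 0; i < got; i++)
            *sum += chunk[i];
        stages++;
    }
    return stages;
}
```

The key point is the last argument to the read call: the header size plus the number of payload bytes already consumed.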
Unlike the io_read handler sample, this time we didn't do anything with ocb->offset. In this case there's no reason to. The ocb->offset would make more sense if we were managing things that had advancing positions such as a file position.
The reply is simpler than with the io_read handler, since a write() call doesn't expect any data back. Instead, it just wants to know if the write succeeded and if so, how many bytes were written. To tell it how many bytes were written we used the _IO_SET_WRITE_NBYTES() macro. It takes the nbytes that we give it and stores it in the context structure (ctp). Then when we return to the library, the library takes this nbytes and passes it as the second parameter to the MsgReplyv(). The second parameter tells the kernel what the MsgSend() should return. And since the write() function is calling MsgSend(), that's where it finds out how many bytes were written.
Since we're writing to the device, we should also update the modification time and, potentially, the change of file status time. For details on updating these times, see the section on "Updating the time for reads and writes" below.
You can return to the resource manager library from your handler functions in various ways. This is complicated by the fact that the resource manager library can reply for you if you want it to, but you must tell it to do so and put the information that it'll use in all the right places.
In this section, we'll discuss the following ways of returning to the resource manager library:
To reply to the client such that the function the client is calling (e.g. read()) will return with an error, you simply return with an appropriate errno value (from <errno.h>).
return (ENOMEM);
In the case of a read(), this causes the read to return -1 with errno set to ENOMEM.
Sometimes you'll want to reply with a header followed by one of N buffers, where the buffer used will differ each time you reply. To do this, you can set up an IOV array whose elements point to the header and to a buffer.
The context structure already has an IOV array. If you want the resource manager library to do your reply for you, then you must use this array. But the array must contain enough elements for your needs. To ensure that this is the case, you'd set the nparts_max member of the resmgr_attr_t structure that you passed to resmgr_attach() when you registered your name in the pathname space.
The following example assumes that the variable i contains the offset into the array of buffers of the desired buffer to reply with. The 2 in _RESMGR_NPARTS(2) tells the library how many elements in ctp->iov to reply with.
my_header_t header;
a_buffer_t buffers[N];
...
SETIOV(&ctp->iov[0], &header, sizeof(header));
SETIOV(&ctp->iov[1], &buffers[i], sizeof(buffers[i]));
return (_RESMGR_NPARTS(2));
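If you'd like to experiment with the header-plus-buffer layout outside a resource manager, the following sketch may help. It assumes the POSIX struct iovec as a stand-in for iov_t and fills it by hand instead of using SETIOV(); demo_header_t, demo_buffer_t, demo_build_reply(), and demo_gather() are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>   /* struct iovec (POSIX) */

/* Hypothetical reply pieces. */
typedef struct {
    unsigned short kind;
    unsigned short len;
} demo_header_t;

typedef struct {
    char text[8];
} demo_buffer_t;

/* Point iov[0] at the header and iov[1] at the chosen buffer.
 * Returns the number of parts, which is what a handler would report. */
int demo_build_reply(struct iovec iov[2], demo_header_t *hdr,
                     demo_buffer_t *buffers, size_t i)
{
    assert(iov != NULL && hdr != NULL && buffers != NULL);
    iov[0].iov_base = hdr;
    iov[0].iov_len = sizeof *hdr;
    iov[1].iov_base = &buffers[i];
    iov[1].iov_len = sizeof buffers[i];
    return 2;
}

/* What the replying side conceptually does: copy the parts back to
 * back into the destination. Returns the total number of bytes copied,
 * or 0 if the destination is too small. */
size_t demo_gather(const struct iovec *iov, int nparts,
                   void *dst, size_t dst_size)
{
    size_t total = 0;
    int p;

    for (p = 0; p < nparts; p++) {
        if (iov[p].iov_len > dst_size - total)
            return 0;
        memcpy((char *)dst + total, iov[p].iov_base, iov[p].iov_len);
        total += iov[p].iov_len;
    }
    return total;
}
```

The receiver sees one contiguous reply: the header immediately followed by whichever buffer was selected.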
An example of this would be replying to a read() where all the data existed in a single buffer. You'll typically see this done in two ways:
return (_RESMGR_PTR(ctp, buffer, nbytes));
And:
SETIOV (ctp->iov, buffer, nbytes);
return (_RESMGR_NPARTS(1));
The first method, using the _RESMGR_PTR() macro, is just a convenience for the second method where a single IOV is returned.
Returning success without any reply data can be done in a few ways. The simplest would be:
return (EOK);
But you'll often see:
return (_RESMGR_NPARTS(0));
Note that in neither case are you causing the MsgSend() to return with a 0. The value that the MsgSend() returns is the value passed to the _IO_SET_READ_NBYTES(), _IO_SET_WRITE_NBYTES(), and other similar macros. These two were used in the read and write samples above.
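The split between the handler's return value and the value the client's call returns can be modeled in a few lines. This is only a sketch of the idea, not the library's implementation: demo_ctx_t, demo_set_nbytes(), demo_handler(), and demo_client_call() are hypothetical, and 0 plays the role of EOK.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical context: holds the value the client's call will return. */
typedef struct {
    long status;
} demo_ctx_t;

/* Plays the role of the "set nbytes" macros. */
void demo_set_nbytes(demo_ctx_t *ctx, long nbytes)
{
    assert(ctx != NULL);
    ctx->status = nbytes;
}

/* A handler returns 0 for success or an errno value for failure;
 * the byte count travels separately, in the context. */
int demo_handler(demo_ctx_t *ctx, long requested, long available)
{
    if (requested < 0)
        return EINVAL;
    demo_set_nbytes(ctx, requested < available ? requested : available);
    return 0;
}

/* What the client ends up seeing: -1 with an error code on failure,
 * otherwise the count stored in the context (which may well be 0). */
long demo_client_call(long requested, long available, int *err)
{
    demo_ctx_t ctx = {0};
    int rc = demo_handler(&ctx, requested, available);

    if (rc != 0) {
        *err = rc;
        return -1;
    }
    return ctx.status;
}
```

In this model a successful handler return never becomes the client's return value; only the stored count does.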
In this case, you give the client the data and get the resource manager library to do the reply for you. However, the reply data won't be valid by that time. For example, if the reply data was in a buffer that you wanted to free before returning, you could use the following:
resmgr_msgwrite (ctp, buffer, nbytes, 0);
free (buffer);
return (EOK);
The resmgr_msgwrite() copies the contents of buffer into the client's reply buffer immediately. Note that a reply is still required in order to unblock the client so it can examine the data. Next we free the buffer. Finally, we return to the resource manager library such that it does a reply with zero-length data. Since the reply is of zero length, it doesn't overwrite the data already written into the client's reply buffer. When the client returns from its send call, the data is there waiting for it.
In all of the previous examples, it's the resource manager library that calls MsgReply*() or MsgError() to unblock the client. In some cases, you may not want the library to reply for you. For instance, you might have already done the reply yourself, or you'll reply later. In either case, you'd return as follows:
return (_RESMGR_NOREPLY);
An example of a resource manager that would reply to clients later is a pipe resource manager. If the client is doing a read of your pipe but you have no data for the client, then you have a choice:
Or:
Another example might be if the client wants you to write out to some device but doesn't want to get a reply until the data has been fully written out. Here is the sequence of events that might follow:
The first issue, though, is whether the client wants to be left blocked. If the client doesn't want to be left blocked, then it opens with the O_NONBLOCK flag:
fd = open("/dev/sample", O_RDWR | O_NONBLOCK);
The default is to allow you to block it.
One of the first things done in the read and write samples above was to call some POSIX verification functions: iofunc_read_verify() and iofunc_write_verify(). If we pass the address of an int as the last parameter, then on return the functions will set that int to a nonzero value if the client doesn't want to be blocked (O_NONBLOCK flag was set) or to zero if the client wants to be blocked.
int nonblock;
if ((status = iofunc_read_verify (ctp, msg, ocb,
&nonblock)) != EOK)
return (status);
...
int nonblock;
if ((status = iofunc_write_verify (ctp, msg, ocb,
&nonblock)) != EOK)
return (status);
When it then comes time to decide if we should reply with an error or reply later, we do:
if (nonblock) {
/* client doesn't want to be blocked */
return (EAGAIN);
} else {
/*
* The client is willing to be blocked.
* Save at least the ctp->rcvid so that you can
* reply to it later.
*/
...
return (_RESMGR_NOREPLY);
}
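One way to fill in the elided part is to park the receive ID in a small queue until data arrives. The sketch below is a portable model of that decision, not library code: demo_pending_t, demo_read_no_data(), demo_next_waiter(), and DEMO_NOREPLY are hypothetical, and the fixed-size queue is an assumption made to keep the example short.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_NOREPLY (-1)      /* hypothetical "don't reply for me" code */
#define DEMO_MAX_PENDING 4

/* Receive IDs of clients left blocked until data arrives. */
typedef struct {
    int rcvid[DEMO_MAX_PENDING];
    size_t n;
} demo_pending_t;

/*
 * Called when a read arrives and no data is available.
 * Non-blocking clients get EAGAIN at once; the others are parked.
 */
int demo_read_no_data(demo_pending_t *q, int rcvid, int nonblock)
{
    assert(q != NULL);
    if (nonblock)
        return EAGAIN;
    if (q->n == DEMO_MAX_PENDING)
        return ENOMEM;         /* no room to remember this client */
    q->rcvid[q->n++] = rcvid;
    return DEMO_NOREPLY;
}

/* When data shows up, hand back the oldest parked client, or -1 if
 * nobody is waiting. The caller would then reply to that ID. */
int demo_next_waiter(demo_pending_t *q)
{
    int id;
    size_t i;

    assert(q != NULL);
    if (q->n == 0)
        return -1;
    id = q->rcvid[0];
    for (i = 1; i < q->n; i++)
        q->rcvid[i - 1] = q->rcvid[i];
    q->n--;
    return id;
}
```

Serving waiters oldest-first is a design choice for this sketch; a real resource manager might order them by priority instead.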
The question remains: How do you do the reply yourself? The only detail to be aware of is that the rcvid to reply to is ctp->rcvid. If you're replying later, then you'd save ctp->rcvid and use the saved value in your reply.
MsgReply(saved_rcvid, 0, buffer, nbytes);
Or:
iov_t iov[2];
SETIOV(&iov[0], &header, sizeof(header));
SETIOV(&iov[1], &buffers[i], sizeof(buffers[i]));
MsgReplyv(saved_rcvid, 0, iov, 2);
Note that you can fill up the client's reply buffer as data becomes available by using resmgr_msgwrite() and resmgr_msgwritev(). Just remember to do the MsgReply*() at some time to unblock the client.
If you're replying to an _IO_READ or _IO_WRITE message, the status argument for MsgReply*() must be the number of bytes read or written.
The default action in most cases is for the library to cause the client's function to fail with ENOSYS:
return (_RESMGR_DEFAULT);
Topics in this section include:
The io_read, io_write, and io_openfd message structures contain a member called xtype. From struct _io_read:
struct _io_read {
...
uint32_t xtype;
...
};
Basically, the xtype contains extended type information that can be used to adjust the behavior of a standard I/O function. Most resource managers care about only a few values:
For example:
struct myread_offset {
struct _io_read read;
struct _xtype_offset offset;
};
Some resource managers can be sure that their clients will never call pread*() or pwrite*(). (For example, a resource manager that's controlling a robot arm probably wouldn't care.) In this case, you can treat this type of message as an error.
struct myreadcond {
struct _io_read read;
struct _xtype_readcond cond;
};
As with _IO_XTYPE_OFFSET, if your resource manager isn't prepared to handle readcond(), you can treat this type of message as an error.
The following code sample demonstrates how to handle the case where you're not expecting any extended types. In this case, if you get a message that contains an xtype, you should reply with ENOSYS. The example can be used in either an io_read or io_write handler.
int
io_read (resmgr_context_t *ctp, io_read_t *msg,
RESMGR_OCB_T *ocb)
{
int status;
if ((status = iofunc_read_verify(ctp, msg, ocb, NULL))
!= EOK) {
return (status);
}
/* No special xtypes */
if ((msg->i.xtype & _IO_XTYPE_MASK) != _IO_XTYPE_NONE)
return (ENOSYS);
...
}
Here are code examples that demonstrate how to handle an _IO_READ or _IO_WRITE message when a client calls:
The following sample code demonstrates how to handle _IO_READ for the case where the client calls one of the pread*() functions.
/* we are defining io_pread_t here to make the code below
simple */
typedef struct {
struct _io_read read;
struct _xtype_offset offset;
} io_pread_t;
int
io_read (resmgr_context_t *ctp, io_read_t *msg,
RESMGR_OCB_T *ocb)
{
off64_t offset; /* where to read from */
int status;
if ((status = iofunc_read_verify(ctp, msg, ocb, NULL))
!= EOK) {
return(status);
}
switch(msg->i.xtype & _IO_XTYPE_MASK) {
case _IO_XTYPE_NONE:
offset = ocb->offset;
break;
case _IO_XTYPE_OFFSET:
/*
* io_pread_t is defined above.
* Client is doing a one-shot read to this offset by
* calling one of the pread*() functions
*/
offset = ((io_pread_t *) msg)->offset.offset;
break;
default:
return(ENOSYS);
}
...
}
The following sample code demonstrates how to handle _IO_WRITE for the case where the client calls one of the pwrite*() functions. Keep in mind that the struct _xtype_offset information follows the struct _io_write in the sender's message buffer. This means that the data to be written follows the struct _xtype_offset information (instead of the normal case where it follows the struct _io_write). So, you must take this into account when doing the resmgr_msgread() call in order to get the data from the sender's message buffer.
/* we are defining io_pwrite_t here to make the code below
simple */
typedef struct {
struct _io_write write;
struct _xtype_offset offset;
} io_pwrite_t;
int
io_write (resmgr_context_t *ctp, io_write_t *msg,
RESMGR_OCB_T *ocb)
{
off64_t offset; /* where to write */
int status;
size_t skip; /* offset into msg to where the data
resides */
if ((status = iofunc_write_verify(ctp, msg, ocb, NULL))
!= EOK) {
return(status);
}
switch(msg->i.xtype & _IO_XTYPE_MASK) {
case _IO_XTYPE_NONE:
offset = ocb->offset;
skip = sizeof(io_write_t);
break;
case _IO_XTYPE_OFFSET:
/*
* io_pwrite_t is defined above
* client is doing a one-shot write to this offset by
* calling one of the pwrite*() functions
*/
offset = ((io_pwrite_t *) msg)->offset.offset;
skip = sizeof(io_pwrite_t);
break;
default:
return(ENOSYS);
}
...
/*
* get the data from the sender's message buffer,
* skipping all possible header information
*/
resmgr_msgreadv(ctp, iovs, niovs, skip);
...
}
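The skip calculation can be checked on its own with stand-in types. Everything named demo_ or DEMO_ below is hypothetical: the field widths and constant values are assumptions made so that the sketch compiles anywhere, and the real structures may be laid out differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants; not the real extended-type values. */
#define DEMO_XTYPE_NONE   0u
#define DEMO_XTYPE_OFFSET 1u
#define DEMO_XTYPE_MASK   0xffu

/* Hypothetical mirrors of the write header and the offset extension. */
struct demo_io_write {
    uint16_t type;
    uint16_t combine_len;
    int32_t nbytes;
    uint32_t xtype;
};

struct demo_xtype_offset {
    int64_t offset;
};

typedef struct {
    struct demo_io_write write;
    struct demo_xtype_offset offset;
} demo_pwrite_t;

/*
 * Decide where the payload begins (*skip) and which position applies
 * (*where). Returns 0 on success, -1 for an extended type that this
 * sketch doesn't handle.
 */
int demo_locate(const void *msg, int64_t ocb_offset,
                size_t *skip, int64_t *where)
{
    const struct demo_io_write *w = msg;

    assert(msg != NULL && skip != NULL && where != NULL);
    switch (w->xtype & DEMO_XTYPE_MASK) {
    case DEMO_XTYPE_NONE:
        /* ordinary write: data follows the plain header */
        *skip = sizeof(struct demo_io_write);
        *where = ocb_offset;
        return 0;
    case DEMO_XTYPE_OFFSET:
        /* positioned write: data follows header plus offset info */
        *skip = sizeof(demo_pwrite_t);
        *where = ((const demo_pwrite_t *)msg)->offset.offset;
        return 0;
    default:
        return -1;
    }
}
```

Note that the mask is applied before the comparison, so any bits outside the mask don't affect which case is chosen.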
The same type of operation that was done to handle the pread()/_IO_XTYPE_OFFSET case can be used for handling the client's readcond() call:
typedef struct {
struct _io_read read;
struct _xtype_readcond cond;
} io_readcond_t;
Then:
struct _xtype_readcond *cond;
...
case _IO_XTYPE_READCOND:
cond = &((io_readcond_t *)msg)->cond;
break;
}
Then your manager has to properly interpret and deal with the arguments to readcond(). For more information, see the Library Reference.
In the read sample above we did:
if (msg->i.nbytes > 0)
ocb->attr->flags |= IOFUNC_ATTR_ATIME;
According to POSIX, if the read succeeds and the reader had asked for more than zero bytes, then the access time must be marked for update. But POSIX doesn't say that it must be updated right away. If you're doing many reads, you may not want to read the time from the kernel for every read. In the code above, we mark the time only as needing to be updated. When the next _IO_STAT or _IO_CLOSE_OCB message is processed, the resource manager library will see that the time needs to be updated and will get it from the kernel then. This of course has the disadvantage that the time is not the time of the read.
Similarly for the write sample above, we did:
if (msg->i.nbytes > 0)
ocb->attr->flags |= IOFUNC_ATTR_MTIME | IOFUNC_ATTR_CTIME;
so the same thing will happen.
If you do want to have the times represent the read or write times, then after setting the flags you need only call the iofunc_time_update() helper function. So the read lines become:
if (msg->i.nbytes > 0) {
ocb->attr->flags |= IOFUNC_ATTR_ATIME;
iofunc_time_update(ocb->attr);
}
and the write lines become:
if (msg->i.nbytes > 0) {
ocb->attr->flags |= IOFUNC_ATTR_MTIME | IOFUNC_ATTR_CTIME;
iofunc_time_update(ocb->attr);
}
You should call iofunc_time_update() before you flush out any cached attributes. As a result of changing the time fields, the attribute structure will have the IOFUNC_ATTR_DIRTY_TIME bit set in the flags field, indicating that this field of the attribute must be updated when the attribute is flushed from the cache.
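The mark-now, update-later idea can be modeled in a few lines of portable C. This sketch isn't the iofunc implementation: the demo_ names and flag values are hypothetical, and clearing the stale bits when the times are stamped is an assumption about how such a helper might behave.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical flag bits; the real values live in the iofunc headers. */
#define DEMO_ATIME      0x1u
#define DEMO_MTIME      0x2u
#define DEMO_CTIME      0x4u
#define DEMO_DIRTY_TIME 0x8u

typedef struct {
    unsigned flags;
    time_t t_access;   /* plays the role of atime */
    time_t t_modify;   /* plays the role of mtime */
    time_t t_change;   /* plays the role of ctime */
} demo_attr_t;

/* Cheap path, run on every read: only note that the access time
 * is stale. No clock is consulted here. */
void demo_mark_read(demo_attr_t *a, long nbytes)
{
    assert(a != NULL);
    if (nbytes > 0)
        a->flags |= DEMO_ATIME;
}

/* Costlier path, run when the times are actually needed: stamp every
 * stale field with `now`, clear the stale bits, and flag the attribute
 * as needing to be written back. */
void demo_time_update(demo_attr_t *a, time_t now)
{
    unsigned stale;

    assert(a != NULL);
    stale = a->flags & (DEMO_ATIME | DEMO_MTIME | DEMO_CTIME);
    if (stale == 0)
        return;
    if (stale & DEMO_ATIME)
        a->t_access = now;
    if (stale & DEMO_MTIME)
        a->t_modify = now;
    if (stale & DEMO_CTIME)
        a->t_change = now;
    a->flags = (a->flags & ~stale) | DEMO_DIRTY_TIME;
}
```

Passing the time in as a parameter keeps the sketch deterministic; a real handler would obtain the current time itself.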
In this section:
In order to conserve network bandwidth and to provide support for atomic operations, combine messages are supported. A combine message is constructed by the client's C library and consists of a number of I/O and/or connect messages packaged together into one. Let's see how they're used.
Consider a case where two threads are executing the following code, trying to read from the same file descriptor:
a_thread ()
{
char buf [BUFSIZ];
lseek (fd, position, SEEK_SET);
read (fd, buf, BUFSIZ);
...
}
The first thread performs the lseek() and then gets preempted by the second thre