Networked message passing differences

So, once the connection is established, all further messaging flows using step 4 in the diagram above. This may lead you to the erroneous belief that message passing over a network is identical to message passing in the local case. Unfortunately, this is not true.

Here are the differences:

Longer delays

Since message passing is now being done over some medium, rather than a direct kernel-controlled memory-to-memory copy, you can expect that the amount of time taken to transfer messages will be significantly higher (100 Mbit/s Ethernet versus 100 MHz, 64-bit-wide DRAM is going to be an order of magnitude or two slower). On top of this, add protocol overhead (minimal) and retries on lossy networks.

Impact on ConnectAttach()

When you call ConnectAttach(), you're specifying an ND, a PID, and a CHID. All that happens in QNX Neutrino is that the kernel returns a connection ID referring to the Qnet “network handler” thread pictured in the diagram above. Since no message has been sent, you're not informed as to whether the node that you've just attached to is still alive. In normal use this isn't a problem, because most clients won't be doing their own ConnectAttach(); rather, they'll be using the services of the library call open(), which does the ConnectAttach() and then almost immediately sends out an “open” message. This has the effect of indicating almost immediately whether the remote node is alive or not.
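Here's a minimal sketch of that behavior in C. The node name /net/remotenode, the server PID (12345), and the CHID (1) are made up for illustration; the point is that ConnectAttach() succeeds immediately, and only the first MsgSend() tells you whether the remote node is actually reachable:

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/neutrino.h>
    #include <sys/netmgr.h>

    int main(void)
    {
        /* Hypothetical remote node; netmgr_strtond() turns the
           pathname into a node descriptor (ND). */
        int nd = netmgr_strtond("/net/remotenode", NULL);
        if (nd == -1) {
            perror("netmgr_strtond");
            return 1;
        }

        /* Succeeds right away; the kernel just hands back a connection
           ID referring to Qnet. No packet goes out, so a dead node is
           NOT detected here. The PID and CHID are made up. */
        int coid = ConnectAttach(nd, 12345, 1, _NTO_SIDE_CHANNEL, 0);
        if (coid == -1) {
            perror("ConnectAttach");
            return 1;
        }

        /* The first actual message is what flushes out a dead node
           (just as open()'s "open" message does). */
        char reply[16];
        if (MsgSend(coid, "hello", 6, reply, sizeof(reply)) == -1) {
            fprintf(stderr, "MsgSend failed: %s\n", strerror(errno));
        }

        ConnectDetach(coid);
        return 0;
    }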

Impact on MsgDeliverEvent()

When a server calls MsgDeliverEvent() locally, it's the kernel's responsibility to deliver the event to the target thread. With the network, the server still calls MsgDeliverEvent(), but the kernel delivers a “proxy” of that event to Qnet, and it's up to Qnet to deliver the proxy to the other (client-side) Qnet, which then delivers the actual event to the client. Things can get screwed up on the server side, because the MsgDeliverEvent() function call is non-blocking: by the time anything goes wrong on the network, MsgDeliverEvent() has already returned success and the server has moved on. It's too late to turn around and say, “I hate to tell you this, but you know that MsgDeliverEvent() that I said succeeded? Well, it didn't!”
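To make the fire-and-forget nature concrete, here's a sketch of the server side. The rcvid and the struct sigevent are assumed to have been squirreled away from an earlier “register me for notification” message from the client:

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/neutrino.h>
    #include <sys/siginfo.h>

    /* rcvid and event are assumed saved from the client's earlier
       registration message. */
    void notify_client(int rcvid, const struct sigevent *event)
    {
        /* Non-blocking. Over Qnet, a 0 return only means the proxy was
           handed to the local Qnet; if the network later drops it, the
           server never hears about it. */
        if (MsgDeliverEvent(rcvid, event) == -1) {
            fprintf(stderr, "MsgDeliverEvent: %s\n", strerror(errno));
        }
    }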

Impact on MsgReply(), MsgRead(), and MsgWrite()

To prevent the problem I just mentioned with MsgDeliverEvent() from happening with MsgReply(), MsgRead(), and MsgWrite(), these functions were transformed into blocking calls when used over the network. Locally they'd simply transfer the data and unblock immediately. On the network, we have to (in the case of MsgReply()) ensure that the data has been delivered to the client or (in the case of the other two) to actually transfer the data to or from the client over the network.
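As a sketch of the blocking behavior, here's a server replying in two pieces; the rcvid is assumed to come from an earlier MsgReceive(), and the header/body split is just for illustration. Over Qnet, neither call returns until its data has actually crossed the network:

    #include <errno.h>
    #include <stddef.h>
    #include <sys/neutrino.h>

    /* Reply to a client in two pieces: MsgWrite() fills in the body at
       an offset in the client's reply buffer, then MsgReply() supplies
       the header and unblocks the client. */
    void reply_in_pieces(int rcvid, const void *hdr, size_t hdrlen,
                         const void *body, size_t bodylen)
    {
        if (MsgWrite(rcvid, body, bodylen, hdrlen) == -1) {
            MsgError(rcvid, errno);   /* unblock the client with an error */
            return;
        }
        MsgReply(rcvid, EOK, hdr, hdrlen);
    }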

Impact on MsgReceive()

Finally, MsgReceive() is affected as well in the networked case. When the server's MsgReceive() unblocks, Qnet may not yet have transferred all of the client's data over the network; Qnet transfers only an initial portion up front, for performance reasons.

There are two members in the struct _msg_info that's passed as the last parameter to MsgReceive() (we've seen this structure in detail in “Who sent the message?” above):

msglen
Indicates how much data was actually transferred by the MsgReceive() (Qnet likes to transfer 8 KB).
srcmsglen
Indicates how much data the client wanted to transfer (determined by the client).

So, if the client wanted to transfer 1 megabyte of data over the network, the server's MsgReceive() would unblock and msglen would be set to 8192 (indicating that 8192 bytes were available in the buffer), while srcmsglen would be set to 1048576 (indicating that the client tried to send 1 megabyte).

The server then uses MsgRead() to get the rest of the data from the client's address space.
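Putting the last two points together, here's a sketch of a server pulling in a large networked message. The chid is assumed to come from an earlier ChannelCreate() (on current QNX versions, with the _NTO_CHF_SENDER_LEN flag set so that srcmsglen is filled in), and the 1 MB buffer matches the example above:

    #include <errno.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/neutrino.h>

    #define CLIENT_MAX (1024 * 1024)   /* room for the 1 MB example */

    void receive_large_message(int chid)
    {
        static char buf[CLIENT_MAX];
        struct _msg_info info;

        int rcvid = MsgReceive(chid, buf, sizeof(buf), &info);
        if (rcvid == -1) {
            perror("MsgReceive");
            return;
        }

        /* Networked case: msglen may be less than srcmsglen (e.g. 8192
           of 1048576). MsgRead() blocks until each chunk has actually
           crossed the network. */
        size_t got = info.msglen;
        while (got < (size_t)info.srcmsglen && got < sizeof(buf)) {
            ssize_t n = MsgRead(rcvid, buf + got, sizeof(buf) - got, got);
            if (n == -1) {
                perror("MsgRead");
                break;
            }
            got += n;
        }

        MsgReply(rcvid, EOK, NULL, 0);   /* also blocking over the network */
    }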