loadbalance

Qnet decides which links to use for sending packets, depending on current load and link speeds as determined by io-pkt*. A packet is queued on the link that can deliver the packet the soonest to the remote end. This effectively provides greater bandwidth between nodes when the links are up (the bandwidth is the sum of the bandwidths of all available links) and allows a graceful degradation of service when links fail.

If a link does fail, Qnet switches to the next available link. By default, this switch takes a few seconds the first time, because the network driver on the bad link will have timed out, retried, and finally died. But once Qnet "knows" that a link is down, it will not send user data over that link. (This is a significant improvement over the QNX 4 implementation.)

The time required to switch to another link can be set to whatever is appropriate for your application using command-line options of Qnet. See lsm-qnet.so documentation.

Using these options, you can create a redundant behavior by minimizing the latency that occurs when switching to another interface in case one of the interfaces fail.

While load-balancing among the live links, Qnet sends periodic maintenance packets on the failed link in order to detect recovery. When the link recovers, Qnet places it back into the pool of available links.

Note: The loadbalance QoS policy is the default.