Simple restart

The most basic form of recovery is the simple death-restart mechanism: when a monitored process dies, it's restarted. The QNX OS provides virtually all non-kernel functionality via user-installable programs and offers complete memory protection, not only for user applications but also for OS components (device drivers, filesystems, etc.). As a result, a resource manager or other server program can be easily decoupled from the OS.

This decoupling lets you safely stop, start, and upgrade resource managers or other key programs dynamically, without compromising the availability of the rest of the system.

Consider the following code, where we ask the HAM to restart the mqueue daemon whenever it dies:

/* simple_restart.c */

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/netmgr.h>
#include <fcntl.h>
#include <ha/ham.h>

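int main(int argc, char *argv[])
{
    const char *mqueuepath;
    ham_entity_t *ehdl;
    ham_condition_t *chdl;
    ham_action_t *ahdl;
    int mqueuepid;
    int rc = EXIT_SUCCESS;

    /* Optional argv[1]: full path (with arguments) of the program to monitor. */
    mqueuepath = (argc > 1) ? argv[1] : "/sbin/mqueue";

    /* Optional argv[2]: pid of a running instance; -1 lets the HAM start it. */
    mqueuepid = (argc > 2) ? atoi(argv[2]) : -1;

    /* Connect to the HAM before making any other HAM calls. */
    if (ham_connect(0) != 0) {
        fprintf(stderr, "connect to HAM failed\n");
        return EXIT_FAILURE;
    }

    /* Add mqueue as an entity monitored by the HAM. */
    ehdl = ham_attach("mqueue", 0, mqueuepid, mqueuepath, 0);
    if (ehdl != NULL) {
        /* Condition "death": triggered when the entity dies. */
        chdl = ham_condition(ehdl, CONDDEATH, "death", HREARMAFTERRESTART);
        if (chdl != NULL) {
            /* Action "restart": rerun the same command line. */
            ahdl = ham_action_restart(chdl, "restart", mqueuepath,
                                      HREARMAFTERRESTART);
            if (ahdl == NULL) {
                fprintf(stderr, "add action failed\n");
                rc = EXIT_FAILURE;
            }
        } else {
            fprintf(stderr, "add condition failed\n");
            rc = EXIT_FAILURE;
        }
    } else {
        fprintf(stderr, "add entity failed\n");
        rc = EXIT_FAILURE;
    }

    ham_disconnect(0);
    return rc;
}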

The above example attaches the mqueue process to the HAM as an entity, then adds a condition named death to that entity and an action named restart under that condition.

When mqueue terminates, the HAM automatically restarts it by running the program specified by mqueuepath. If mqueue is already running on the system, we can pass the pid of the existing process in mqueuepid, and the HAM attaches to it directly. Otherwise (mqueuepid is -1), the HAM starts mqueue itself and begins to monitor it.
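
The example converts the pid argument with atoi(), which returns 0 for non-numeric input, and that value would then be passed on as a pid. A stricter conversion could look like the following minimal sketch. The helper name parse_pid_arg() is hypothetical (it isn't part of the HAM API), and falling back to -1 for bad input is an assumption about what you'd want, based on -1 meaning "let the HAM start the program":

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not part of the HAM API): convert a command-line
 * string to a pid suitable for ham_attach().
 * Returns the pid on success, or -1 if the string is missing, isn't a
 * plain decimal number, or isn't a positive value that fits in an int. */
int parse_pid_arg(const char *arg)
{
    char *end;
    long value;

    if (arg == NULL || *arg == '\0')
        return -1;

    errno = 0;
    value = strtol(arg, &end, 10);

    /* Reject range errors, trailing junk, and non-positive values. */
    if (errno != 0 || *end != '\0' || value <= 0 || value > INT_MAX)
        return -1;

    return (int)value;
}
```

In simple_restart.c, you could then write mqueuepid = parse_pid_arg(argv[2]) in place of the atoi() call. Whether bad input should fall back to -1 or cause the program to exit with an error is a design choice; this sketch picks the fallback.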

You could use the same code to monitor another program, such as slogger2 (by specifying /usr/sbin/slogger2). Just remember to specify the full path of the executable, along with all its required command-line parameters.
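
Because the path and its parameters are passed as a single string, it can help to assemble that string in one place and to reject a relative path early. The following is a minimal sketch under stated assumptions: build_command_line() is a hypothetical helper (not part of the HAM API), it joins arguments with single spaces, and it doesn't quote arguments that themselves contain spaces:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the HAM API): join an absolute
 * executable path and its arguments into one space-separated command
 * line in buf.
 * Returns 0 on success, -1 if path isn't absolute or buf is too small.
 * Arguments containing spaces aren't quoted or escaped. */
int build_command_line(char *buf, size_t size, const char *path,
                       const char *const args[], size_t nargs)
{
    size_t used;
    size_t i;
    int n;

    if (buf == NULL || size == 0 || path == NULL || path[0] != '/')
        return -1;

    n = snprintf(buf, size, "%s", path);
    if (n < 0 || (size_t)n >= size)
        return -1;
    used = (size_t)n;

    /* Append each argument, preceded by a single space. */
    for (i = 0; i < nargs; i++) {
        n = snprintf(buf + used, size - used, " %s", args[i]);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

The resulting buffer could be passed where simple_restart.c passes mqueuepath, that is, to both ham_attach() and ham_action_restart(), so that the restarted program gets the same parameters as the original.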