Using the High Availability Framework

The High Availability Framework provides components not only for detecting when processes terminate, but also for recovering from such terminations.

The main component is a process called the High Availability Manager (ham) that acts as a "smart watchdog". Your processes talk to ham using the HAM API. With this API, you set up the conditions that ham should watch for and the actions it should take when those conditions occur. For example, you can tell ham to detect when a process terminates and to restart it automatically. The HAM can even detect the termination of daemon processes.

In fact, the High Availability Manager can restart a number of processes, wait between restarts for a process to be ready, and notify the process that this is happening.
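To make the restart behavior concrete, here is a minimal sketch of a supervisor that restarts a terminated process and waits between restarts. It uses plain POSIX fork() and waitpid() rather than the HAM API, so it illustrates only the idea, not how ham is implemented; unlike ham, it can watch only its own children, not daemons. The names task_fn, crash_twice, and run_with_restarts are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical task type: returns 0 on a clean exit, nonzero on failure. */
typedef int (*task_fn)(int attempt);

/* Demo task (assumption for illustration): fails on its first two runs. */
static int crash_twice(int attempt)
{
    return attempt < 2 ? 1 : 0;
}

/*
 * Run task in a child process and restart it whenever it terminates
 * abnormally, pausing delay_ms milliseconds between restarts.
 * Returns the number of restarts needed before a clean exit, or -1 if
 * the restart limit was reached or a system call failed.
 */
static int run_with_restarts(task_fn task, int max_restarts, unsigned delay_ms)
{
    assert(task != NULL && max_restarts >= 0);

    for (int restarts = 0; ; restarts++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(task(restarts));      /* child: run the task and exit */

        /* Parent: block until the child terminates, retrying on EINTR. */
        int status = 0;
        while (waitpid(pid, &status, 0) < 0) {
            if (errno != EINTR)
                return -1;
        }

        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            return restarts;            /* clean exit: nothing to recover */
        if (restarts == max_restarts)
            return -1;                  /* give up after too many failures */

        /* Wait before the next restart. */
        struct timespec pause = {
            .tv_sec = (time_t)(delay_ms / 1000),
            .tv_nsec = (long)(delay_ms % 1000) * 1000000L
        };
        nanosleep(&pause, NULL);
    }
}
```

For example, run_with_restarts(crash_twice, 5, 100) restarts the demo task twice, pausing 100 ms each time, and then returns 2. With ham, the equivalent setup is registered once through the HAM API and the manager performs the restarts on your behalf.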

The HAM also does heartbeating: processes can periodically notify ham that they're still functioning correctly. If a process-specified amount of time goes by without a notification, ham can take some action.
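The heartbeat check itself can be sketched as a small piece of bookkeeping. The following is an illustration only and doesn't use the HAM API: the hb_monitor type and the hb_init, hb_beat, and hb_check functions are hypothetical, times are caller-supplied millisecond counts, and treating a gap exactly equal to the interval as still on time is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one heartbeating client. */
typedef struct {
    uint64_t interval_ms;   /* longest allowed gap between heartbeats */
    uint64_t last_beat_ms;  /* time of the most recent heartbeat */
    unsigned missed;        /* consecutive checks that found the client late */
} hb_monitor;

/* Start monitoring; the client is considered alive at now_ms. */
static void hb_init(hb_monitor *m, uint64_t interval_ms, uint64_t now_ms)
{
    assert(m != NULL && interval_ms > 0);
    m->interval_ms = interval_ms;
    m->last_beat_ms = now_ms;
    m->missed = 0;
}

/* Called when the client reports that it is still functioning. */
static void hb_beat(hb_monitor *m, uint64_t now_ms)
{
    m->last_beat_ms = now_ms;
    m->missed = 0;
}

/*
 * Called periodically by the watchdog. Returns true if more than
 * interval_ms has passed since the last heartbeat, in which case the
 * caller would take its recovery action.
 */
static bool hb_check(hb_monitor *m, uint64_t now_ms)
{
    if (now_ms - m->last_beat_ms > m->interval_ms) {
        m->missed++;
        return true;
    }
    return false;
}
```

With a 1000 ms interval started at time 0, hb_check() returns false at 1000 ms and true at 1001 ms; a later call to hb_beat() resets the missed count. Keeping a count of consecutive misses lets the caller escalate, for example by notifying on the first miss and restarting on a later one.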

These are just a few examples of what's possible with the High Availability Framework. For more information, see the High Availability Framework Developer's Guide.