Watchdogs

QNX hypervisors provide virtual watchdog devices that you can use in a VM just as you would use a hardware watchdog on a board in a non-hypervisor system.

DANGER:

It is your responsibility to determine how best to use hardware and software watchdogs for the hypervisor host as well as for guests. This is of particular importance in safety-related systems.

If you need advice about how to implement watchdogs, please contact your QNX representative.

Watchdogs in the hypervisor host

The QNX Hypervisor provides the same support for a watchdog as the QNX OS microkernel, of which it is a superset. If a hardware watchdog exists, you can use a board-specific utility to kick it, and use the High Availability Manager (HAM) or the System Launch and Monitor (SLM) service to manage the host in the event that a watchdog detects an anomaly in the host's behavior (see ham and slm in the QNX OS Utilities Reference).

To learn more about your board-specific watchdog kicker utility, see your BSP User's Guide.

Watchdogs in a guest

The watchdog vdevs implemented with QNX hypervisors emulate a hardware watchdog in a VM. A watchdog kicker utility running in the guest enables its virtual hardware watchdog, then writes at specific intervals to guest-physical registers monitored by the watchdog vdev to inform the watchdog that the guest is running (it kicks the watchdog).

If the watchdog kicker fails to write to the registers within the required delay, the watchdog vdev can trigger an appropriate action, such as forcing the qvm process instance to exit. This triggering of a follow-up action is known as the watchdog bite.

Since the qvm process is a QNX OS process, you can trap its exit codes just as you do for any OS process. If you're implementing a watchdog service for your guests, you should configure your hypervisor host to trap the qvm process exit codes so you can decide what you want the host to do in the event that a qvm process instance terminates unexpectedly (see qvm process exit codes in the Monitoring and Troubleshooting chapter). These actions can range from simply logging the error and waiting for user intervention, to using the HAM to attempt to restart the qvm process and its guest OS.
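The following is a minimal sketch of the host-side logic described above: start a process, trap its exit code, log it, and decide whether to restart. It is illustrative only and is not a replacement for the HAM or SLM. It assumes a Python interpreter is available on the host. The function name supervise, its parameters, and the default restart policy (restart on any non-zero exit code) are hypothetical; consult the qvm process exit codes documentation to decide which codes should actually trigger a restart.

```python
import subprocess


def supervise(command, max_restarts=3,
              should_restart=lambda code: code != 0,
              log=print):
    """Run `command`, trap its exit code, and restart it while the
    (hypothetical) policy `should_restart` says so.

    Returns the list of exit codes observed, in order.
    """
    exit_codes = []
    for attempt in range(max_restarts + 1):
        result = subprocess.run(command)
        exit_codes.append(result.returncode)
        log(f"attempt {attempt}: process exited with code {result.returncode}")
        if not should_restart(result.returncode):
            break  # Policy says this exit does not warrant a restart.
    return exit_codes


# Hypothetical usage on the host; replace the placeholder list with the
# command line you normally use to start your qvm process instance:
#   supervise(["<your qvm command>", "<its arguments>"], max_restarts=1)
```

In a production system, the restart decision and any logging would typically be delegated to the HAM or SLM, as noted above.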

For information about how to cause the guest to dump in response to a watchdog bite, see Getting a guest dump in the Monitoring and Troubleshooting chapter.

CAUTION:
When a watchdog vdev terminates a qvm process instance in an orderly manner, this termination necessarily also stops the guest from executing. From the guest's perspective, this is not an orderly termination.

Implementing a watchdog in a guest

To implement a watchdog service in a QNX guest running in a QNX hypervisor VM, you need to:

  • Include the appropriate watchdog vdev (wdt-sp805 for ARM or wdt-ib700 for x86) in the configuration for the VM that will host the guest (see vdev wdt-sp805 and vdev wdt-ib700 in the Virtual Device Reference chapter).
  • Implement a watchdog kicker in the guest, and configure it to kick the watchdog vdev. For QNX guests, the kicker is provided in the BSP; see QNX guest watchdog kicker (wdtkick) below.

    For Linux and Android guests, see Implementing a watchdog in a Linux or Android guest below.

  • Use the HAM and/or SLM utilities in the host to manage the qvm process in the event that a watchdog detects an anomaly in a guest's behavior and its hosting qvm process instance has exited. For example, the SLM could restart the qvm process instance with the same VM configuration it had before it exited (see ham and slm in the QNX OS Utilities Reference).
Note:

The watchdog vdevs emulate a subset of the functions provided by hardware watchdogs. Refer to the chip documentation for information about your hardware watchdog (e.g., the SP805 documentation at the ARM Documentation Center).

Starting and stopping watchdogs and their kickers

QNX hypervisor BSPs include scripts for starting and stopping watchdogs in your QNX guests. These are located in the BSP's scripts directory:

watchdog-start.sh
Enables the watchdog vdev in the VM (qvm process instance), and starts the kicker utility (wdtkick) in daemon mode in the guest.
watchdog-stop.sh
Stops the watchdog kicker in the guest, then immediately writes to the guest-physical register the value needed to stop the watchdog vdev in the VM before it notices that the kicker has stopped.

To start or stop your watchdog vdev and the kicker in the QNX guest, run either watchdog-start.sh or watchdog-stop.sh, as appropriate.

QNX guest watchdog kicker (wdtkick)

The wdtkick watchdog kicker is board-specific for QNX guests (see wdtkick in the QNX OS Utilities Reference). It is shipped in the BSPs for supported boards. For QNX hypervisor systems, it is included in the BSPs for QNX guests at src/hardware/support/wdtkick.

The buildfiles for the QNX guests made available with the hypervisor include commented-out sections with configurations for wdtkick. To use this utility, you can uncomment these sections and rebuild your guest.

Below are examples of wdtkick configurations for ARM and for x86 guests.

wdtkick in an ARM guest (SP805 emulation)

Kick the watchdog every second, and set the timeout period to three seconds. The effective timeout is six seconds, because the timer on ARM platforms counts down twice and asserts the reset only on the second timer expiry:
wdtkick -v -a 0x1C0F0000 -t 1000 -E 8:3 -W 0:0x47868C0

Where:

  • -v sets the verbosity
  • -a 0x1C0F0000 sets the base address the watchdog will use in guest-physical memory
  • -t 1000 sets the watchdog kick interval to one second (1000 milliseconds)
  • -E 8:3 sets the offset of the watchdog register to write to in order to enable the timer (8), and the mask to use when writing (3)
  • -W 0:0x47868C0 sets the offset at which to write in the watchdog register (0), and the value to write there (0x47868C0, which specifies 3 seconds at 25 MHz)
Note:
Refer to the SP805 specifications and your board manufacturer's documentation for more information about how to configure your watchdog and watchdog kicker.
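The arithmetic behind the -W value in the ARM example can be sketched as follows. The load value is the timeout multiplied by the watchdog clock frequency, and the effective time to reset is twice the timeout because the reset is asserted only on the second expiry. The function names are hypothetical, and the 25 MHz clock is taken from the example above; check your own platform's watchdog clock before reusing these numbers.

```python
def sp805_load_value(timeout_s, clock_hz):
    """Ticks to load into the SP805 counter for one countdown period."""
    return round(timeout_s * clock_hz)


def sp805_effective_timeout_s(timeout_s):
    """Time until reset: the counter must expire twice before the bite."""
    return 2 * timeout_s


def sp805_wdtkick_command(base_addr, kick_ms, timeout_s, clock_hz):
    """Assemble a wdtkick command line using only the options shown in
    the ARM example above (-v, -a, -t, -E 8:3, -W 0:<load value>)."""
    load = sp805_load_value(timeout_s, clock_hz)
    return (f"wdtkick -v -a 0x{base_addr:X} -t {kick_ms} "
            f"-E 8:3 -W 0:0x{load:X}")


# Reproduce the example: 3 s timeout at 25 MHz, kicked every 1000 ms.
print(sp805_wdtkick_command(0x1C0F0000, 1000, 3, 25_000_000))
# prints: wdtkick -v -a 0x1C0F0000 -t 1000 -E 8:3 -W 0:0x47868C0
```

Here 3 s x 25,000,000 Hz = 75,000,000 ticks, which is 0x47868C0 in hexadecimal, and the guest has six seconds in total before the watchdog bites.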

wdtkick in an x86 guest (IB700 emulation)

Kick the watchdog every five (5) seconds, and set the timeout period to 10 seconds:
wdtkick -v -a 0x441 -t 5000 -w 8 -W 2:0xA

Where:

  • -v sets the verbosity
  • -a 0x441 sets the base address the watchdog will use in guest-physical memory
  • -t 5000 sets the watchdog kick interval to five seconds (5000 milliseconds)
  • -w 8 sets the width of the watchdog write register to eight (8) bits
  • -W 2:0xA sets the offset at which to write in the watchdog register (2), and the value to write there (0xA, which specifies 10 seconds)
Note:
Refer to the IB700 specifications and your board manufacturer's documentation for more information about how to configure your watchdog and watchdog kicker.

Watchdog kicker configuration

You can enter the watchdog kicker configuration via the command line at startup, or you can modify your guest startup to store it in the guest system page's hwinfo section (see the System Page chapter in Building Embedded Systems).

For up-to-date information about your board-specific watchdog kicker utility, see the wdtkick.use file included with the BSP.

For information about the watchdog vdevs, see vdev wdt-sp805 (ARM) and vdev wdt-ib700 (x86) in the Virtual Device Reference chapter.

Implementing a watchdog in a Linux or Android guest

If you want to use a watchdog in a Linux or Android guest, you must do the following:

  1. Enable the watchdog module – the Linux or Android kernel must include the correct watchdog kernel module for your target (IB700 or SP805).

    In most Linux and Android distributions, this module isn't enabled by default. See menuconfig, and the Linux kernel configuration documentation's Device Drivers section for details on how to rebuild a Linux kernel with the watchdog module enabled.

  2. Implement a Linux or Android application or shell script to control the watchdog. Documentation on how to control a Linux watchdog is publicly available on the internet. A useful example can be found at www.kernel.org/doc/Documentation/watchdog/watchdog-api.txt.
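As an illustration of step 2, here is a minimal sketch of a kicker for a Linux guest, based on the behavior described in the Linux watchdog API document referenced above: opening /dev/watchdog activates the watchdog, any write to the device kicks it, and writing the character V just before closing asks the driver to disable the watchdog (the "magic close" feature, which works only if the driver supports it and the kernel was not built or loaded with the nowayout option). The function name kick_watchdog and its parameters are hypothetical, and the sketch assumes a Python interpreter is available in the guest.

```python
import time


def kick_watchdog(device="/dev/watchdog", interval_s=5.0, kicks=None,
                  sleep=time.sleep):
    """Kick the watchdog every `interval_s` seconds.

    Runs until `kicks` kicks have been sent, or until interrupted with
    Ctrl+C if `kicks` is None. Returns the number of kicks sent.
    """
    sent = 0
    # Unbuffered binary mode so each write reaches the driver at once.
    with open(device, "wb", buffering=0) as wdt:
        try:
            while kicks is None or sent < kicks:
                wdt.write(b"\0")   # Any write counts as a kick.
                sent += 1
                sleep(interval_s)
        except KeyboardInterrupt:
            pass                   # Treat Ctrl+C as an orderly stop.
        wdt.write(b"V")            # Magic close: request a disable.
    return sent


# Hypothetical usage in the guest (requires access to the device):
#   kick_watchdog(interval_s=5.0)
```

Choose a kick interval comfortably shorter than the timeout configured for the watchdog. If the script dies without the magic close, the watchdog keeps running and eventually bites, which is the intended behavior when the guest misbehaves.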