Guest IPIs

Reducing the frequency of guest-issued IPIs can improve the performance of the guest and of the overall system.

The cost of an inter-processor interrupt (IPI) between physical CPUs typically takes less than a microsecond when initiated by an operating system running directly on the hardware. While this overhead isn't extravagant, any excessive use of IPIs may affect system performance.

Just like any OS running running directly on hardware with multiple physical CPUs, a guest OS running in a VM with multiple vCPUs may need to issue IPIs. From the perspective of the OS issuing the IPI, the IPI behavior is exactly the same regardless of whether the OS is running directly on hardware or as a guest in a VM.

However, the time overhead cost of an IPI issued by a guest OS in a VM is an order of magnitude greater that the cost of an IPI issued by an OS running directly on hardware. This cost is similar to the cost of a guest exit–entrance cycle, which typically can take 10 microseconds, sometimes longer.

Since a guest OS runs in a VM and its CPUs are in fact vCPUs (threads in the hosting qvm process instance), when a guest issues an IPI, the IPI source is a vCPU thread, and each IPI target is another vCPU thread.

The relatively high cost of guest-issued IPIs is due to the work required by the hypervisor to prepare and deliver these IPIs; that is, to the work that must be done by software rather than by hardware to deliver the IPI from its source vCPU thread to its target vCPU thread(s).

Note: Just as an IPI issued by a system running directly on hardware can target multiple physical CPUs, an IPI issued by a guest in a VM can target multiple vCPUs.

Preparing the IPI for delivery

The hypervisor tasks described below prepare a guest-issued IPI for delivery. They are the same regardless of the board architecture or the state of the target vCPU thread.

When the guest OS issues an IPI, the hypervisor must:

Trap the request in the source vCPU thread.
Write the hosting VM's interrupt controller registers (virtual ICRs) with the information that will be needed to deliver the guest-issued IPI to the target vCPU or vCPUs.
Compile this information into a list of target vCPUs and interrupt numbers.
For each target vCPU, inject the relevant interrupt number. That is, for each target vCPU thread prepare the data and logic that will cause the vCPU to see that it has an interrupt pending when it resumes.

Delivering the IPI

From this point forward, the work required to deliver a guest-issued IPI to a vCPU is the same as for delivering any interrupt to a vCPU, regardless of the interrupt's source. This work differs according to the state of the target vCPU thread, and some boards support posted interrupts, which can reduce overhead.

Target vCPU thread isn't executing

If the target vCPU isn't executing guest code (the guest is stopped on an HLT or WFI), then the host will see the target vCPU thread as stopped on a semaphore. In this case, assuming that the source vCPU thread prepared the IPI for delivery:

Normal QNX Neutrino OS scheduling policies apply and the target vCPU thread will begin executing again when its turn comes.
When the vCPU thread starts executing, it will pick up the guest-issued IPI from its VM's virtual ICRs.

Target vCPU thread is executing

If the target vCPU thread is executing:

The hypervisor must trigger an IPI on the physical CPU where the guest code is executing; if posted interrupts aren't supported by the hardware or aren't enabled, the interrupt on the physical CPU forces a guest exit and immediate re-entrance.
When the guest re-enters, the target vCPU thread will pick up the pending guest-issued IPI from its VM's virtual ICRs.

However, note the following:

If the target vCPU thread is executing vdev code, no forced guest exit is needed. Virtual devices are part of the hypervisor; when execution passes from the virtual device back to the guest, the target vCPU will pick up the pending guest-issued IPI from its VM's virtual ICRs.
If the hardware supports posted interrupts (currently some x86 boards) and this capability is enabled, then no guest exit is required to deliver a guest-issued IPI to its target vCPU(s). The hypervisor's IPI on the physical CPU triggers delivery of the guest-issued IPI, and the guest learns about the interrupt without having to exit.

Reducing the frequency of guest-issued IPIs

The costliest tasks in the preparation and delivery of a guest-issued IPI to its target vCPU(s) are:

Compiling the list of target vCPUs. There is no mechanism for circumventing this task.
The guest exit–entrance cycle. This cost can be reduced by enabling posted interrupts, if the hardware supports this capability.

Given the high cost of guest-issued IPIs, even on boards with posted interrupt support, reducing the frequency of guest-issued IPIs can improve both guest and overall system performance. This reduction can often be achieved by managing which CPUs (in fact, vCPUs) a guest application runs on; in this case, by binding the relevant processes in the guest to a single CPU:

If the threads of a multi-threaded process in the guest frequently interact, bind the process to a single vCPU (in fact, single qvm process vCPU thread).
If several processes are communicating frequently, bind them to the same vCPU.

With these configurations, the applications will behave as though they are running on single-CPU system and won't issue IPIs that require hypervisor intervention to deliver.

You can bind guest processes to a vCPU just like you bind a process in a non-virtualized system to a physical CPU. From the perspective of the guest, the binding applies to a physical CPU. For a Linux guest, use the taskset command (see your Linux documentation). For a QNX Neutrino OS guest, use the on command with the -C option; for example:

on -C 1 foo

binds the program foo to CPU 1 (from the guest's perspective), which is in fact a qvm process vCPU thread (see the on utility in the QNX SDP Utilities Reference).

In some cases, if you are running a single-threaded application, it may even prove advantageous to run that application in its own guest OS running in a VM on its own dedicated physical CPU.

Note: The set option's exit-on-halt argument allows you control if a WFI (ARM) or HLT (x86) instruction causes a guest exit (see set in the “VM Configuration Reference” chapter).