Configuring VM components

When you specify options to configure a qvm process, you are assembling and configuring the components of a virtual machine (VM) in which a guest will run.

Note: For descriptions of vdev and pass-through device configurations, see Configuring vdevs and Configuring pass-through below.

cmdline

The cmdline option is used to pass a string (commandline) to a guest as though the string had been entered from the command line. The string can be passed to a Linux kernel or to multiboot (ELF) images used for QNX guests.

The cmdline option makes command-line input available to the guest in various ways, depending on the guest image format and the guest architecture. How the input is interpreted is the responsibility of the guest.

Syntax:

cmdline commandline

Example:

The following connects a terminal console and sets up log and debug parameters for a Linux guest running on a virtualized ARM platform:

cmdline "console=ttyAMA0 earlycon=pl011,0x1c090000 \
debug user_debug=31 loglevel=9"

cpu

The cpu option creates a new vCPU in the VM. Every vCPU is a thread, so a thread runmask can be used to restrict the vCPU to a specific pCPU (or several pCPUs). Similarly, standard thread scheduling priorities and algorithms can be applied to the vCPU. Note that vCPU threads are threads in the hypervisor host domain.

If no cpu option is specified, the qvm process instance creates a single vCPU.

Syntax:

cpu [options]*

Options:

partition name
If adaptive partitioning (APS) is implemented in the hypervisor host domain, the vCPU will run in the host domain's APS partition specified by name. If the partition option isn't specified, the vCPU thread will run in the partition where the qvm process was started.
runmask cpu_number{,cpu_number}
Allow the vCPU to run only on the specified physical CPU or CPUs. CPU numbering is zero-based. Default is no restrictions (floating).
Assigning runmasks to vCPUs implies some important design choices. If a vCPU is allowed to float (no runmask is set, or the runmask includes more than one CPU), the vCPU may migrate between pCPUs.
Migration is useful on some systems, because with migration a vCPU will move to a core that is free (if the current runmask permits). However, for some realtime guests, assigning a vCPU to one core that isn't shared with other vCPUs can improve realtime determinism.
QNX recommends that you first use floating vCPUs in your design, then move to restrict or prohibit migration (i.e., core pinning) as required.
sched priority[r | f | o]
sched high_priority,low_priority,max_replacements,replacement_period,initial_budget s
Set the vCPU's scheduling priority and scheduling algorithm. The scheduling algorithm can be round-robin (r), FIFO (f), or sporadic (s). The o (other) scheduling algorithm is reserved for future use; currently it is equivalent to r.
The default vCPU configuration uses round-robin scheduling. Our testing has indicated that most guests respond best to this scheduling algorithm, because it allows a guest that has its own internal scheduling policies to operate efficiently.
Note: You should consult with QNX engineering support before using scheduling such as FIFO or sporadic.

Configuring sporadic scheduling

For sporadic scheduling, you need to specify the following five parameters:

high_priority
The priority at which the vCPU runs while it has execution budget available.
low_priority
The priority to which the vCPU drops when its execution budget is exhausted.
max_replacements
The maximum number of pending budget replenishments.
replacement_period
The period after which a consumed portion of the execution budget is replenished.
initial_budget
The execution budget available to the vCPU in each replenishment period.
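
The following is a minimal sketch of a sporadic configuration, following the syntax line above (including the trailing s, which selects the sporadic algorithm). The five values are hypothetical and must be tuned for your system; as noted above, consult QNX engineering support before using sporadic scheduling:

cpu sched 10,8,5,100000,50000 s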

Maximum vCPUs per guest

The maximum number of vCPUs that may be defined for each guest running in a hypervisor VM is limited by a number of factors:

Hardware
On supported AArch64 (ARMv8) and x86-64 platforms, the hardware currently allows a maximum of 254 vCPUs on the board. This number may change with newer hardware.
Specific hardware components may also limit the number of vCPUs per guest. For example, on x86-64 boards, the LAPIC virtualization limits a guest to a maximum of 15 vCPUs. Similarly, on AArch64 boards, the maximum number of vCPUs per guest is limited by the GIC version that the GIC vdev implements: with the GIC vdev running in GICv2 mode, the maximum number of vCPUs that can be assigned to a guest is eight (8).
QNX recommends that you don't give a guest more vCPUs than there are physical CPUs on the underlying hardware platform.
Guest OS
Currently, QNX Neutrino OS 6.6 and 7.0 support a maximum of 32 CPUs. This limit also applies to vCPUs, since a guest OS makes no distinction between a CPU and a vCPU.
Check the latest documentation for your guest OSs (QNX Neutrino and Linux) for more information about the maximum number of CPUs they support.

Example 1: pin vCPU, set scheduling priority

The following creates a vCPU that is permitted to run only on physical CPU 3 (numbering is zero-based):

cpu runmask 3 sched 8r

Priority is 8. The scheduling algorithm is round-robin.

Example 2: floating vCPUs, set scheduling priority

The following creates four vCPUs (0, 1, 2, 3), all with priority 10:

cpu sched 10
cpu sched 10
cpu sched 10
cpu sched 10

The runmask option isn't specified, so the default (floating) is used.

Since no affinity has been specified for any of the vCPU threads, the hypervisor microkernel scheduler can run each vCPU thread on whatever available pCPU it deems most appropriate.

Example 3: two vCPUs pinned to pCPUs, default scheduling

The following creates four vCPUs (0, 1, 2, 3):

cpu runmask 2,3    # vCPU 0 may run only on pCPU 2 or 3.
cpu runmask 2,3    # vCPU 1 may run only on pCPU 2 or 3.
cpu                # vCPU 2 may run on any pCPU.
cpu                # vCPU 3 may run on any pCPU.

vCPUs 0 and 1 have their runmask options set to pin them to pCPUs 2 and 3; they may run on these two pCPUs only, and won't migrate to pCPU 0 or 1, even if those pCPUs are idle. No runmask option is specified for vCPUs 2 and 3, so they use the default (floating) and can run on any available pCPU (including pCPUs 2 and 3).
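
Example 4: vCPU in an APS partition

The following is a sketch that creates a vCPU running at priority 10 in the host domain's APS partition named guests (the partition name is hypothetical; use a partition that exists in your host domain; see the partition option above):

cpu partition guests sched 10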

For information about how priorities for hypervisor threads and guest threads are handled, see Priority-based scheduling in the QNX Virtual Environments chapter.

For more information about advanced configuration, see the QNX Hypervisor Public Forum: log in to Foundry 27 with your myQNX account, and go to: community.qnx.com/sf/sfmain/do/viewProject/projects.qnx_hypervisor_public_forum.

For more information about affinity and scheduling in QNX systems, see the QNX Neutrino OS documentation.

debug

Output debugging information about the guest vCPUs and vdevs.

Information output by debug goes to whatever destination is specified by the logger option's info filter (see logger below). If no output destination is specified for the logger info filter, debug output goes to slogger2.

After the qvm process instance has started, send a SIGUSR1 signal to it. The qvm process instance will output debug information, then immediately continue.

In most cases, you would specify the debug component in your command-line startup instructions when you need it, rather than specifying it in a configuration file.

Note: You can use the debug option to see the guest-host time drift (see Time in the QNX Virtual Environments chapter).

Syntax:

debug

Options:

None
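
Example:

Assuming the qvm process instance hosting the guest has process ID 12345 (a hypothetical PID; use pidin or a similar host utility to find the actual one), the following host command triggers the debug output:

kill -s USR1 12345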

dump

On receipt of a SIGUSR2 signal, write a dump file for the guest hosted in this qvm process instance. If the argument doesn't start with a pipe character (|), the dump file is written to the directory specified by directory. If the argument starts with a pipe, the characters that follow it are spawned as a shell command.

Syntax:

dump directory

Options:

directory
The output directory.

or:

|shell_command
If the first character of the argument is a pipe, the characters that follow it are spawned as a shell command, and the contents of the dump file are written to that command's standard input.

The dump file

The dump file is called prefix-YYYYMMDD-hhmmss.gcore, where prefix is the name of the VM (see the system option below), YYYYMMDD is the date (year, month, day), and hhmmss is the time (hours, minutes, seconds).

If a file with the name that the qvm process composes for the dump file already exists in the specified directory, the qvm process attempts to generate a unique name by inserting a hyphen and a number (from 1 to 9: -1, -2, ... -9) before the .gcore extension. For example, if the files qnx7-20170821-120619.gcore and qnx7-20170821-120619-1.gcore exist, the qvm process creates a file called qnx7-20170821-120619-2.gcore. If the qvm process is unable to create a unique file name, it doesn't create a dump file.

The dump file is output in ELF64 format. A PT_NOTE segment has the register states for the vCPUs, and PT_LOAD segments have the guest memory contents. The format of the PT_NOTE segment is described by the sys/kdump.h public header file.

To interpret the dump file, you can use GDB and kdserver (see kdserver in the Utilities and Drivers Reference chapter).

Example:

On receipt of a SIGUSR2 signal, the following executes the shell command gzip >dump_output.gz: the contents of the guest's dump file are written to gzip's standard input, gzip compresses them, and the shell redirection writes the compressed output to the file dump_output.gz:

dump "|gzip >dump_output.gz"
Note: The quotation marks around the argument ensure that when qvm parses the configuration it reads the full shell command as a single qvm argument.
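
To trigger the dump, send the qvm process instance a SIGUSR2 signal from the host, for example (the PID is hypothetical):

kill -s USR2 12345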

gdb

Enable guest debugging through GDB (see Using a GNU debugger (GDB) in the Monitoring, Troubleshooting, and Tuning chapter).

Syntax:

gdb [paused] port

Options:

paused
If this option is specified, the guest OS is paused at its first instruction. It is allowed to resume only when GDB has connected and the hosting qvm process receives a single-step or continue command.
port
The port number for the GDB connection.
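
Example:

The following sketch enables guest debugging, pausing the guest at its first instruction until a GDB instance connects on port 1234 (a hypothetical port number; choose one appropriate to your system):

gdb paused 1234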

generate

See suppress | generate below.

load

Copy the contents of file into the guest system address space.

Syntax:

[blob_type] load [address,]file

Options:

address
The location in the guest where file is to be loaded. This address is the guest-physical address (the address as seen by the guest, not the host).
If address isn't specified, and the qvm process recognizes the file content type (blob_type), the qvm process loads the file content to the location indicated by this type. If the qvm process can't identify the type of content, it loads the contents of file to the first available location it finds.
The only requirement for the address is that it be in the guest's allocated guest-physical memory (see Memory in the QNX Virtual Environments chapter).
blob_type
The type of content (see below). If blob_type isn't specified, the qvm process attempts to identify the type of file content and load it appropriately (e.g., if it sees an ELF or Linux image format, it will load the data at the locations indicated by the file contents and configure the guest bootstrap CPU to begin execution at the entry point specified by the file).
If the first file being loaded doesn't have a recognized file format (again, with blob_type not specified), the qvm process configures the guest bootstrap CPU to begin execution at the first byte of the file.
If blob_type precedes a pass option, the qvm process passes the data of the specified type to the guest (see Configuring pass-through in this chapter).
file
The path and name of the file to load.

The qvm process recognizes the following file content types (blob_type) specified before a load option:

Type Notes
acpi Available only for x86. The file contains ACPI table information; the qvm process will add this information to the ACPI information that it generates automatically.
data The file isn't considered to be a bootable image; the qvm process won't configure the bootstrap guest CPU to begin execution at the entry point indicated by the file contents. If the qvm process recognizes the file format, it will perform normal load processing (e.g., it will load an ELF file at the locations specified by the program headers).
fdt Available only for ARM. The file contains a flattened device tree (FDT) binary blob; the qvm process will add its automatically generated information to this FDT blob, write it all into the guest memory, and pass the location of this information to the guest OS during its boot up. For information about FDTs, see www.devicetree.org
guest The qvm process will load the contents of file into guest memory. These contents will be interpreted as being the guest OS boot image.
initrd The qvm process makes the location and size of the file contents in the guest available to the Linux OS during its boot up, as the initial RAM disk file system.
raw The qvm process copies the file byte for byte into the guest, even if its format would normally be handled by the qvm process.

Example:

The following loads a QNX IFS into the address space for the guest that will run in the VM defined by the current qvm configuration file:

load /vm/images/qnx7.ifs

Since only the file path and name are specified, the qvm process creating the VM will examine the blob and place it in the location specified in the ELF headers.
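
The following sketch loads an initial RAM disk for a Linux guest at an explicit guest-physical address (both the address and the file path are hypothetical; the address must lie within the guest's allocated RAM):

initrd load 0x48000000,/vm/images/initrd.img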

logger

Filter messages to output, and specify the location where these messages are sent. You may specify from zero (0) to many logger options for a VM.

Syntax:

logger filters output_dest

Options:

filters
Output information for the specified message type or types. Specify one or more types, separated by commas (see below).
output_dest
The output destination for the messages specified by the filters. Specify only one (1) output per logger instance. Supported output destinations are stdout, stderr, and slog (slogger2).

Log messages are filtered by the severity of the condition that caused the message to be emitted. Supported message types are:

Filter Severity
fatal A fatal error; the qvm process can't continue.
internal An internal error in the qvm process has been detected; the process terminates immediately.
error An error that may or may not cause the qvm process to stop, but which indicates a problem that requires immediate attention.
warn The qvm process is able to continue, but has encountered a problem that should be addressed.
info Output information requested by the user (e.g., in response to a SIGUSR1 signal).
debug Provide information useful to QNX when debugging user-reported issues.
verbose Provide users with detailed information useful for debugging their system.
Note:

Filters are not like verbosity levels, where each level outputs increasingly trivial messages. Filters are combined: you must specify the filter for every type of message you want to output. For example, specifying logger error stderr outputs only errors; it doesn't output both errors and fatal errors. To get both, you must specify logger error,fatal stderr.

If no logger is specified in the qvm configuration file, the following logger configuration is assumed, ensuring that at a minimum a record of errors and fatal errors is sent to stderr:

logger error,fatal stderr

Examples:

The following are examples of how the logger option might be used. Note that in all the examples, only one output destination is specified per line:

logger warn,error,fatal stderr
logger info stdout
logger error,fatal,info,warn slog
logger internal,debug slog

pass

See Configuring pass-through in this chapter.

ram

The ram option allocates RAM to the VM, at the guest-physical address specified by start_address, for the number of bytes specified by length.

Syntax:

ram [start_address,]length

If you don't specify a start address, the qvm process allocates the RAM immediately following the highest RAM/ROM address already allocated in the system.

When you specify the length argument, you can use either hexadecimal notation (e.g., 0xa0000 for 640 KB), or human-readable abbreviations (e.g., 16M for 16 MB).

Note:

You must allocate RAM before components that refer to the guest memory (see Exceptions above).

RAM that appears to the guest as contiguous is unlikely to be contiguous in physical memory (see Memory in the QNX Virtual Environments chapter).

Example:

The following allocates 128 MB of RAM to the VM, starting at address 0x80000000:

ram 0x80000000,128M

The start address for the memory allocation is the address inside the VM: the guest physical address. If this RAM location is to be used for the guest's bootable image, the guest must be configured to look for the image at this address inside its memory. For example, assuming the example above is for a QNX guest, you must specify this location in the guest's buildfile:

[image=0x80000000]
[virtual=aarch64le,elf] .bootstrap = {
   [+keeplinked] startup-armv8_fm -v -H
   [+keeplinked] PATH=/proc/boot procnto -v
}

Allocating memory on x86 boards

Due to the long history of the x86 platform, it is likely that a guest OS will expect the 128 KB of VGA memory to be mapped at 0xa0000. Though the hypervisor doesn't offer emulation of a VGA device, you should reserve this region in guest-physical memory so that the qvm process assembling the VM can't allocate it (see reserve below).

Similarly, the region between 0xc0000 and 0xfffff has traditionally been known as the BIOS area. You should configure this area, but you should use the rom option to remove it from the memory map passed on to the guest.

Thus, to properly virtualize x86 hardware, instead of specifying memory allocations like this:

ram 1024M

you must reserve the location for the VGA, specify the ROM for the BIOS, and specify the RAM memory, so that your memory configuration for an x86 guest looks something like this:

ram 0,0xa0000
reserve loc 0xa0000,0x20000
rom 0xc0000,0x40000
ram 1M,1023M

where:

ram 0,0xa0000
Allocates the first 640 KB of conventional memory as RAM.
reserve loc 0xa0000,0x20000
Reserves the 128 KB VGA region so that the qvm process can't allocate it.
rom 0xc0000,0x40000
Allocates the 256 KB BIOS area as ROM, and removes it from the RAM map passed on to the guest.
ram 1M,1023M
Allocates the remaining RAM, starting at 1 MB.

For cases where the guest doesn't expect to find a component (e.g., legacy device, BIOS) at a specific location, you may specify only the size of the memory allocation and let the qvm process decide on the location. However, if the guest expects memory to be available for a specific purpose at a specific location, you must respect these requirements, just as you would for an OS running directly on an x86 board.

See Configuring memory in the QNX Virtual Environments chapter.

Mapping PCI pass-through devices on x86

By default, PCI devices on x86 boards are mapped from 2G to 4G in the guest-physical memory. Therefore, for guests that require large amounts of memory and that use PCI, avoid allocating this memory range. For example, the following configuration allocates 4.5G of memory for the guest; note that it avoids the 2G to 4G range:

ram 0,0xa0000
rom 0xc0000,0x40000
ram 1M,1500M       # Get 1.5G
ram 4000M,3000M    # Get 3G

reserve

The reserve option reserves locations in guest-physical memory and/or guest interrupts to prevent the qvm process from allocating them to any component or device.

If a guest OS expects that specific locations will hold some predetermined artifact (such as a table), or that specific interrupts will be used for some predetermined purposes, you should use the reserve option to ensure that when the qvm process assembles the VM for the guest it doesn't allocate these locations and interrupts, and leaves them for you to allocate as your guest requires.

Any loc or intr option that follows a reserve option specifies, respectively, a reserved location in guest-physical memory, or a guest interrupt number.

Syntax:

reserve options

Options:

intr guest_intr
Prevent the qvm process from allocating a guest interrupt with the number specified by guest_intr.
loc location_spec [,length]
Prevent the qvm process from allocating the location specified by location_spec.

If the reserved location is a PCI location (pci:), you must use the BDF specification form:

pci:bus_number:device_number[.function_number]

and not specify length.

If the reserved location is a memory location (mem:), you must specify length.

For more information about locations, see Guest resource types.

Note: The reserve option prevents the qvm process from dynamically allocating the locations and interrupts specified, but it doesn't prevent the qvm process from allocating the locations and interrupts if these are explicitly specified in the VM configuration.
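
Example:

The following sketch reserves a guest interrupt and a region of guest-physical memory (the interrupt number, address, and length are hypothetical; use the values your guest actually requires):

reserve intr 42
reserve loc 0xe0000000,0x10000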

rom

The rom option allocates ROM to the VM, at the guest-physical address specified by start_address, for the number of bytes specified by length.

This option is the same as the ram option, except that the guest isn't allowed to write to this memory. Typically, you would use the load option to place some initialized data into the region specified by rom before the guest starts.

If you don't specify a start address, the qvm process allocates the ROM immediately following the highest RAM/ROM address already allocated in the system.

When you specify the length argument, you can use either hexadecimal notation (e.g., 0xa0000 for 640 KB), or human-readable abbreviations (e.g., 1M for 1 MB).

Note:

You must allocate ROM before components that refer to the guest memory (see Exceptions above).

ROM that appears to the guest as contiguous is unlikely to be contiguous in physical memory (see Memory in the QNX Virtual Environments chapter).

Example:

The following allocates 256 KB of ROM to the VM, starting at address 0xc0000:

rom 0xc0000,0x40000

The start address for the memory allocation is the address inside the VM: the guest physical address (intermediate physical address).

suppress | generate

Explicitly request the suppression or generation of the specified system information table type in the guest.

By default, the QNX Hypervisor always generates an FDT when it runs on an ARM platform, and ACPI information when it runs on an x86 platform. It doesn't currently support ACPI on ARM architectures or FDTs on x86 architectures, so the generate option is for future use only.

Syntax:

[system_table_type] suppress|generate

The system_table_type argument specifies the type of system information table to suppress or generate; this can be either fdt (ARM) or acpi (x86).

Examples:

The following suppresses the FDT system information table in an ARM guest:

fdt suppress

The following suppresses the ACPI system information table in an x86 guest:

acpi suppress

system

The system option specifies the name (system_name) of the VM being configured by the present qvm configuration file or command line startup. If specified, it must be the first entry (other than comments) in a qvm configuration file or command line string.

Syntax:

system system_name

Example:

The following specifies that the name of the VM created with the current qvm configuration file is “qnx7a”:

system qnx7a

If you use the virtio-net vdev's peer option, the full name of the end of the peer link is formed from the value of system_name concatenated with the value specified for the virtio-net vdev's name option (see vdev virtio-net).

tolerance

To support guest system virtualized timer hardware (e.g., the HPET vdev on x86), the hypervisor uses QNX host system timers to generate notifications when a guest timer interrupt needs to be delivered.

These QNX timer notifications aren't delivered to the hypervisor exactly at the requested point of timer expiry – there may be a delay between timer expiry and notification delivery. Since the guest's interrupts will be delivered later than it expects, this delay may cause inaccuracies in the virtualization of the guest.

The tolerance option sets the maximum allowed amount of time, specified in microseconds, that the host system is allowed to delay before notifying the hypervisor of a timer expiry. If the tolerance option isn't specified, the tolerance defaults to 10 percent of the host system clock period (which defaults to 1ms, making the default tolerance 100us).

Syntax:

tolerance microseconds

Option:

microseconds
The emulated timer device tolerance, in microseconds. Default is 10 percent of the timer tick period.
CAUTION:
Setting a lower value for the tolerance will improve guest interrupt delivery latency, but it may also increase system load and thus decrease overall performance.
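
Example:

The following sketch tightens the timer tolerance to 50 microseconds (a hypothetical value; weigh it against the latency/load trade-off described in the caution above):

tolerance 50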

unsupported

Specify what the hypervisor should do if it can't successfully complete the emulation of a guest instruction.

Syntax:

unsupported instruction|reference|register abort|fail|ignore
Note: Only one type-action pair can be used per unsupported option instance. Specify the option for each unsupported type you want to configure.

Types:

instruction
The hypervisor doesn't support emulation of the instruction. If unspecified, defaults to fail.
reference
The instruction references a memory (or IO port on x86) location to which the guest doesn't have access. If unspecified, defaults to ignore.
register
The guest is attempting to use a system register (ARM) or MSR (x86) that the hypervisor doesn't support. If unspecified, defaults to fail.

Actions:

abort
Display a message indicating the failure and guest state, then terminate the qvm process instance hosting the offending guest.
fail
Deliver an appropriate CPU exception to the offending guest.
ignore
Treat the instruction as a no-op, except in the case of guest attempts to read from an unsupported memory or IO port location, which sets the destination of the instruction to all one bits (~0).

If the hypervisor encounters a condition of a type without a specified action, the hypervisor uses its default action for that type.

Example:

The following configuration instructs the hypervisor to abort on an unsupported register access, and to deliver a CPU exception (fail) on an unsupported memory or IO port reference:

unsupported register abort
unsupported reference fail

The instruction type hasn't been specified, so the hypervisor will use its default configuration, which for this type is fail.

user

Assign the specified user name, or user ID and group IDs, to the qvm process instance.

Syntax:

user username[,gid]* | uid[:gid[,gid]*]

If you use the username[,gid]* form, the primary group is the one specified for username in /etc/passwd.

Options:

username
The user name to assign to the qvm process instance.
uid
The user ID to assign to the qvm process instance.
gid
One or more group IDs (separated by commas) to assign to the qvm process instance.

The user is a user in the hypervisor host, not in the guest that the VM being configured will host.
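
Example:

The following sketch runs the qvm process instance as user ID 521 with primary group 521 and supplementary group 1001 (hypothetical IDs; use IDs that exist in your host system):

user 521:521,1001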

vdev

See Configuring vdevs in this chapter.