This version of this document is no longer maintained. For the latest documentation, see http://www.qnx.com/developers/docs.

Supporting Vector Floating Point Functionality for ARM Processors

This technote explains how to use the QNX Neutrino vector floating point (VFP) functionality support for ARM processors.

Overview

By default, floating-point operations on ARM targets are implemented in software, because the majority of ARM processors don't have built-in floating-point hardware.

The VFP functionality support is enabled when the startup program detects the presence of VFP hardware and sets the system page CPU_FLAG_FPU flag.

The QNX Neutrino procnto and procnto-v6 provide the basic support for managing the VFP context on a per-thread basis. This support uses a lazy mechanism to minimize the cost of saving and restoring the VFP register context, as follows:

The VFP is disabled initially.
Any VFP access causes an undefined instruction exception, allowing the kernel to initialize the VFP context for a thread.
The first access to the VFP by a thread causes an exception. The kernel exception handler:
- allocates storage for the VFP context
- initializes the VFP state
- records this thread as owning the VFP
- restarts the VFP instruction
VFP is disabled on a context switch to allow the kernel to trap subsequent VFP accesses for managing the per-thread VFP context:
- If the access is performed by the thread owning the VFP, the VFP already contains the correct VFP state. In this case, the VFP is simply re-enabled and the VFP instruction is restarted.
- If the access is performed by a different thread, a VFP context switch is required:
  - The VFP state is saved in the owning thread's VFP storage.
  - If this is the first access performed by the thread, storage is allocated for the VFP context, and its VFP state is initialized.
  - The VFP state is restored from the new thread's VFP storage.
  - The new thread is recorded as owning the VFP.
  - The VFP instruction is restarted.
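
The lazy scheme above can be modelled in portable C. This is only a sketch for illustration: the sim_* names, the register count, and the save/restore counters are hypothetical and are not taken from procnto. A real kernel manipulates the VFP hardware registers rather than a struct.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model; none of these names come from procnto. */
#define SIM_VFP_NREGS 32

typedef struct {
    unsigned regs[SIM_VFP_NREGS];  /* stand-in for the VFP register file */
} sim_vfp_ctx;

typedef struct {
    sim_vfp_ctx *vfp;              /* NULL until the thread first uses the VFP */
} sim_thread;

typedef struct {
    bool         enabled;          /* models the VFP enable bit */
    sim_thread  *owner;            /* thread whose state is live in the VFP */
    sim_vfp_ctx  live;             /* models the hardware registers */
    unsigned     saves, restores;  /* counters that make the savings visible */
} sim_cpu;

/* Context switch: only disable the VFP; nothing is saved at this point. */
void sim_context_switch(sim_cpu *cpu)
{
    cpu->enabled = false;
}

/* Models the exception handler entered on a VFP access while the VFP is
 * disabled. Returns 0 on success, -1 if context storage can't be allocated. */
int sim_vfp_trap(sim_cpu *cpu, sim_thread *t)
{
    if (cpu->owner != t) {
        if (cpu->owner != NULL) {
            *cpu->owner->vfp = cpu->live;        /* save the owner's state */
            cpu->saves++;
        }
        if (t->vfp == NULL) {
            t->vfp = calloc(1, sizeof *t->vfp);  /* allocate and initialize */
            if (t->vfp == NULL) {
                return -1;
            }
        }
        cpu->live = *t->vfp;                     /* restore the new thread's state */
        cpu->restores++;
        cpu->owner = t;                          /* record the new owner */
    }
    cpu->enabled = true;  /* re-enable; the faulting instruction would be restarted */
    return 0;
}

/* A thread executing a VFP instruction: traps only if the VFP is disabled. */
int sim_vfp_access(sim_cpu *cpu, sim_thread *t)
{
    return cpu->enabled ? 0 : sim_vfp_trap(cpu, t);
}
```

In this model, a thread that is switched out and back in without any other thread touching the VFP incurs no save or restore at all, which is the point of the lazy mechanism.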

Restrictions on VFP usage

The VFP support implements only the RunFast mode of operation:

trapped floating point exceptions are disabled
the Round-to-nearest mode is enabled
the Flush-to-zero mode is enabled
default NaNs are enabled

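Round-to-nearest is the default rounding mode defined in the IEEE 754 standard. Flush-to-zero is not part of that standard: it replaces denormalized values with zero, which is one reason why RunFast mode isn't fully IEEE 754 compliant.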

This mode of operation doesn't require any software support code. However, it doesn't provide full IEEE 754 compliance. For full details about the RunFast mode of operation, see the ARM Architecture Reference Manual, published by ARM.

No software emulation or support code is provided for the VFP instruction set. This means that code using VFP instructions can be run only on a processor that implements VFP hardware. Executing VFP instructions without the presence of VFP hardware will result in a SIGFPE signal.
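
Because VFP code can't run without VFP hardware, an application that must run on both kinds of target may want to check the system page before entering a VFP code path. The following is a minimal sketch: have_vfp() is a hypothetical helper name, and it assumes that the CPU_FLAG_FPU flag described in this technote is visible to applications through SYSPAGE_ENTRY(cpuinfo) in <sys/syspage.h>. On other systems it simply reports 0.

```c
#if defined(__QNXNTO__)
#include <sys/syspage.h>
#endif

/* Hypothetical helper: returns 1 if the startup program reported FPU (VFP)
 * hardware for the first CPU, 0 otherwise. */
int have_vfp(void)
{
#if defined(__QNXNTO__) && defined(CPU_FLAG_FPU)
    /* Assumption: the flag is tested on the first cpuinfo entry. */
    return (SYSPAGE_ENTRY(cpuinfo)->flags & CPU_FLAG_FPU) ? 1 : 0;
#else
    return 0;  /* not QNX Neutrino: there is no system page to consult */
#endif
}
```

The check itself should live in code compiled without VFP instructions, so that it can run safely on processors that lack the hardware.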

The standard QNX Neutrino libraries are compiled to use a soft-float implementation for floating-point operations to ensure the code can run on all supported ARM processors.

The soft-float implementation passes floating-point arguments and results in ARM registers or on the stack. Code that uses VFP instructions for floating point must use the same mechanism for arguments and results to ensure that it can interoperate correctly with code compiled for soft-float.
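
To make the argument-passing point concrete, the sketch below shows how a 64-bit double breaks down into the two 32-bit words that a soft-float call would carry in a pair of ARM registers or stack slots. The split_double() helper and the double_words type are hypothetical names for illustration, and the code assumes an 8-byte IEEE 754 double. Which word lands in which register depends on the target's endianness and floating-point format, which this technote doesn't specify.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical type: the two 32-bit halves of a double's bit pattern. */
typedef struct {
    uint32_t lo;  /* least significant 32 bits */
    uint32_t hi;  /* most significant 32 bits (sign, exponent, top of mantissa) */
} double_words;

/* Split a double into the two integer-register-sized words that a
 * soft-float calling convention moves around. Assumes sizeof(double) == 8. */
double_words split_double(double d)
{
    uint64_t bits;
    double_words w;

    memcpy(&bits, &d, sizeof bits);  /* reinterpret without aliasing problems */
    w.lo = (uint32_t)(bits & 0xFFFFFFFFu);
    w.hi = (uint32_t)(bits >> 32);
    return w;
}
```

For example, split_double(1.0) yields hi = 0x3FF00000 and lo = 0. Code built for VFP with the soft-float ABI moves such word pairs between ARM registers and VFP registers at call boundaries.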

A VFP-enabled math library, called libm-vfp.so, is provided and can be used on targets that implement VFP hardware in two ways:

if your development will target only systems that implement VFP hardware, you can change the armle/lib/libm.so link to point at libm-vfp.so.2 instead of libm.so.2. This means that linking with -lm will use libm-vfp.so instead, and mkifs will build libm-vfp.so into an image instead of using libm.so.2
if your development targets a variety of ARM processors, some of which do not implement VFP hardware, you can use the following when targeting VFP processors:
- use -lm-vfp instead of -lm when linking your application, as well as specifying the compilation flags described below
- specify libm.so=libm-vfp.so in your target build file to ensure that mkifs builds the image with libm-vfp.so instead of using libm.so

BSP configuration

The startup program is responsible for detecting the presence of VFP hardware:

Processor	Detection is performed by
ARMv6 processors	`libstartup.a`, armv_setup_v6()
Other processors	Board-specific startup routines

If the VFP is present, the startup program should ensure that:

coprocessor access to the VFP is enabled (if the processor supports selective access to individual coprocessors)
the VFP unit is disabled (the fpexc register's EN bit is set to 0)
the CPU_FLAG_FPU flag is set in the cpuinfo_entry->flags field for the CPU
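
The three conditions above can be expressed as bit operations on register values. This is a sketch only: it works on plain integers rather than real coprocessor registers, the example_* names are hypothetical, and EXAMPLE_CPU_FLAG_FPU is a placeholder (the real CPU_FLAG_FPU value comes from the QNX headers). The Coprocessor Access Control Register layout (full access for cp10 and cp11 in bits 23:20) and the position of the fpexc EN bit (bit 30) are taken from the ARM architecture documentation, not from this technote.

```c
#include <stdint.h>

#define CPACR_CP10_CP11_FULL  (0xFu << 20)  /* full access to cp10 and cp11 */
#define FPEXC_EN              (1u << 30)    /* VFP enable bit in fpexc */
#define EXAMPLE_CPU_FLAG_FPU  (1u << 31)    /* placeholder, not the real value */

/* Hypothetical snapshot of the state that a startup routine would adjust. */
typedef struct {
    uint32_t cpacr;      /* coprocessor access control value */
    uint32_t fpexc;      /* VFP exception/control register value */
    uint32_t cpu_flags;  /* stands in for cpuinfo_entry->flags */
} example_vfp_setup;

/* Return the state that the startup program should leave behind. */
example_vfp_setup example_vfp_init(example_vfp_setup s, int vfp_present)
{
    if (vfp_present) {
        s.cpacr     |= CPACR_CP10_CP11_FULL;  /* allow access to the VFP */
        s.fpexc     &= ~FPEXC_EN;             /* leave the VFP disabled */
        s.cpu_flags |= EXAMPLE_CPU_FLAG_FPU;  /* tell the kernel VFP exists */
    }
    return s;
}
```

Leaving the VFP disabled while granting coprocessor access is what allows the kernel to trap the first VFP instruction executed by each thread.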

Using VFP instructions

You must use the following gcc and binutils versions to allow the compiler and assembler to use VFP instructions.

gcc 4.2
binutils 2.18

The following compiler flags are required:

-mfpu=vfp -mfloat-abi=softvfp

This causes gcc to use VFP instructions for floating point, and to use the soft-float ABI (application binary interface) to pass floating-point arguments and results in ARM registers. The ABI defines the calling convention, that is, how registers are used to pass arguments and return results in procedure calls.

Application code must not change FPSCR bits that would change the mode to anything other than the RunFast operation:

The default NaN mode bit (bit 25) must be set to 1.
The Flush-to-zero mode bit (bit 24) must be set to 1.
The rounding mode field (bits 23:22) must be set to 00 to select the Round-to-nearest mode.

If these bits are altered, it's possible for the VFP to generate exceptions that require software support. Since no software support is currently provided, this will result in a SIGFPE signal.
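
Code that builds an FPSCR value can guard against leaving RunFast mode with a simple mask check. The sketch below covers only the three fields listed above. The function and macro names are hypothetical, and the code works on a plain 32-bit value rather than reading or writing the register itself.

```c
#include <stdbool.h>
#include <stdint.h>

#define FPSCR_DN          (1u << 25)  /* default NaN mode */
#define FPSCR_FZ          (1u << 24)  /* Flush-to-zero mode */
#define FPSCR_RMODE_MASK  (3u << 22)  /* rounding mode; 00 = Round-to-nearest */

/* True if the value keeps the three RunFast settings described above. */
bool fpscr_is_runfast(uint32_t fpscr)
{
    return (fpscr & FPSCR_DN) != 0u
        && (fpscr & FPSCR_FZ) != 0u
        && (fpscr & FPSCR_RMODE_MASK) == 0u;
}

/* Force the three RunFast settings while preserving all other bits. */
uint32_t fpscr_force_runfast(uint32_t fpscr)
{
    return (fpscr | FPSCR_DN | FPSCR_FZ) & ~FPSCR_RMODE_MASK;
}
```

Passing any candidate value through fpscr_force_runfast() before writing it to the FPSCR is one way to make sure these bits are never altered by accident.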