Writing Network Drivers for io-sock


An io-sock network driver can be viewed as the glue between the underlying network hardware and the software infrastructure of io-sock. You can picture the driver as having a bottom and top half: the bottom is code specifically for the particular hardware it supports, and the top is code specifically for io-sock.

This appendix covers the top half of the driver, which interfaces with the io-sock software infrastructure.

Driver APIs

The io-sock networking stack is derived from FreeBSD's networking stack and io-sock drivers are written using the FreeBSD driver APIs. In the following discussion and examples, the API function calls and structures (up to and including Modifying the makefile) are the same as FreeBSD. This common API allows you to compile FreeBSD driver source code for io-sock with few to no code changes. However, this design also means that the io-sock build environment is different from a typical QNX build environment.

When you are developing a network driver for io-sock, you must include io-sock header files (found in usr/include/devs), which include the required QNX header files. Don't include the QNX header files directly, because doing so can result in compilation errors.

For example, some common functions are quite different when writing io-sock drivers. The following APIs are defined in io-sock header files:

void* malloc(size_t size, struct malloc_type *type, int flags);
void free(void *addr, struct malloc_type *type);    

QNX header files define APIs with the same names and a different syntax:

void* malloc( size_t size );
void free( void* ptr );

You can add the io-sock version of the header files by including devs.mk, which specifies the proper search paths, in your common.mk (see Modifying the makefile).

For the best guide to the use of the FreeBSD driver APIs, go to the FreeBSD 13.3 manual pages at https://www.freebsd.org/cgi/man.cgi and search in the section 9 - Kernel Interface.

Driver versioning

To work with io-sock, a driver must specify its version using IOSOCK_VERSION_CUR, which is located in qnx_modload.h. The io-sock networking manager checks the version value when the driver loads, and rejects incompatible driver libraries. A complete list of available versions is included in qnx_modload.h.

The driver must include a SYSUNINIT macro to provide a function that is invoked when a driver is unmounted. This function can be empty. The optional SYSINIT macro can be used to specify an initialization function, if required.

For example:

…
#include <qnx/qnx_modload.h>
…
int drvr_ver = IOSOCK_VERSION_CUR;
SYSCTL_INT(_qnx_driver, OID_AUTO, sam_drvr, CTLFLAG_RD, &drvr_ver, 0,
            "Version");
…
struct _iosock_module_version iosock_module_version = IOSOCK_MODULE_VER_SYM_INIT;
static void
sam_uninit(void *arg)
{
}

SYSUNINIT(sam_uninit, SI_SUB_DUMMY, SI_ORDER_ANY, sam_uninit, NULL);

On a running target, you can use sysctl to display driver versions. For example:

# sysctl qnx.driver
qnx.driver.em: 65537
qnx.driver.vmx: 65537
qnx.driver.vtnet_pci: 65537
# sysctl qnx.module
qnx.module.usb: 65537
qnx.module.pci: 65537
qnx.module.phy: 65537

Registering the driver module

The DRIVER_MODULE macro declares and registers the driver with the system. The driver is added to the list of device drivers for a particular bus type (e.g. simplebus, pci, uhub (USB), nexus). It is possible to write a driver that can be registered against multiple bus types.
DRIVER_MODULE(sample, simplebus, sample_driver, sam_devclass, 0, 0);
DRIVER_MODULE(miibus, sam, miibus_driver, miibus_devclass, 0, 0);
DRIVER_MODULE(etherswitch, sam, etherswitch_driver, etherswitch_devclass, 0, 0);
As shown in the example, the DRIVER_MODULE macro can be used multiple times in a driver to describe a device hierarchy. A common example of using the macro this way is using the MII bus framework to attach a PHY driver. Another, less common example is the EtherSwitch framework that can be used for Ethernet switch drivers. Both frameworks include device drivers that create child devices under the sample device. You can use the devinfo utility to see these devices. For example:
# devinfo
nexus0
  cryptosoft0
  ofwbus0
    simplebus0
      sam0
        miibus0
          ukphy0
        etherswitch0              

The MODULE_DEPEND macro should also be used to show the driver dependencies (excluding library API version dependencies). For example:

MODULE_DEPEND(sample, ether, 1, 1, 1);
MODULE_DEPEND(sample, miibus, 1, 1, 1);
MODULE_DEPEND(sample, etherswitch, 1, 1, 1);                
Although io-sock does not include explicit dependency checks, drivers won't load if there are missing dependencies. A driver’s declared dependencies can be extracted from the binary using objdump. For example:
$ objdump aarch64-dll.le.diag/devs-sample-diag.so -t | grep depend_on
0000000000008050 l     O .data	000000000000000c _sample_depend_on_etherswitch
0000000000008060 l     O .data	000000000000000c _sample_depend_on_miibus
0000000000008070 l     O .data	000000000000000c _sample_depend_on_ether
0000000000008080 l     O .data	000000000000000c _etherswitch_sam_depend_on_kernel
0000000000008090 l     O .data	000000000000000c _miibus_sam_depend_on_kernel
00000000000080a0 l     O .data	000000000000000c _sample_simplebus_depend_on_kernel 
The third parameter of the DRIVER_MODULE macro is a driver_t structure (e.g., sample_driver), which describes the driver. It provides the name of the device, a pointer to a list of methods, and the size of the private data (softc) required by the driver for each device instance. For example:
driver_t sample_driver = {
        "sam",
        sam_methods,
        sizeof(struct sam_softc),
};

The driver methods are contained in a device_method_t structure, which provides probe, attach, detach, and shutdown callback methods. The following example includes the callback methods for the MII bus and EtherSwitch framework:

static device_method_t sam_methods[] = {
        /* device callbacks */
        DEVMETHOD(device_probe, sam_probe),
        DEVMETHOD(device_attach, sam_attach),
        DEVMETHOD(device_detach, sam_detach),
        DEVMETHOD(device_shutdown, sam_shutdown),

        /* miibus callbacks */
        DEVMETHOD(miibus_readreg, sam_miibus_read_reg),
        DEVMETHOD(miibus_writereg, sam_miibus_write_reg),
        DEVMETHOD(miibus_statchg, sam_miibus_statchg),

#ifdef INCLUDE_ETHERSWITCH
        /* etherswitch callbacks */
        DEVMETHOD(etherswitch_getinfo, sam_es_getinfo),
        DEVMETHOD(etherswitch_readreg, sam_es_readreg),
        DEVMETHOD(etherswitch_readphyreg, sam_miibus_read_reg),
#endif

        DEVMETHOD_END
};

The supported callbacks can be found in the generated interface header files (e.g., device_if.h, miibus_if.h, etherswitch_if.h). A driver does not have to provide an implementation for all the callbacks. An implementation for a callback method may be provided by the parent bus or another bus further up the device tree. The framework also includes a default implementation. An example of a method that is provided by the parent bus is the device_identify method. A driver may include its own device_identify method if it is registered to a bus that cannot identify devices, such as nexus0. For example, the driver can use this method to identify a virtual device or a memory-mapped device that is not in the device tree blob (DTB).

If a driver supports multiple bus types, another instance of device_method_t with a different set of methods can be specified.

For more information, see the following FreeBSD documentation:

The PHY callbacks are discussed in Attaching a PHY driver with MII bus.

Checking if a device is supported

You provide the probe callback as part of driver module initialization (see Registering the driver module).

When a device is enabled on a bus, the probe function of every driver registered on that bus is called to determine if the device is supported by that driver. In the case of the sample driver, sam_probe() checks the hardware compatibility string that was assigned to the device in the DTB. For a PCI or USB device, similar functions are available to match the vendor or device identifiers. For example:

static int
sam_probe(device_t dev)
{
        if (!ofw_bus_status_okay(dev)) {
                return (ENXIO);
        }

        /* 
         * Check device's compatibility string against list of
         * supported devices
         */
        if (ofw_bus_search_compatible(dev, compat_data)->ocd_data == 0) {
                 return (ENXIO);
        }
        device_set_desc(dev, "Sample Ethernet Controller");

        return (BUS_PROBE_DEFAULT);
}

The value that the probe function returns determines which driver is the best one to use for the device. For example, BUS_PROBE_DEFAULT indicates that the device is a normal device matching a Plug and Play ID and is the normal return value for drivers to use. For a list and description of all the conventional return values, see https://www.freebsd.org/cgi/man.cgi?query=DEVICE_PROBE&sektion=9&manpath=FreeBSD+13.3-RELEASE+and+Ports.

For example, devs-re.so, devs-em.so, and devs-ix.so drivers are loaded. If you add a PCI device, regardless of which driver is used, all the probe functions of the drivers registered against the PCI bus are called (re_probe(), em_probe(), ix_probe()). The return values indicate that devs-re.so is the best driver to use and the re_attach() function is called. Although typically there is only one match, io-sock supports having multiple drivers that support the same hardware. For example, a system may have a generic driver that supports all PHY devices in addition to more specific drivers.
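The selection rule can be sketched as a small userland model. It is not io-sock code: the probe_result structure and pick_best_driver() are hypothetical names, the "generic" entry in the usage note is invented for illustration, and the constants repeat the conventional values listed in DEVICE_PROBE(9). The model assumes that a positive return value is an error and that, among the non-positive values, the highest one wins.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Conventional probe return values, as listed in DEVICE_PROBE(9). */
#define BUS_PROBE_SPECIFIC      0       /* no other driver can use the device */
#define BUS_PROBE_VENDOR        (-10)   /* vendor-supplied driver */
#define BUS_PROBE_DEFAULT       (-20)   /* normal driver matching a PnP ID */
#define BUS_PROBE_LOW_PRIORITY  (-40)   /* legacy driver */
#define BUS_PROBE_GENERIC       (-100)  /* generic driver for a device class */

/* Hypothetical stand-in: one registered driver and what its probe returned. */
struct probe_result {
        const char *name;
        int         value;  /* > 0: errno, not supported; <= 0: priority */
};

/*
 * Pick the winning driver: skip positive (error) results and keep the
 * highest non-positive value. Returns NULL if no driver claims the device.
 */
static const struct probe_result *
pick_best_driver(const struct probe_result *results, size_t count)
{
        const struct probe_result *best = NULL;

        assert(results != NULL || count == 0);
        for (size_t i = 0; i < count; i++) {
                if (results[i].value > 0)
                        continue;       /* this driver rejected the device */
                if (best == NULL || results[i].value > best->value)
                        best = &results[i];
                if (best->value == BUS_PROBE_SPECIFIC)
                        break;          /* nothing can outbid this result */
        }
        return (best);
}
```

With results of ENXIO for em and ix, BUS_PROBE_DEFAULT for re, and BUS_PROBE_GENERIC for a generic driver, the model selects re, which corresponds to re_attach() being called in the scenario above.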

Because the device’s softc is re-initialized before every probe, its contents do not persist when multiple drivers are present. It is not safe to assume that any information found in the device’s softc during the probe will be available when the attach method is called.

Attaching a device

You provide the attach callback as part of driver module initialization (see Registering the driver module).

The attach function (e.g., sam_attach()) is executed for each device instance. This callback implementation should contain everything needed to initialize the hardware, allocate resources, attach to the interrupt, initialize PHY, and so on (see Allocating resources, Registering interrupts, Specifying the memory for Direct Memory Access, Initialize a network interface, Attaching a PHY driver with MII bus).

The private data structure for each instance of a device can be fetched by using device_get_softc():

sam_attach(device_t dev) {
        struct sam_softc *sc;
        sc = device_get_softc(dev);
...

The size of the softc is specified when the driver is registered. The software context is automatically allocated and zeroed when the device is attached.

A node variable set by sam_attach() can provide a handle to a specific place within the DT file where hardware information related to that device is kept. For example:

...
    node = ofw_bus_get_node(dev);
...

Example of an Ethernet device in a DT file:

ethernet@4033c000 {
      compatible = "Hardware Descriptor string from DTB";
      reg = <0x0 0x4033c000 0x0 0x2000 0x0 0x4007c004 0x0 0x4>;
      interrupt-parent = <0x1>;
      interrupts = <0x0 0x39 0x4>;
      interrupt-names = "macirq";
      tx-fifo-depth = <0x5000>;
      rx-fifo-depth = <0x5000>;
      clocks = <0x4 0x2e 0x4 0x2e 0x4 0x38>;
      clock-names = "stmmaceth", "pclk", "tx";
      pinctrl-names = "default";
      pinctrl-0 = <0x1c>;
      phy-mode = "rgmii";
      status = "okay";
};

The following code from the example sam_attach() reads a string value from a DTB:

if (OF_getprop_alloc(node, "phy-mode", (void **)&phy_mode) > 0) {
    if (strcmp(phy_mode, "rgmii") == 0) {
        sc->phy_mode = PHY_MODE_RGMII;
    }
    if (strcmp(phy_mode, "rmii") == 0) {
        sc->phy_mode = PHY_MODE_RMII;
    }
    OF_prop_free(phy_mode);
}
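As the number of supported modes grows, the string comparison can be kept separate from the DTB access. The following is a minimal userland sketch, not code from the sample driver: parse_phy_mode() and PHY_MODE_UNKNOWN are hypothetical, and the enum values are stand-ins for whatever the driver actually defines for PHY_MODE_RGMII and PHY_MODE_RMII.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in values; a real driver defines its own PHY mode constants. */
enum phy_mode {
        PHY_MODE_UNKNOWN = 0,   /* hypothetical fallback, not in the sample */
        PHY_MODE_RGMII,
        PHY_MODE_RMII
};

/* Map a "phy-mode" property string to a driver constant. */
static enum phy_mode
parse_phy_mode(const char *str)
{
        static const struct {
                const char   *name;
                enum phy_mode mode;
        } table[] = {
                { "rgmii", PHY_MODE_RGMII },
                { "rmii",  PHY_MODE_RMII },
        };

        assert(str != NULL);
        for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
                if (strcmp(str, table[i].name) == 0)
                        return (table[i].mode);
        }
        /* Unrecognized string: let the caller decide how to handle it. */
        return (PHY_MODE_UNKNOWN);
}
```

Returning an explicit fallback value lets the attach function reject or default an unsupported mode instead of silently leaving sc->phy_mode unset.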

Using iflib for PCI drivers

The sample driver described throughout this section deals with MMIO devices that are described in a DTB. However, for PCI devices, QNX recommends using the iflib framework instead.

The iflib framework is documented here: https://man.freebsd.org/cgi/man.cgi?query=iflib&sektion=9&manpath=FreeBSD+13.3-RELEASE+and+Ports

The iflib callbacks are specified when the driver is registered. An additional callback, device_register, returns a pointer to the shared context structure (struct if_shared_ctx) that describes the driver to iflib. For example:

static device_method_t sample_methods[] = {
    DEVMETHOD(device_register,      sample_iflib_register),
    DEVMETHOD(device_probe,         iflib_device_probe),
    DEVMETHOD(device_attach,        iflib_device_attach),
    DEVMETHOD(device_detach,        iflib_device_detach),
    DEVMETHOD(device_shutdown,      iflib_device_shutdown),

    DEVMETHOD_END
};
    

The source code for several drivers that use the iflib framework is available from FreeBSD (e.g., igc, e1000, ixgbe, ixl).

Allocating resources

Use bus_alloc_resources() to allocate resources from a parent bus. For example:

...
    bus_alloc_resources(dev, sam_spec, sc->res);
...

The res argument is an array of struct resource pointers that receives both the memory regions mapped for the device and its interrupt resources. The size of the array depends on the number of memory regions and interrupts the hardware uses. The sam_spec argument is an array in which each entry describes one resource to allocate. For example:

static struct resource_spec sam_spec[] = {
        { SYS_RES_MEMORY,       0,      RF_ACTIVE },
        { SYS_RES_IRQ,          0,      RF_ACTIVE },
        RESOURCE_SPEC_END
};

After bus_alloc_resources() is executed, the third parameter (sc->res) holds one pointer per entry in sam_spec, in the same order: a pointer to a memory resource (region) or a pointer to an interrupt resource. The interrupt resource can be passed to bus_setup_intr().
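The relationship between the specification array and the resource array can be illustrated with a userland model. This sketch is an assumption-laden stand-in, not the FreeBSD headers: struct resource_spec, RESOURCE_SPEC_END, and the SYS_RES_* and RF_ACTIVE values are redefined locally for illustration, and spec_count() and spec_index() are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the FreeBSD definitions; values are illustrative. */
#define SYS_RES_IRQ     1
#define SYS_RES_MEMORY  3
#define RF_ACTIVE       0x0002

struct resource_spec {
        int type;       /* resource type, or -1 for the end marker */
        int rid;        /* resource ID within that type */
        int flags;      /* allocation flags */
};
#define RESOURCE_SPEC_END       { -1, 0, 0 }

static struct resource_spec sam_spec[] = {
        { SYS_RES_MEMORY,       0,      RF_ACTIVE },
        { SYS_RES_IRQ,          0,      RF_ACTIVE },
        RESOURCE_SPEC_END
};

/* Number of entries before the end marker: the required length of res[]. */
static size_t
spec_count(const struct resource_spec *spec)
{
        size_t n = 0;

        assert(spec != NULL);
        while (spec[n].type != -1)
                n++;
        return (n);
}

/* Index in res[] of the first entry of the given type, or -1 if absent. */
static int
spec_index(const struct resource_spec *spec, int type)
{
        assert(spec != NULL);
        for (int i = 0; spec[i].type != -1; i++) {
                if (spec[i].type == type)
                        return (i);
        }
        return (-1);
}
```

For this sam_spec, the resource array needs two elements, the memory resource is at index 0, and the interrupt resource is at index 1.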

Registering interrupts

Use bus_setup_intr() to perform interrupt registration within sam_attach(). The following example specifies the interrupt handler for an interrupt that has been allocated by bus_alloc_resources():
bus_setup_intr(dev, sc->res[1], INTR_TYPE_NET | INTR_MPSAFE, NULL,
                    sam_intr, sc, &sc->intr_cookie);

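The &sc->intr_cookie argument is a pointer to a void pointer in which bus_setup_intr() stores an opaque cookie if it successfully establishes the interrupt. The driver passes this cookie to bus_teardown_intr() when it releases the interrupt.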

The interrupt filter and handler run in a QNX Interrupt Service Thread. Both the filter and handler functions run in the same context. Unlike FreeBSD, io-sock places no additional restrictions on what a filter function can do versus what a handler function can do. If a filter is specified, it must return FILTER_SCHEDULE_THREAD if it wants the optional handler to run; otherwise, it can return FILTER_HANDLED. Most drivers only specify a handler.
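The filter and handler contract described above can be modelled in userland as follows. This is a sketch of the rule only, not of how io-sock dispatches interrupts internally: dispatch_interrupt(), struct sam_model, and the two sample callbacks are hypothetical, and the FILTER_* constants repeat the values from FreeBSD's sys/bus.h.

```c
#include <assert.h>
#include <stddef.h>

/* Filter return values, as defined in FreeBSD's <sys/bus.h>. */
#define FILTER_HANDLED          0x02
#define FILTER_SCHEDULE_THREAD  0x04

typedef int  (*filter_fn)(void *arg);   /* stand-in for driver_filter_t */
typedef void (*handler_fn)(void *arg);  /* stand-in for driver_intr_t */

/* Hypothetical device state used by the sample callbacks below. */
struct sam_model {
        int pending;    /* nonzero if the device has work for the handler */
        int handled;    /* number of times the handler has run */
};

/* Sample filter: ask for the handler only when there is work to do. */
static int
sam_filter(void *arg)
{
        struct sam_model *sc = arg;

        return (sc->pending ? FILTER_SCHEDULE_THREAD : FILTER_HANDLED);
}

/* Sample handler: process the work and clear the pending state. */
static void
sam_intr(void *arg)
{
        struct sam_model *sc = arg;

        sc->handled++;
        sc->pending = 0;
}

/* Model of the rule: returns 1 if the handler ran, 0 otherwise. */
static int
dispatch_interrupt(filter_fn filter, handler_fn handler, void *arg)
{
        assert(filter != NULL || handler != NULL);
        if (filter != NULL &&
            (filter(arg) & FILTER_SCHEDULE_THREAD) == 0)
                return (0);     /* the filter dealt with the interrupt */
        if (handler == NULL)
                return (0);
        handler(arg);
        return (1);
}
```

Passing NULL for the filter, as in the bus_setup_intr() example above, means that the handler runs for every interrupt.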

After the interrupt has fired, the driver should acknowledge the interrupt with the hardware. The masking and unmasking of the interrupt is handled by io-sock.

Specifying the memory for Direct Memory Access

Direct Memory Access (DMA) improves performance by transferring data without involving the CPU. A DMA transaction can transfer data between a device and memory, a device and another device, or memory and memory. You specify the memory for DMA transactions using the following three tasks:
  • Create a memory tag
  • Allocate memory
  • Load the memory map of physical addresses according to hardware-specific memory constraints
For FreeBSD DMA documentation, go to https://man.freebsd.org/cgi/man.cgi?query=bus_dma&sektion=9&manpath=FreeBSD+13.3-RELEASE+and+Ports.

Creating a memory tag

A DMA memory tag is a machine-dependent opaque type that describes the characteristics of DMA transactions. These tags are organized into a hierarchy. Because each child tag inherits the restrictions of its parent, all devices along the path of DMA transactions contribute to the constraints that apply. For example:
error = bus_dma_tag_create(
                bus_get_dma_tag(sc->dev),       /* Parent tag. */
                1, 0,                           /* alignment, boundary */
                BUS_SPACE_MAXADDR_32BIT,        /* lowaddr */
                BUS_SPACE_MAXADDR,              /* highaddr */
                NULL, NULL,                     /* filter, filterarg */
                MCLBYTES, TX_DMA_MFUF_CHUNK,    /* maxsize, nsegments */
                MCLBYTES,                       /* maxsegsize */
                0,                              /* flags */
                NULL, NULL,                     /* lockfunc, lockarg */
                &sc->txbuf_tag);
                

Allocating memory

Memory can be allocated as a single segment using bus_dmamem_alloc(). For example:
bus_dmamem_alloc(sc->txdesc_tag, (void**)&sc->txdesc_ring,
    BUS_DMA_COHERENT | BUS_DMA_WAITOK | BUS_DMA_ZERO, &sc->txdesc_map);
Then, the initial load operation is required to obtain the bus address of the allocated memory. For example:
bus_dmamap_load(sc->txdesc_tag, sc->txdesc_map, sc->txdesc_ring,
    TX_DESC_SIZE, sam_get1paddr, &sc->txdesc_ring_paddr, 0);
where sam_get1paddr is a pointer to a callback that returns the physical address of that segment.
Alternatively, you can use bus_dmamap_create(), which allocates and initializes a DMA map. For example:
for (idx = 0; idx < TX_MAP_BUFFER_LEN; idx++)
    bus_dmamap_create(sc->txbuf_tag, BUS_DMA_COHERENT, &sc->txbuf_map[idx].map);

Load the memory map of physical addresses

After either the hardware or the CPU writes to the memory they share, synchronization needs to be performed by calling bus_dmamap_sync(). In the following example, bus_dmamap_sync() is called after the hardware has finished writing data to the receive buffer and before the buffer is returned to io-sock, so that io-sock can safely access that memory:
bus_dmamap_sync(sc->rxdesc_tag, sc->rxdesc_map, BUS_DMASYNC_POSTWRITE | BUS_DMASYNC_POSTREAD);

For a description of all the available memory synchronization operation specifiers, see https://www.freebsd.org/cgi/man.cgi?query=bus_dmamap_sync&sektion=9&manpath=FreeBSD+13.3-RELEASE+and+Ports.

DMA shutdown handlers

Any driver that uses DMA needs to implement a shutdown handler method that stops all DMA. It is called when io-sock terminates (either terminating normally or crashing). Without a shutdown handler, memory that was reserved for the DMA may continue to be modified by the hardware even though the system now considers this freed memory. This situation can corrupt any memory that is then provided by the system at the same memory location.

This function should shut down DMA in the quickest and simplest way possible (e.g., reset the device) and ignore other resources (e.g., memory) because those resources are cleaned up automatically when the process terminates. A more complete shutdown is done with detach().

Initialize a network interface

A driver needs to allocate and initialize an ifnet structure for each network interface. To avoid compatibility issues, use functions to access the ifnet structure members. For example:
struct ifnet *ifp;
...
sc->ifp = ifp = if_alloc(IFT_ETHER);
if_setsoftc(ifp, sc);
if_initname(ifp, device_get_name(dev), device_get_unit(dev));
if_setflags(ifp, IFF_BROADCAST | IFF_SIMPLEX | IFF_MULTICAST);
if_setcapabilities(ifp, IFCAP_VLAN_MTU);
if_setcapenable(ifp, if_getcapabilities(ifp));
if_settransmitfn(ifp, sam_transmit);
if_setqflushfn(ifp, sam_qflush);
if_setioctlfn(ifp, sam_ioctl);
if_setinitfn(ifp, sam_init);
if_setsendqlen(ifp, TX_DESC_COUNT - 1);
if_setsendqready(ifp);
if_setifheaderlen(ifp, sizeof(struct ether_vlan_header));
ether_ifattach(ifp, macaddr);
where:
  • sam_transmit is a driver-defined asynchronous transmit callback. The callback receives a pointer to an mbuf structure that the driver should transmit (see Transmitting a packet).
  • sam_ioctl is a driver-defined I/O control handler. Possible I/O control codes are defined in sockio.h.
  • sam_qflush is a driver-defined synchronous callback that should flush packet queues.
  • sam_init is a driver-defined synchronous callback that should initialize and bring up the hardware (e.g., reset the chip and the watchdog timer, and enable the receiver unit). Because it is called on a per-interface basis, it receives a pointer to an if_softc structure as a parameter.

Attaching a PHY driver with MII bus

You can implement PHY-specific handling within a PHY-specific driver, and load it using mii_attach(). For example:
mii_attach(dev, &sc->miibus, ifp, sam_media_change,
    sam_media_status, BMSR_DEFCAPMASK, phynum,
    MII_OFFSET_ANY, MIIF_FORCEANEG);
The MAC driver has to provide implementations for PHY read, write, and status change callbacks as part of driver module initialization (see Registering the driver module). For example, in the device_method_t example provided above:
static device_method_t sam_methods[] = {
        DEVMETHOD(device_probe,         sam_probe),
        DEVMETHOD(device_attach,        sam_attach),
        DEVMETHOD(device_detach,        sam_detach),
        DEVMETHOD(device_shutdown,      sam_shutdown),

        /* MII Interface */
        DEVMETHOD(miibus_readreg,       sam_miibus_read_reg),
        DEVMETHOD(miibus_writereg,      sam_miibus_write_reg),
        DEVMETHOD(miibus_statchg,       sam_miibus_statchg),

        DEVMETHOD_END
};
  • miibus_statchg is a synchronous callback called by the MII bus driver when the physical layer establishes a link. It sets the MAC interface registers. It should respond to media changes according to mii->mii_media_active and mii->mii_media_status bitmaps and configure MAC registers accordingly.
  • miibus_writereg and miibus_readreg are synchronous callbacks that write and read PHY registers.

The io-sock networking stack includes the MII bus framework and the mods-phy.so module contains several built-in PHY drivers. The external PHY driver devs-samplephy.so can be added. To use the probe helper functions, you need to add the PHY descriptor defines to the PHY driver. For example:

#define MII_OUI_xxSAMPLE          0x00aaaa
#define MII_MODEL_xxSAMPLE_2A     0x002a
#define MII_STR_xxSAMPLE_2A      "Sample PHY"

static const struct mii_phydesc samplephys[] = {
        MII_PHY_DESC(xxSAMPLE, 2A),
        MII_PHY_END
};
static int
samplephy_probe(device_t dev)
{
        return (mii_phy_dev_probe(dev, samplephys, BUS_PROBE_DEFAULT));
}

Receiving a packet

Packet reception starts with an interrupt from the hardware. After processing, filled received packets are drained from the hardware, new empty packets are passed to the hardware, and the filled received packets are passed to io-sock. For example (from sam_rxfinish_one()):

if_input(ifp, m);

Transmitting a packet

The transmit callback is called by io-sock and is registered early in the attach phase. In addition, a pointer to a transmit queue flushing function should be provided to io-sock. For example (from sam_attach()):
if_settransmitfn(ifp, sam_transmit);
if_setqflushfn(ifp, sam_qflush);

The transmit callback should be implemented asynchronously. Generally speaking, the driver first needs to determine whether the hardware resources needed to transmit a packet (descriptors, buffers, etc.) are available. If the hardware has run out of transmit resources, the driver should queue the packet, return from the transmit function, and transmit the packet when those resources become available.

Note:
The if_setstartfn() function is deprecated and should not be used in new drivers.

Working with mbuf chains

An mbuf (short for memory buffer) is a basic unit of memory management that is used to store network packets and socket buffers. A network packet may span multiple mbufs arranged into an mbuf chain (linked list), which allows adding or trimming network headers with little overhead.
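To see why a chain makes header handling cheap, consider this heavily simplified userland model. It does not use the real struct mbuf: struct buf and the chain_* functions are hypothetical stand-ins that keep only the linked-list aspect.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for struct mbuf. */
struct buf {
        struct buf          *next;      /* next buffer in the chain */
        const unsigned char *data;      /* start of this buffer's bytes */
        size_t               len;       /* number of valid bytes */
};

/* Total packet length: the sum over every buffer in the chain. */
static size_t
chain_length(const struct buf *b)
{
        size_t total = 0;

        for (; b != NULL; b = b->next)
                total += b->len;
        return (total);
}

/* Number of buffers in the chain (one DMA segment each, at least). */
static size_t
chain_count(const struct buf *b)
{
        size_t n = 0;

        for (; b != NULL; b = b->next)
                n++;
        return (n);
}

/* Add a header by linking a buffer in front; no payload byte is copied. */
static struct buf *
chain_prepend(struct buf *hdr, struct buf *chain)
{
        assert(hdr != NULL);
        hdr->next = chain;
        return (hdr);
}

/* Trim the leading header by unlinking the first buffer. */
static struct buf *
chain_strip_head(struct buf *chain)
{
        struct buf *rest;

        assert(chain != NULL);
        rest = chain->next;
        chain->next = NULL;
        return (rest);
}
```

Prepending a 14-byte header to a 1000-byte payload held in two buffers yields a three-buffer chain of 1014 bytes without touching the payload. Real mbufs can also grow into spare space at the front of the first buffer, which this model does not show.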

To avoid compatibility issues with future versions, QNX recommends that you don't modify mbuf internals when you develop an io-sock driver. However, it is useful to understand the general structure of an mbuf, which is defined in include/devs/sys/mbuf.h.

The following example allocates an mbuf:
struct mbuf *m;

m = m_getcl(M_NOWAIT, MT_DATA, M_PKTHDR);
(For flag definitions, see https://www.freebsd.org/cgi/man.cgi?query=mbuf&sektion=9&manpath=FreeBSD+13.3-RELEASE+and+Ports or mbuf.h).

The bus_dmamap_load_mbuf_sg() function allows you to map mbuf chains for DMA transfers. In the following example, an mbuf chain is received in a transmit callback and mapped into a DMA memory transaction for the hardware:

bus_dmamap_load_mbuf_sg(sc->txbuf_tag,
                sc->txbuf_map[sc->txbuf_idx_head].map, m0, seg, &nsegs, 0);

The seg argument specifies a scatter/gather segment array that the caller provides and the function fills in. The nsegs argument is returned with the number of segments filled in.

If the bus_dmamap_load_mbuf_sg() call above fails, you can collapse (i.e., defragment) the mbuf chain into a smaller number of segments and try again. For example:

m = m_collapse(m0, M_NOWAIT, TX_DMA_MFUF_CHUNK);  /* new chain head, or NULL on failure */

If the hardware does not support scatter/gather addressing, you can “collapse” the mbuf chain to a contiguous buffer. This method is slower. For example:

m = m_defrag(m0, M_NOWAIT);  /* new chain head, or NULL on failure */

Modifying the makefile

Add include devs/devs.mk to the end of common.mk, as shown in the following example common.mk, which adds include paths for io-sock header files:

ifndef QCONFIG
QCONFIG=qconfig.mk
endif
include $(QCONFIG)

define PINFO
PINFO DESCRIPTION=Sample io-sock driver
endef

include devs/devs.mk

Loading an io-sock driver

There are two ways you can load the driver:

  • Load io-sock first and use mount to load the driver later. For example:
    io-sock -m phy -m fdt
    ...
    mount -T io-sock devs-sample.so
    
  • Load the driver when you start io-sock. For example (you can specify the driver without the devs-* prefix and .so extension):
     io-sock -m phy -m fdt -d sample
This example loads the modules that support PHY drivers and a Flattened Device Tree (FDT) bus (-m phy -m fdt). Other bus types require different modules (e.g., -m phy -m pci or -m phy -m usb). For more information, see Starting io-sock and driver management.