Writing Network Drivers for io-sock
An io-sock network driver can be viewed as the glue between the underlying network hardware and the software infrastructure of io-sock. You can picture the driver as having a bottom and top half: the bottom is code specific to the particular hardware it supports, and the top is code specific to io-sock.
This appendix covers the top half of the driver, which interfaces with the io-sock software infrastructure.
Driver APIs
The io-sock networking stack is derived from FreeBSD's networking
stack and io-sock drivers are written using the FreeBSD driver
APIs. In the following discussion and examples, the API function calls and
structures (up to and including Modifying the makefile) are the same as FreeBSD. This common API
allows you to compile FreeBSD driver source code for io-sock with
few to no code changes. However, this design also means that the
io-sock build environment is different from a typical QNX
build environment.
When you are developing a network driver for io-sock, you must include io-sock header files (found in usr/include/devs), which include the required QNX header files. Don't include the QNX header files directly, because doing so can result in compilation errors.
For example, some common functions are quite different when writing io-sock drivers. The following APIs are defined in io-sock header files:
void* malloc(size_t size, struct malloc_type *type, int flags);
void free(void *addr, struct malloc_type *type);
QNX header files define APIs with the same names but different signatures:
void* malloc( size_t size );
void free( void* ptr );
You can add the io-sock version of the header files by including
devs.mk, which specifies the proper search paths, in your
common.mk (see Modifying the makefile).
For the best guide to the use of the FreeBSD driver APIs, go to the FreeBSD 13.3 manual pages at https://www.freebsd.org/cgi/man.cgi and search in the section 9 - Kernel Interface.
Driver versioning
To work with io-sock, a driver must specify its version using IOSOCK_VERSION_CUR, which is located in qnx_modload.h. The io-sock networking manager checks the version value when the driver loads, and rejects incompatible driver libraries. A complete list of available versions is included in qnx_modload.h.
The driver must include a SYSUNINIT macro to provide a function that is invoked when a driver is unmounted. This function can be empty. The optional SYSINIT macro can be used to specify an initialization function, if required.
For example:
…
#include <qnx/qnx_modload.h>
…
int drvr_ver = IOSOCK_VERSION_CUR;
SYSCTL_INT(_qnx_driver, OID_AUTO, sam_drvr, CTLFLAG_RD, &drvr_ver, 0,
"Version");
…
struct _iosock_module_version iosock_module_version = IOSOCK_MODULE_VER_SYM_INIT;
static void
sam_uninit(void *arg)
{
}
SYSUNINIT(sam_uninit, SI_SUB_DUMMY, SI_ORDER_ANY, sam_uninit, NULL);
On a running target, you can use sysctl to display driver versions. For example:
# sysctl qnx.driver
qnx.driver.em: 65537
qnx.driver.vmx: 65537
qnx.driver.vtnet_pci: 65537
# sysctl qnx.module
qnx.module.usb: 65537
qnx.module.pci: 65537
qnx.module.phy: 65537
Registering the driver module
The DRIVER_MODULE macro declares and registers the driver with the system. The driver is added to the list of device drivers for a particular bus type (e.g., simplebus, pci, uhub (USB), nexus). It is possible to write a driver that can be registered against multiple bus types.
DRIVER_MODULE(sample, simplebus, sample_driver, sam_devclass, 0, 0);
DRIVER_MODULE(miibus, sam, miibus_driver, miibus_devclass, 0, 0);
DRIVER_MODULE(etherswitch, sam, etherswitch_driver, etherswitch_devclass, 0, 0);
As shown in the example, the DRIVER_MODULE macro can be used multiple
times in a driver to describe a device hierarchy. A common example of using the macro
this way is using the MII bus framework to attach a PHY driver. Another, less common
example is the EtherSwitch framework that can be used for Ethernet switch drivers. Both
frameworks include device drivers that create child devices under the sample device. You
can use the devinfo utility to see these devices. For example:
# devinfo
nexus0
  cryptosoft0
  ofwbus0
    simplebus0
      sam0
        miibus0
          ukphy0
        etherswitch0
The MODULE_DEPEND macro should also be used to show the driver dependencies (excluding library API version dependencies). For example:
MODULE_DEPEND(sample, ether, 1, 1, 1);
MODULE_DEPEND(sample, miibus, 1, 1, 1);
MODULE_DEPEND(sample, etherswitch, 1, 1, 1);
Although
io-sock does not include explicit dependency checks, drivers
won't load if there are missing dependencies. A driver’s declared dependencies can be
extracted from the binary using objdump. For example:
$ objdump aarch64-dll.le.diag/devs-sample-diag.so -t | grep depend_on
0000000000008050 l O .data 000000000000000c _sample_depend_on_etherswitch
0000000000008060 l O .data 000000000000000c _sample_depend_on_miibus
0000000000008070 l O .data 000000000000000c _sample_depend_on_ether
0000000000008080 l O .data 000000000000000c _etherswitch_sam_depend_on_kernel
0000000000008090 l O .data 000000000000000c _miibus_sam_depend_on_kernel
00000000000080a0 l O .data 000000000000000c _sample_simplebus_depend_on_kernel
The third argument to DRIVER_MODULE is a driver_t structure (sample_driver), which describes the driver. It provides the name of the device, a pointer to a list of methods, and the size of the private data (softc) required by the driver for each device instance. For example:
driver_t sample_driver = {
"sam",
sam_methods,
sizeof(struct sam_softc),
};
The driver methods are contained in a device_method_t structure, which provides probe, attach, detach, and shutdown callback methods. The following example includes the callback methods for the MII bus and EtherSwitch framework:
static device_method_t sam_methods[] = {
/* device callbacks */
DEVMETHOD(device_probe, sam_probe),
DEVMETHOD(device_attach, sam_attach),
DEVMETHOD(device_detach, sam_detach),
DEVMETHOD(device_shutdown, sam_shutdown),
/* miibus callbacks */
DEVMETHOD(miibus_readreg, sam_miibus_read_reg),
DEVMETHOD(miibus_writereg, sam_miibus_write_reg),
DEVMETHOD(miibus_statchg, sam_miibus_statchg),
#ifdef INCLUDE_ETHERSWITCH
/* etherswitch callbacks */
DEVMETHOD(etherswitch_getinfo, sam_es_getinfo),
DEVMETHOD(etherswitch_readreg, sam_es_readreg),
DEVMETHOD(etherswitch_readphyreg, sam_miibus_read_reg),
#endif
DEVMETHOD_END
};
The supported callbacks can be found in the generated interface header files (e.g., device_if.h, miibus_if.h, etherswitch_if.h). A driver does not have to provide an implementation for all the callbacks. An implementation for a callback method may be provided by the parent bus or another bus further up the device tree. The framework also includes a default implementation. An example of a method that is provided by the parent bus is the device_identify method. A driver may include its own device_identify method if it is registered to a bus such as nexus0, which cannot identify devices. For example, the driver may need to identify a virtual device or a memory-mapped device that is not in the device tree blob (DTB).
If a driver supports multiple bus types, another instance of device_method_t with a different set of methods can be specified.
For more information, see the following FreeBSD documentation:
- driver (https://www.freebsd.org/cgi/man.cgi?query=driver&sektion=9&manpath=FreeBSD+13.3-RELEASE+and+Ports)
- DRIVER_MODULE (https://www.freebsd.org/cgi/man.cgi?query=DRIVER_MODULE&sektion=9&manpath=FreeBSD+13.3-RELEASE+and+Ports)
- devclass (https://www.freebsd.org/cgi/man.cgi?query=devclass&sektion=9&manpath=FreeBSD+13.3-RELEASE+and+Ports)
- DEVICE_IDENTIFY (https://man.freebsd.org/cgi/man.cgi?query=DEVICE_IDENTIFY&sektion=9&manpath=FreeBSD+13.3-RELEASE+and+Ports)
The PHY callbacks are discussed in Attaching a PHY driver with MII bus.
Checking if a device is supported
You provide the probe callback as part of driver module initialization (see Registering the driver module).
When a device is enabled on a bus, the probe function of every driver registered on that bus is called to determine if the device is supported by that driver. In the case of the sample driver, sam_probe() checks the hardware compatibility string that was assigned to the device in the DTB. For a PCI or USB device, similar functions are available to match the vendor or device identifiers. For example:
static int
sam_probe(device_t dev)
{
if (!ofw_bus_status_okay(dev)) {
return (ENXIO);
}
/*
* Check device's compatibility string against list of
* supported devices
*/
if (ofw_bus_search_compatible(dev, compat_data)->ocd_data == 0) {
return (ENXIO);
}
device_set_desc(dev, "Sample Ethernet Controller");
return (BUS_PROBE_DEFAULT);
}
The value that the probe function returns determines which driver is the best one to use for the device. For example, BUS_PROBE_DEFAULT indicates that the device is a normal device matching a Plug and Play ID and is the normal return value for drivers to use. For a list and description of all the conventional return values, see https://www.freebsd.org/cgi/man.cgi?query=DEVICE_PROBE&sektion=9&manpath=FreeBSD+13.3-RELEASE+and+Ports.
For example, suppose that the devs-re.so, devs-em.so, and devs-ix.so drivers are loaded. If you add a PCI device, the probe functions of all the drivers registered against the PCI bus are called (re_probe(), em_probe(), ix_probe()), regardless of which driver ends up being used. If the return values indicate that devs-re.so is the best driver to use, the re_attach() function is called. Although typically there is only one match, io-sock supports having multiple drivers that support the same hardware. For example, a system may have a generic driver that supports all PHY devices in addition to more specific drivers.
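How the probe return values decide the winning driver can be sketched as follows. This is a minimal, self-contained illustration and not io-sock code. It assumes the FreeBSD convention described on the DEVICE_PROBE manual page: a successful probe returns zero or a negative priority, the value closest to zero wins, and a positive value is an errno code such as ENXIO. The DEMO_PROBE_* constants are stand-ins that mirror what are believed to be the values of BUS_PROBE_SPECIFIC, BUS_PROBE_DEFAULT, and BUS_PROBE_GENERIC (check sys/bus.h), and struct demo_candidate and demo_pick_best() are hypothetical names.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-ins for BUS_PROBE_* priorities (assumed values; see sys/bus.h). */
#define DEMO_PROBE_SPECIFIC  0
#define DEMO_PROBE_DEFAULT   (-20)
#define DEMO_PROBE_GENERIC   (-100)

/* Hypothetical record of one driver's probe result for a device. */
struct demo_candidate {
    const char *name;       /* driver name, e.g. "re" */
    int probe_result;       /* <= 0: supported, > 0: errno (not supported) */
};

/*
 * Return the candidate with the best (closest to zero) successful probe
 * result, or NULL if no driver claimed the device. On a tie, the first
 * candidate in the array is kept.
 */
const struct demo_candidate *
demo_pick_best(const struct demo_candidate *c, size_t n)
{
    const struct demo_candidate *best = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        if (c[i].probe_result > 0)
            continue;   /* this driver rejected the device */
        if (best == NULL || c[i].probe_result > best->probe_result)
            best = &c[i];
    }
    return (best);
}
```

With candidates { "re", DEMO_PROBE_DEFAULT }, { "em", ENXIO }, and { "generic", DEMO_PROBE_GENERIC }, the function selects "re", because -20 is closer to zero than -100 and the ENXIO entry is skipped.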
Because the device’s softc is re-initialized before every probe, its contents do not persist when multiple drivers are present. It is not safe to assume that any information found in the device’s softc during the probe will be available when the attach method is called.
Attaching a device
You provide the attach callback as part of driver module initialization (see Registering the driver module).
The tasks that the attach function typically performs are described in the following sections: Allocating resources, Registering interrupts, Specifying the memory for Direct Memory Access, Initialize a network interface, and Attaching a PHY driver with MII bus.
The private data structure for each instance of a device can be fetched by using device_get_softc():
static int
sam_attach(device_t dev)
{
struct sam_softc *sc;
sc = device_get_softc(dev);
...
The size of the softc is specified when the driver is registered. The software context is automatically allocated and zeroed when the device is attached.
A node variable set by sam_attach() can provide a handle to a specific place within the DT file where hardware information related to that device is kept. For example:
...
node = ofw_bus_get_node(dev);
...
Example of an Ethernet device in a DT file:
ethernet@4033c000 {
compatible = "Hardware Descriptor string from DTB";
reg = <0x0 0x4033c000 0x0 0x2000 0x0 0x4007c004 0x0 0x4>;
interrupt-parent = <0x1>;
interrupts = <0x0 0x39 0x4>;
interrupt-names = "macirq";
tx-fifo-depth = <0x5000>;
rx-fifo-depth = <0x5000>;
clocks = <0x4 0x2e 0x4 0x2e 0x4 0x38>;
clock-names = "stmmaceth", "pclk", "tx";
pinctrl-names = "default";
pinctrl-0 = <0x1c>;
phy-mode = "rgmii";
status = "okay";
};
The following code from the example sam_attach() reads a string value from a DTB:
if (OF_getprop_alloc(node, "phy-mode", (void **)&phy_mode) > 0) {
if (strcmp(phy_mode, "rgmii") == 0) {
sc->phy_mode = PHY_MODE_RGMII;
}
if (strcmp(phy_mode, "rmii") == 0) {
sc->phy_mode = PHY_MODE_RMII;
}
OF_prop_free(phy_mode);
}
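The string-to-mode mapping in this kind of code can also be written as a table lookup, which makes unknown values explicit. The following is a minimal sketch that runs outside of io-sock: the enum values are stand-ins (the real PHY_MODE_* definitions belong to the driver), and PHY_MODE_UNKNOWN and parse_phy_mode() are hypothetical names introduced for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the driver's PHY mode constants; UNKNOWN is hypothetical. */
enum phy_mode {
    PHY_MODE_UNKNOWN = 0,
    PHY_MODE_RMII,
    PHY_MODE_RGMII
};

/* Table of supported "phy-mode" property strings. */
static const struct {
    const char *name;
    enum phy_mode mode;
} phy_mode_table[] = {
    { "rmii",  PHY_MODE_RMII },
    { "rgmii", PHY_MODE_RGMII },
};

/* Map a "phy-mode" string to a mode; NULL or unmatched strings are UNKNOWN. */
enum phy_mode
parse_phy_mode(const char *s)
{
    size_t i;

    if (s == NULL)
        return (PHY_MODE_UNKNOWN);
    for (i = 0; i < sizeof(phy_mode_table) / sizeof(phy_mode_table[0]); i++) {
        if (strcmp(s, phy_mode_table[i].name) == 0)
            return (phy_mode_table[i].mode);
    }
    return (PHY_MODE_UNKNOWN);
}
```

In a driver, the string read with OF_getprop_alloc() would be passed to such a helper before being released with OF_prop_free(), and the attach function could then reject PHY_MODE_UNKNOWN instead of silently continuing.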
Using iflib for PCI drivers
The sample driver described throughout this section deals with MMIO devices that are described in a DTB. However, for PCI devices, QNX recommends using the iflib framework instead.
The iflib framework is documented here: https://man.freebsd.org/cgi/man.cgi?query=iflib&sektion=9&manpath=FreeBSD+13.3-RELEASE+and+Ports
The iflib callbacks are specified when the driver is registered. An additional callback, device_register, provides a pointer to a structure. For example:
static device_method_t sample_methods[] = {
DEVMETHOD(device_register, sample_iflib_register),
DEVMETHOD(device_probe, iflib_device_probe),
DEVMETHOD(device_attach, iflib_device_attach),
DEVMETHOD(device_detach, iflib_device_detach),
DEVMETHOD(device_shutdown, iflib_device_shutdown),
DEVMETHOD_END
};
The source code for several drivers that use the iflib framework is available from FreeBSD (e.g., igc, e1000, ixgbe, ixl).
Allocating resources
Use bus_alloc_resources() to allocate resources from a parent bus. For example:
...
bus_alloc_resources(dev, sam_spec, sc->res);
...
The res argument is an array of struct resource pointers that is used for both mapping the memory regions used by the device and interrupt mapping. The size of the array depends on the actual number of memory regions and interrupts the hardware uses. The sam_spec argument is an array in which each entry describes one resource to allocate. For example:
static struct resource_spec sam_spec[] = {
{ SYS_RES_MEMORY, 0, RF_ACTIVE },
{ SYS_RES_IRQ, 0, RF_ACTIVE },
RESOURCE_SPEC_END
};
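After bus_alloc_resources() is executed, each entry of the array passed as the third parameter (sc->res) holds a pointer to the corresponding allocated resource: a memory resource or region, or an interrupt resource. The entries are filled in the same order as the entries in sam_spec, so the interrupt resource (sc->res[1] in this example) can be passed to bus_setup_intr().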
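The relationship between the specification array and the resource array can be sketched as follows. This is a self-contained illustration rather than driver code: struct demo_resource_spec and the DEMO_* constants are simplified stand-ins for struct resource_spec, SYS_RES_MEMORY, SYS_RES_IRQ, RF_ACTIVE, and RESOURCE_SPEC_END, and their numeric values are assumptions. The sketch assumes that the terminating entry has a type of -1, as it is believed to in FreeBSD, and that entry i of the resource array corresponds to entry i of the specification.

```c
#include <stddef.h>

/* Simplified stand-ins (assumed values) for the FreeBSD definitions. */
#define DEMO_RES_IRQ     1
#define DEMO_RES_MEMORY  3
#define DEMO_RF_ACTIVE   0x2
#define DEMO_SPEC_END    { -1, 0, 0 }

struct demo_resource_spec {
    int type;   /* resource type, or -1 for the terminator */
    int rid;    /* resource ID within that type */
    int flags;
};

/* Same layout as sam_spec: one memory region, then one interrupt. */
const struct demo_resource_spec demo_spec[] = {
    { DEMO_RES_MEMORY, 0, DEMO_RF_ACTIVE },
    { DEMO_RES_IRQ,    0, DEMO_RF_ACTIVE },
    DEMO_SPEC_END
};

/* Number of resources described, which is the required size of res[]. */
size_t
demo_spec_count(const struct demo_resource_spec *spec)
{
    size_t n = 0;

    while (spec[n].type != -1)
        n++;
    return (n);
}

/* Index into res[] of the resource with this type and rid, or -1. */
int
demo_spec_index(const struct demo_resource_spec *spec, int type, int rid)
{
    int i;

    for (i = 0; spec[i].type != -1; i++) {
        if (spec[i].type == type && spec[i].rid == rid)
            return (i);
    }
    return (-1);
}
```

For this layout, demo_spec_count() reports two resources and the interrupt is found at index 1, which matches the use of sc->res[1] when the interrupt is registered.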
Registering interrupts
bus_setup_intr(dev, sc->res[1], INTR_TYPE_NET | INTR_MPSAFE, NULL,
sam_intr, sc, &sc->intr_cookie);
The &sc->intr_cookie argument is a pointer to a void pointer that bus_setup_intr() fills in when it successfully establishes the interrupt. The driver passes this cookie to bus_teardown_intr() when it removes the handler.
The interrupt filter and handler run in a QNX Interrupt Service Thread. Both the filter and handler functions run in the same context. Unlike FreeBSD, io-sock places no additional restrictions on what a filter function can do versus what a handler function can do. If a filter is specified, it must return FILTER_SCHEDULE_THREAD if it wants the optional handler to run; otherwise, it can return FILTER_HANDLED. Most drivers only specify a handler.
After the interrupt has fired, the driver should acknowledge the interrupt with the hardware. The masking and unmasking of the interrupt is handled by io-sock.
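The filter and handler contract described above can be modeled with a small, self-contained sketch. It is a model of the documented behavior, not the io-sock implementation: the DEMO_FILTER_* constants stand in for FILTER_HANDLED and FILTER_SCHEDULE_THREAD (values assumed), and demo_dispatch(), struct demo_dev, and the interrupt status bits are hypothetical.

```c
#include <stddef.h>

/* Stand-ins for FILTER_HANDLED / FILTER_SCHEDULE_THREAD (assumed values). */
#define DEMO_FILTER_HANDLED          0x02
#define DEMO_FILTER_SCHEDULE_THREAD  0x04

/* Hypothetical interrupt status bits of a simulated device. */
#define DEMO_IRQ_LINK  0x1
#define DEMO_IRQ_RX    0x2

typedef int  (*demo_filter_t)(void *arg);
typedef void (*demo_handler_t)(void *arg);

struct demo_dev {
    unsigned pending;        /* simulated interrupt status register */
    unsigned handled_count;  /* number of times the handler ran */
};

/* Filter: deal with cheap events directly, defer receive work. */
int
demo_filter(void *arg)
{
    struct demo_dev *d = arg;

    if (d->pending & DEMO_IRQ_RX)
        return (DEMO_FILTER_SCHEDULE_THREAD);
    d->pending &= ~DEMO_IRQ_LINK;   /* acknowledge the link event */
    return (DEMO_FILTER_HANDLED);
}

/* Handler: acknowledge everything that is pending and do the work. */
void
demo_handler(void *arg)
{
    struct demo_dev *d = arg;

    d->pending = 0;
    d->handled_count++;
}

/* Model of the dispatch rule: the handler runs only if no filter is
 * registered or the filter asks for it. */
void
demo_dispatch(demo_filter_t filter, demo_handler_t handler, void *arg)
{
    if (filter != NULL &&
        !(filter(arg) & DEMO_FILTER_SCHEDULE_THREAD))
        return;
    if (handler != NULL)
        handler(arg);
}
```

Passing NULL as the filter models the common case in which a driver specifies only a handler, as in the bus_setup_intr() call shown earlier.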
Specifying the memory for Direct Memory Access
Direct Memory Access (DMA) improves performance by transferring data without involving the CPU. A DMA transaction can transfer data between a device and memory, a device and another device, or memory and memory. You specify the memory for DMA transactions using the following three tasks:
- Create a memory tag
- Allocate memory
- Load the memory map of physical addresses according to hardware-specific memory constraints
Creating a memory tag
A DMA memory tag is a machine-dependent opaque type that describes the characteristics of DMA transactions. These tags are organized into a hierarchy. Because each child tag inherits the restrictions of its parent, all devices along the path of DMA transactions contribute to the constraints that apply. For example:
error = bus_dma_tag_create(
bus_get_dma_tag(sc->dev), /* Parent tag. */
1, 0, /* alignment, boundary */
BUS_SPACE_MAXADDR_32BIT, /* lowaddr */
BUS_SPACE_MAXADDR, /* highaddr */
NULL, NULL, /* filter, filterarg */
MCLBYTES, TX_DMA_MFUF_CHUNK, /* maxsize, nsegments */
MCLBYTES, /* maxsegsize */
0, /* flags */
NULL, NULL, /* lockfunc, lockarg */
&sc->txbuf_tag);
Allocating memory
Memory can be allocated as a single segment using bus_dmamem_alloc(). For example:
bus_dmamem_alloc(sc->txdesc_tag, (void**)&sc->txdesc_ring,
BUS_DMA_COHERENT | BUS_DMA_WAITOK | BUS_DMA_ZERO, &sc->txdesc_map);
Then, the initial load operation is required to obtain the bus address of the allocated memory. For example:
bus_dmamap_load(sc->txdesc_tag, sc->txdesc_map, sc->txdesc_ring,
TX_DESC_SIZE, sam_get1paddr, &sc->txdesc_ring_paddr, 0);
where sam_get1paddr is a pointer to a callback that returns the physical address of that segment.
The following loop creates a DMA map for each transmit buffer:
for (idx = 0; idx < TX_MAP_BUFFER_LEN; idx++)
    bus_dmamap_create(sc->txbuf_tag, BUS_DMA_COHERENT, &sc->txbuf_map[idx].map);
Load the memory map of physical addresses
After either the hardware or the CPU writes to the memory they share, synchronization needs to be performed by calling bus_dmamap_sync(). In the following example, bus_dmamap_sync() is called after the hardware has finished writing data to the receive buffer and returns the buffer to io-sock so that it can access that memory:
bus_dmamap_sync(sc->rxdesc_tag, sc->rxdesc_map, BUS_DMASYNC_POSTWRITE | BUS_DMASYNC_POSTREAD);
For a description of all the available memory synchronization operation specifiers, see https://www.freebsd.org/cgi/man.cgi?query=bus_dmamap_sync&sektion=9&manpath=FreeBSD+13.3-RELEASE+and+Ports.
DMA shutdown handlers
Any driver that uses DMA needs to implement a shutdown handler method that stops all DMA. It is called when io-sock terminates (either terminating normally or crashing). Without a shutdown handler, memory that was reserved for the DMA may continue to be modified by the hardware even though the system now considers this freed memory. This situation can corrupt any memory that is then provided by the system at the same memory location.
This function should shut down DMA in the quickest and simplest way possible (e.g., reset the device) and ignore other resources (e.g., memory) because those resources are cleaned up automatically when the process terminates. A more complete shutdown is done with detach().
Initialize a network interface
struct ifnet *ifp;
...
sc->ifp = ifp = if_alloc(IFT_ETHER);
if_setsoftc(ifp, sc);
if_initname(ifp, device_get_name(dev), device_get_unit(dev));
if_setflags(ifp, IFF_BROADCAST | IFF_SIMPLEX | IFF_MULTICAST);
if_setcapabilities(ifp, IFCAP_VLAN_MTU);
if_setcapenable(ifp, if_getcapabilities(ifp));
if_settransmitfn(ifp, sam_transmit);
if_setqflushfn(ifp, sam_qflush);
if_setioctlfn(ifp, sam_ioctl);
if_setinitfn(ifp, sam_init);
if_setsendqlen(ifp, TX_DESC_COUNT - 1);
if_setsendqready(ifp);
if_setifheaderlen(ifp, sizeof(struct ether_vlan_header));
ether_ifattach(ifp, macaddr);
where:
- sam_transmit is a driver-defined asynchronous transmit callback. The callback receives a pointer to an mbuf structure that the driver should transmit (see Transmitting a packet).
- sam_ioctl is a driver-defined I/O control handler. Possible I/O control codes are defined in sockio.h.
- sam_qflush is a driver-defined synchronous callback that should flush packet queues.
- sam_init is a driver-defined synchronous callback that should initialize and bring up the hardware (e.g., reset the chip and the watchdog timer, and enable the receiver unit). Because it is called on a per-interface basis, it receives a pointer to an if_softc structure as a parameter.
Attaching a PHY driver with MII bus
You can implement PHY-specific handling within a PHY-specific driver, and load it using mii_attach(). For example:
mii_attach(dev, &sc->miibus, ifp, sam_media_change,
sam_media_status, BMSR_DEFCAPMASK, phynum,
MII_OFFSET_ANY, MIIF_FORCEANEG);
The MAC driver has to provide
implementations for PHY read, write, and status change callbacks as part of driver
module initialization (see Registering the driver module). For example, in the device_method_t example provided above:
static device_method_t sam_methods[] = {
DEVMETHOD(device_probe, sam_probe),
DEVMETHOD(device_attach, sam_attach),
DEVMETHOD(device_detach, sam_detach),
DEVMETHOD(device_shutdown, sam_shutdown),
/* MII Interface */
DEVMETHOD(miibus_readreg, sam_miibus_read_reg),
DEVMETHOD(miibus_writereg, sam_miibus_write_reg),
DEVMETHOD(miibus_statchg, sam_miibus_statchg),
DEVMETHOD_END
};
- miibus_statchg is a synchronous callback that the MII bus driver calls when the physical layer establishes a link. It should respond to media changes according to the mii->mii_media_active and mii->mii_media_status bitmaps and configure the MAC interface registers accordingly.
- miibus_writereg and miibus_readreg are synchronous callbacks that write and read PHY registers.
The io-sock networking stack includes the MII bus framework and the mods-phy.so module contains several built-in PHY drivers. The external PHY driver devs-samplephy.so can be added. To use the probe helper functions, you need to add the PHY descriptor defines to the PHY driver. For example:
#define MII_OUI_xxSAMPLE 0x00aaaa
#define MII_MODEL_xxSAMPLE_2A 0x002a
#define MII_STR_xxSAMPLE_2A "Sample PHY"
static const struct mii_phydesc samplephys[] = {
MII_PHY_DESC(xxSAMPLE, 2A),
MII_PHY_END
};
static int
samplephy_probe(device_t dev)
{
return (mii_phy_dev_probe(dev, samplephys, BUS_PROBE_DEFAULT));
}
Receiving a packet
Packet reception starts with an interrupt from the hardware. After processing, filled received packets are drained from the hardware, new empty packets are passed to the hardware, and the filled received packets are passed to io-sock. For example (from sam_rxfinish_one()):
if_input(ifp, m);
Transmitting a packet
The transmit callback is called by io-sock and is registered early in the attach phase. In addition, a pointer to a transmit queue flushing function should be provided to io-sock. For example (from sam_attach()):
if_settransmitfn(ifp, sam_transmit);
if_setqflushfn(ifp, sam_qflush);
The transmit callback should be implemented asynchronously. Generally speaking, the driver first needs to determine if the hardware resources to transmit a packet are available (descriptors, buffers, etc.). If the hardware runs out of transmit resources, the driver should return from the transmit function and transmit the packet later, when those resources become available.
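One possible shape for such an asynchronous transmit path is sketched below as a self-contained simulation. It is not taken from the sample driver: packets are represented by integer IDs instead of mbufs, the descriptor ring is reduced to a counter, and every demo_* name and size is hypothetical. The choice to hold deferred packets in a small software FIFO and to return ENOBUFS when that FIFO is full is an assumption made for the illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEMO_TX_DESC_COUNT 2   /* simulated hardware descriptors */
#define DEMO_TX_QUEUE_LEN  4   /* software queue for deferred packets */

struct demo_tx {
    int free_desc;                  /* descriptors the hardware can accept */
    int queue[DEMO_TX_QUEUE_LEN];   /* deferred packet IDs (FIFO) */
    size_t q_head, q_count;
    int sent;                       /* packets handed to the hardware */
};

void
demo_tx_init(struct demo_tx *tx)
{
    memset(tx, 0, sizeof(*tx));
    tx->free_desc = DEMO_TX_DESC_COUNT;
}

/* Hand one packet to the simulated hardware. */
static void
demo_hw_start(struct demo_tx *tx, int pkt)
{
    (void)pkt;
    tx->free_desc--;
    tx->sent++;
}

/* Transmit callback: never blocks; defers the packet if the ring is full. */
int
demo_transmit(struct demo_tx *tx, int pkt)
{
    /* Send directly only if nothing is already waiting (keeps ordering). */
    if (tx->q_count == 0 && tx->free_desc > 0) {
        demo_hw_start(tx, pkt);
        return (0);
    }
    if (tx->q_count == DEMO_TX_QUEUE_LEN)
        return (ENOBUFS);
    tx->queue[(tx->q_head + tx->q_count) % DEMO_TX_QUEUE_LEN] = pkt;
    tx->q_count++;
    return (0);
}

/* Completion path (e.g. from the interrupt handler): reclaim descriptors
 * and send deferred packets now that resources are available. */
void
demo_tx_complete(struct demo_tx *tx, int done)
{
    tx->free_desc += done;
    while (tx->q_count > 0 && tx->free_desc > 0) {
        demo_hw_start(tx, tx->queue[tx->q_head]);
        tx->q_head = (tx->q_head + 1) % DEMO_TX_QUEUE_LEN;
        tx->q_count--;
    }
}
```

The transmit function returns immediately in every case, and the deferred packets go out from the completion path, which is the asynchronous behavior the paragraph above describes.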
Working with mbuf chains
An mbuf (short for memory buffer) is a basic unit of memory management that is used to store network packets and socket buffers. A network packet may span multiple mbufs arranged into an mbuf chain (linked list), which allows adding or trimming network headers with little overhead.
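The linked-list idea can be illustrated with a deliberately simplified stand-in. The struct demo_mbuf below is not the real struct mbuf and the demo_* functions are hypothetical; the sketch only shows why prepending a header to a chain is cheap (the payload is not copied) and how the number of buffers in a chain relates to the number of segments a DMA mapping may need. Real drivers should use the mbuf API rather than manipulate the structure directly.

```c
#include <stddef.h>

/* Simplified stand-in for an mbuf: a buffer plus a link to the next one. */
struct demo_mbuf {
    struct demo_mbuf *next;       /* next buffer in the chain, or NULL */
    const unsigned char *data;    /* start of the valid data */
    size_t len;                   /* number of valid bytes */
};

/* Total number of bytes in the packet described by the chain. */
size_t
demo_chain_length(const struct demo_mbuf *m)
{
    size_t total = 0;

    for (; m != NULL; m = m->next)
        total += m->len;
    return (total);
}

/* Number of buffers in the chain. */
size_t
demo_chain_segments(const struct demo_mbuf *m)
{
    size_t n = 0;

    for (; m != NULL; m = m->next)
        n++;
    return (n);
}

/* Prepend a header buffer: only a pointer changes, no payload is copied. */
struct demo_mbuf *
demo_prepend(struct demo_mbuf *hdr, struct demo_mbuf *chain)
{
    hdr->next = chain;
    return (hdr);
}
```

Prepending a 14-byte header buffer to a chain that holds a 4-byte payload yields a two-buffer chain with a total length of 18 bytes.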
To avoid compatibility issues with future versions, QNX recommends that you don't modify mbuf internals when you develop an io-sock driver. However, it is useful to understand the general structure of an mbuf, which is defined in include/devs/sys/mbuf.h.
struct mbuf *m;
m = m_getcl(M_NOWAIT, MT_DATA, M_PKTHDR);
(For flag definitions, see https://www.freebsd.org/cgi/man.cgi?query=mbuf&sektion=9&manpath=FreeBSD+13.3-RELEASE+and+Ports or mbuf.h.)
The bus_dmamap_load_mbuf_sg() function allows you to map mbuf chains for DMA transfers. In the following example, an mbuf chain is received in a transfer callback and mapped into a DMA memory transaction for the hardware:
bus_dmamap_load_mbuf_sg(sc->txbuf_tag,
sc->txbuf_map[sc->txbuf_idx_head].map, m0, seg, &nsegs, 0);
The seg argument specifies a scatter/gather segment array that the caller provides and the function fills in. The nsegs argument is returned with the number of segments filled in.
If the bus_dmamap_load_mbuf_sg() call above fails, you can collapse (i.e., defragment) the mbuf chain into a smaller number of segments and try again. For example:
m_collapse(m0, M_NOWAIT, TX_DMA_MFUF_CHUNK);
If the hardware does not support scatter/gather addressing, you can “collapse” the mbuf chain to a contiguous buffer. This method is slower. For example:
m_defrag(m0, M_NOWAIT);
Modifying the makefile
Add include devs/devs.mk to the end of common.mk, as shown in the following example common.mk, which adds include paths for io-sock header files:
ifndef QCONFIG
QCONFIG=qconfig.mk
endif
include $(QCONFIG)
define PINFO
PINFO DESCRIPTION=Sample io-sock driver
endef
include devs/devs.mk
Loading an io-sock driver
There are two ways you can load the driver:
- Load io-sock first and use mount to load the driver later. For example:
io-sock -m phy -m fdt
...
mount -T io-sock devs-sample.so
- Load the driver when you start io-sock. For example (you can specify the driver without the devs- prefix and .so extension):
io-sock -m phy -m fdt -d sample
Starting io-sock and driver management.