Block I/O support
driver [blk option[,option…]] [fstype [options]]
The driver is one of the devb-* drivers, such as devb-eide, and option is one of the options described below.
The optional fstype argument is one of the filesystem drivers (fs-*); you can follow it with options specific to the filesystem.
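As a rough illustration of the syntax above, the following sketch assembles a command line from its parts. The helper name `build_command` is hypothetical, the option values are placeholders, and joining the filesystem-specific options with commas is an assumption (the syntax line only shows commas for the blk options).

```python
def build_command(driver, blk_options=(), fstype=None, fs_options=()):
    """Assemble 'driver [blk option[,option...]] [fstype [options]]'."""
    parts = [driver]
    if blk_options:
        # blk options form a single comma-separated argument after 'blk'
        parts += ["blk", ",".join(blk_options)]
    if fstype:
        parts.append(fstype)
        if fs_options:
            # Assumption: filesystem options are comma-separated as well
            parts.append(",".join(fs_options))
    return " ".join(parts)


print(build_command("devb-eide", ["cache=8m", "commit=high"]))
print(build_command("devb-eide", ["automount=hd0t77:/disk"], "qnx4", ["commit=low"]))
```

The second call shows an option (commit) applied to one filesystem section rather than globally.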
You can specify the memory sizes used by io-blk.so with any of the following suffixes:
- b — bytes
- k — kilobytes
- m — megabytes
- p — pages
- % — percent of the total amount of cache, alloc, etc., depending on the option
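To make the suffixes concrete, here is a minimal sketch of how such a size string could be converted to bytes. It is not io-blk.so's parser: the function name `parse_size` is hypothetical, and it assumes binary multiples (1 k = 1024 bytes), a 4096-byte page, and that a value without a suffix is rejected, none of which the text above states.

```python
def parse_size(text, total=None, page_size=4096):
    """Convert a size such as '512b', '8k', '4m', '2p', or '15%' to bytes.

    total is the amount that a '%' value is a percentage of.
    """
    if len(text) < 2:
        raise ValueError(f"expected a number and a suffix: {text!r}")
    number, suffix = int(text[:-1]), text[-1].lower()
    if suffix == "b":
        return number
    if suffix == "k":
        return number * 1024              # assumed binary kilobyte
    if suffix == "m":
        return number * 1024 * 1024       # assumed binary megabyte
    if suffix == "p":
        return number * page_size         # assumed page size
    if suffix == "%":
        if total is None:
            raise ValueError("a percentage needs a total to refer to")
        return total * number // 100
    raise ValueError(f"unknown suffix in {text!r}")


print(parse_size("8k"))
print(parse_size("15%", total=256 * 1024 * 1024))
```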
You can specify the following options only in the blk section:
- alloc=policy: Set the cache/memory allocation policy to one of the following:
- cache (the default) — allocate all of the buffer cache (the cache=size) at startup, but allocate all other caches (e.g. names) on demand and let them grow to their specified limit, and then start LRUing.
- demand — allocate the buffer cache the same way, on-demand (it will grow from 0 to the size specified by the cache option as you access the disk).
- upfront — pregrow all caches to their full size. This option can be useful in RAM-tuning a system, to see how much memory the filesystem will eventually consume (things such as the name and vnode caches tend to grow over time).
- Set the amount of automounting to be performed; amount is one of:
- none — only raw block devices appear.
- partition — enumerate any partition tables.
The default is partition.
- Create a mountpoint for dev at mountpoint.
If you don't specify a full path for the device, io-blk.so uses the value of its devdir option as a prefix. For example, if devdir is /dev (the default), an option of automount=hd0t77:/disk mounts /dev/hd0t77 at /disk.
The optional fstype specifies the filesystem type, after which you can set options. The choices of filesystem and the associated shared objects are:
If not specified, the library tries to determine the filesystem automatically.
If the @filename version of this option is used, the automounts are as specified in the given file. The file is a list of mounts (using the same syntax as above), separated by newline characters or commas.
You can't locate the filename file in the filesystem to be automounted: it has to be available in an existing filesystem such as the image filesystem. Optionally, you could locate it in any devb filesystem that's already running.
To mount multiple filesystems on a (removable) device, specify that the device is shared with a + prefix. For example,
For a list of common partition types, see the Filesystems chapter of the System Architecture guide.
- The size, in bytes, of the smallest and largest physical sector. The default is 512:8K.
- cache=size: The total in-memory cache size allowed. Cache memory is allocated as necessary beyond the initial amount specified by the alloc= option until the total size is reached. The memory size uses the suffixes described above; the default is 15% of system RAM.
- Specify the delay time for delayed writes on fixed and removable media. A dirty disk block remains in the cache without being physically written to the disk for up to delay1 or delay2 seconds. The default is 2 seconds for fixed media, and 1 second for removable media. For more information, see “Controlling writing operations,” below.
- The directory in which io-blk presents the physical devices as block-special files. The default is /dev.
- Controls how major device numbers are requested; type is one of:
- name — use the name of the device (e.g. hd, cd).
- class — use the CAM class of the device (e.g. direct, readonly).
- common — use a single class for all block devices.
- Set the order for enumerating disk partitions; one of the following:
- forward — enumerate slots 1 through 4, followed by any extended partitions (the default).
- reverse — enumerate slots 4 through 1, followed by any extended partitions.
- windows — enumerate the active partitions, followed by any extended partitions, and then non-booting primary partitions. For more information about this order, see http://support.microsoft.com/kb/q51978/.
- Require/obtain exclusive access of the mount device.
- Specify the storing of open file names for the iofdinfo() query. The options for mode are:
- Try to reconstruct the file name from the contents of the directory name cache. Don't rely on this option to supply the names of all open files (a file's name is supplied only if all components of its pathname are in the name cache).
- The default. Store the name used in each open() call to ensure that this name is always available.
- Never supply the name of an open file.
- map=size: Set the number of entries in a cache used to map translations from logical blocks to physical ones. If this option isn't specified, the size is based on the value of the vnode option.
- naming=scheme: Set the device/partition naming scheme. The default is 0#. For more information, see “Naming schemes,” below.
- ncache=size: Specify a name cache of size entries. Using more name cache entries speeds up path/file lookups at the expense of memory. Setting the size to 0 disables name caching. If this option isn't specified, the size is determined from the vnode option.
- Disable asynchronous iodone processing, doing the processing instead in the context of the driver thread. By default, this is handled by a dedicated thread.
- postpone=time: Keep a dirty disk block that's being continuously modified in memory for up to time seconds before physically writing it to the disk. Setting this option ensures that a continuously modified block still gets written to disk periodically (every time seconds). The default is determined from the delwri option.
- Set the priority of periodic filesystem callouts. The default is 21.
- Set the number of protected extra LRU passes. The default is 2.
- Set the minimum and maximum size of the read-ahead buffers. The default is "4k:64k".
- Create an internal ramdisk device (/dev/ramX) of the specified size. The size variable can use the suffixes described above. The initial contents of this memory device are unspecified, so it must be formatted before use as a filesystem (see dinit in the Utilities Reference).
- The polling period, in seconds, for removable media (default: 0).
- Specify a removable media timeout of delay seconds (default: 2 seconds). After delay seconds of inactivity, a disk access prompts validation of the media with the driver; if it reports that the media has been changed, all data blocks and cached information for that device are discarded and relearned.
- Set the thread pool parameters (maximum, low water, and high water). The default is 12:2:5.
- verbose[=level]: Be verbose. The output is sent to the system logger.
The optional level argument is a series of alphabetic characters that indicates the categories of event to log:
- b — bad blocks
- c — configuration
- i — input
- o — output
- d — direct I/O
- r — removable
- v — virtual filesystem (VFS)
- f — fsys module (fs-*)
An option of blk verbose means all, blk verbose=io means input plus output, blk verbose=!r means everything except removable, and so on. The default is bcdfiorv.
- vnode=size: Specify the number of vnode entries (filesystem-independent inodes). The default is 1280 entries. Up to size vnodes may be active. Vnodes remain in this cache when the corresponding file is closed, making subsequent opens faster.
- Set the amount of cache that may be occupied by a single file. This option is used to prevent the “cache wiping” phenomenon, where reading a large file may flush a large proportion of buffer cache. The size can use the suffixes described above; the default is 100% (i.e. no limit is enforced).
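The verbose level syntax described above (a bare option, a list of letters, or a `!` prefix to exclude letters) can be sketched as follows. This is an illustrative parser, not the library's own: the function name `parse_verbose` is hypothetical, and it only handles the three forms shown in the examples, since the text doesn't spell out any others.

```python
# Category letters and meanings, as listed for the verbose option
CATEGORIES = {
    "b": "bad blocks",
    "c": "configuration",
    "i": "input",
    "o": "output",
    "d": "direct I/O",
    "r": "removable",
    "v": "virtual filesystem (VFS)",
    "f": "fsys module (fs-*)",
}


def parse_verbose(option):
    """Return the set of category letters selected by a verbose option."""
    name, _, level = option.partition("=")
    if name != "verbose":
        raise ValueError(f"not a verbose option: {option!r}")
    if not level:
        return set(CATEGORIES)            # bare 'verbose' selects everything
    negate = level.startswith("!")
    letters = set(level[1:] if negate else level)
    unknown = letters - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    # '!' selects every category except the listed ones
    return set(CATEGORIES) - letters if negate else letters


for example in ("verbose", "verbose=io", "verbose=!r"):
    print(example, "->", "".join(sorted(parse_verbose(example))))
```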
You can apply the following options globally (in the blk section) or to a specific filesystem (for example, in the qnx4 section for a QNX 4 filesystem):
- after: Mount the filesystem so that it's resolved after any other filesystems mounted at the same pathname (in other words, it's placed behind any existing mount). When you access a file, the system looks on this filesystem last, and only if the file wasn't found on any other filesystems.
- before: Mount the filesystem so that it's resolved before any other filesystems mounted at the same pathname (in other words, it's placed in front of any existing mount). When you access a file, the system looks on this filesystem first.
- commit=level: Set the committing level of the filesystem, which controls how dirty system/user blocks are written to disk. The level is one of none, low, medium (the default), and high. If it's none, all writes are time-delayed (as specified by the delwri option); at high, all writes are performed synchronously. For more information, see “Controlling writing operations,” below.
- Set the action to perform when an fs-* filesystem module detects an internal error. The action is one of:
- ebadfsys — simply return EBADFSYS to the client.
- mountro — return EBADFSYS to the client and remount the affected filesystem as read-only.
The default is ebadfsys.
- Set the filesystem-dirty marking behavior. The mode must be none or mount (the default). If marking is on, the filesystem is marked as being dirty when it's mounted, and it's marked as being clean when it's unmounted. The method of marking depends on the filesystem.
- Update/don't update the file's directory entry if the only change is the access time. The noatime option isn't strict POSIX 1003.2 behavior, but it's faster.
- Allow/don't allow files to be created on this filesystem.
- Allow/don't allow file execution from this filesystem.
- Lock/don't lock removable media. If locked, the medium is treated as fixed.
- Don't/do allow invalid mounts on removable media (re-insert).
- Ignore/don't ignore the set-user ID bit on files in this filesystem.
- Mount all drives/filesystems as read-only.
- Mount all drives/filesystems as read-write (if the physical media permit). This is the default.
For more information about the before and after options, see “Ordering mountpoints” in the Process Manager chapter of the System Architecture guide.
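The effect of the before and after options on lookups can be pictured with a toy model. The sketch below is not how the process manager is implemented; the function names and the filesystem labels (`fs_a`, `fs_b`, `fs_c`) are hypothetical, and each filesystem is reduced to a set of file names mounted at the same pathname.

```python
def mount(order, name, placement):
    """Return a new resolution order with `name` added in front of or behind the existing mounts."""
    if placement == "before":
        return [name] + order
    if placement == "after":
        return order + [name]
    raise ValueError(f"placement must be 'before' or 'after': {placement!r}")


def resolve(order, contents, filename):
    """Return the first filesystem in resolution order that holds filename."""
    for fs in order:
        if filename in contents[fs]:
            return fs
    return None


contents = {"fs_a": {"x"}, "fs_b": {"x", "y"}, "fs_c": {"x"}}
order = ["fs_a"]
order = mount(order, "fs_b", "after")    # consulted last
order = mount(order, "fs_c", "before")   # consulted first
print(order)
print(resolve(order, contents, "x"), resolve(order, contents, "y"))
```

In this model a file that exists only on the filesystem mounted with after is still found, but only once every other filesystem at that pathname has been checked.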
The default values of the map and ncache options are based on the value of the vnode option. This arrangement lets you configure a system by specifying the cache size and the number of files, and letting the library set the other options.
There are various types of writing operations:
- Synchronous (SYNC): start immediately and wait for completion.
- Asynchronous (ASYNC): start immediately but don't wait for completion.
- Delayed (DELWRI): don't start until after a timeout period, and then perform the write asynchronously. The blk delwri= option controls the timeout for the delayed format; if you set this option to 0, a delayed writing operation is the same as asynchronous.
- As required: write only if you have to.
The types of data include:
- User data: what you read() and write().
- Metadata: things associated with stat(), such as times and IDs.
- Filesystem data: things such as bitmaps, extents, etc.
If a file has no links, the “as required” form of write operation is used, never going to disk unless the buffer or cache is needed (since the file has no links, the data isn't expected to be accessible after a power failure). If you open a file with O_SYNC, the synchronous format is always used.
Otherwise, the blk commit level controls the type of write to use for each level of data:
|commit=||Filesystem data||Metadata||User data|
If you specify commit=none, you lose all write ordering (both for single multiblock updates and multiple-user operations). Hence, your chances of a useful recovery following a power failure are poor. We recommend that you use this option only if you have an uninterruptible power supply (UPS), or if you don't mind using dinit on your filesystem as a recovery tool.
Calling close() might force a metadata update, but does nothing to the user data. Calling fsync() always forces out any delayed-write blocks for the file, and so is useful only when commit isn't high.
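From the application side, the two ways of forcing data out mentioned above (opening with O_SYNC, or calling fsync() after writing) look roughly like this. The sketch uses the POSIX calls exposed by Python's `os` module on a Unix-like host; the helper names are hypothetical, and how the calls interact with a particular commit level on a target system is not something this example demonstrates.

```python
import os
import tempfile


def write_sync(path, data):
    """Open with O_SYNC so that each write() waits for completion."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC, 0o600)
    try:
        os.write(fd, data)   # small payload; a robust version would loop on short writes
    finally:
        os.close(fd)


def write_then_fsync(path, data):
    """Write normally, then use fsync() to force out any delayed-write blocks."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, data)
        os.fsync(fd)
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    write_sync(os.path.join(tmp, "a.log"), b"record 1\n")
    write_then_fsync(os.path.join(tmp, "b.log"), b"record 2\n")
    print(sorted(os.listdir(tmp)))
```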
You can use the naming=scheme option to specify the naming scheme to use for devices and partitions. The format of scheme is as follows:
- 0# (where 0 is any digit and sets the first/base number): The raw devices are named 0, 1, and so on, and partitions are named from the device with a t followed by the OS type of the partition (see the Filesystems chapter of the System Architecture guide). For example, a QNX partition could be named hd0t77. For duplicate partitions, a period (.) and sequence number are appended (e.g. hd0t12, hd0t12.1, and hd0t12.2 for logical/extended DOS partitions). This is the QNX Neutrino naming scheme.
- 0a (actually any digit and any letter; these set the first/base name): The raw devices are named 0, 1, and so on, and partitions are named a, b, and so on (e.g. /dev/hd0, /dev/hd0a, /dev/hd0b, /dev/hd0c, and so on). The name doesn't indicate the OS type of the partitions, just the order in which they were found.
- a1 (actually any letter and any digit; these set the first/base name): The raw devices are named a, b, and so on. Primary partitions are named 1, 2, 3, and 4; if you don't have four of them, the unused numbers are skipped. Any extended partitions are numbered without gaps from 5 (e.g. /dev/hda, /dev/hda1, /dev/hda2, /dev/hda5, and so on). The name doesn't indicate the OS type of the partition, just its location. This is the Linux naming scheme.
The default naming scheme is 0#.
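The three schemes can be compared side by side with a small name generator. This is a sketch built from the examples above, not the library's algorithm: the function `device_names` is hypothetical, the OS type is assumed to be written in decimal (as in hd0t77), and for the a1 scheme the base digit is assumed to shift the primary slot numbers so that extended partitions start four places after it.

```python
from string import ascii_lowercase


def device_names(scheme, prefix, primaries, extended=(), unit=0):
    """Return the raw device name followed by its partition names.

    primaries maps a slot number (1-4) to an OS type; extended lists the
    OS types of extended partitions in the order they were found.
    """
    if len(scheme) != 2:
        raise ValueError(f"unsupported scheme: {scheme!r}")
    first, second = scheme
    ordered = [primaries[slot] for slot in sorted(primaries)] + list(extended)
    if first.isdigit() and second == "#":
        raw = f"{prefix}{int(first) + unit}"
        names, seen = [raw], {}
        for os_type in ordered:
            count = seen.get(os_type, 0)
            seen[os_type] = count + 1
            suffix = f".{count}" if count else ""   # duplicates get .1, .2, ...
            names.append(f"{raw}t{os_type}{suffix}")
        return names
    if first.isdigit() and second.isalpha():
        raw = f"{prefix}{int(first) + unit}"
        start = ascii_lowercase.index(second.lower())
        # Letters reflect discovery order only (no handling past 'z' here)
        return [raw] + [raw + ascii_lowercase[start + i] for i in range(len(ordered))]
    if first.isalpha() and second.isdigit():
        raw = prefix + ascii_lowercase[ascii_lowercase.index(first.lower()) + unit]
        base = int(second)
        names = [raw]
        names += [f"{raw}{base + slot - 1}" for slot in sorted(primaries)]
        names += [f"{raw}{base + 4 + i}" for i in range(len(extended))]
        return names
    raise ValueError(f"unsupported scheme: {scheme!r}")


print(device_names("0#", "hd", {1: 77}))
print(device_names("0a", "hd", {1: 77, 2: 12, 3: 6}))
print(device_names("a1", "hd", {1: 77, 2: 12}, extended=[12]))
```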
Change to a different naming scheme at your own risk.
For more information, see the Filesystems chapter of the System Architecture guide.