Filesystems and Power Failures

This technote includes:

Introduction

How do we make sure that the hard disk integrity is maintained during power failures?

The DOS, EXT2, and QNX 4 filesystems aren't currently power-safe, so file or partition corruption on a hard drive could occur during a power failure. However, a number of measures can be implemented to reduce the amount of resulting HDD corruption.

The Power-Safe (fs-qnx6.so) filesystem uses a copy-on-write (COW) technique to always maintain an uncorrupted version of the filesystem, even if a power failure occurs. For more information, see the Filesystems chapter of the System Architecture guide.

The rest of this document explains:

Guidelines for using hard drives in an environment subject to abrupt power failures

A lot of work in recent years went into the QNX 4 filesystem design to make it more resistant to power-failure scenarios (through the costly use of synchronised writes to enforce ordering), but it wasn't designed to handle corrupted sectors (either real bad blocks or virtual ECC ones).

This type of corruption during power failures occurs only when a lack of atomic sector writes causes an ECC error to manifest as a bad block. On hard drives that retain valid sector contents, when a metadata update spans multiple sectors, the filesystems use ordered writes so that if power is lost (and thus only some of the writes were made), they err on the side of lost resources rather than filesystem corruption.

For example, when growing a file in the QNX 4 filesystem, the blocks are first marked as used in the bitmap (a write to the bitmap) and then assigned to the file (a write to the inode). If a power failure occurs between the two writes, some blocks could be marked in the bitmap without belonging to any file. With the alternative ordering, the file could be using blocks that the bitmap still marks as free; those blocks could then be allocated to another file, resulting in cross-linked files.
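The effect of the write ordering can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not the QNX 4 implementation: the bitmap is modelled as a set of used block numbers, the inode as a list of block numbers, and the names grow_file, check, and simulate are hypothetical.

```python
def grow_file(bitmap, inode, blocks, bitmap_first, power_fails):
    """Grow a file with two separate metadata writes, optionally losing power between them."""
    steps = [
        lambda: bitmap.update(blocks),  # write to the bitmap: mark the blocks as used
        lambda: inode.extend(blocks),   # write to the inode: assign the blocks to the file
    ]
    if not bitmap_first:
        steps.reverse()
    steps[0]()
    if power_fails:
        return  # the second write never reaches the disk
    steps[1]()


def check(bitmap, files):
    """Return (leaked, unprotected) block sets for a simulated filesystem."""
    owned = {block for blocks in files.values() for block in blocks}
    leaked = bitmap - owned        # marked used, but belonging to no file
    unprotected = owned - bitmap   # used by a file, but still marked free
    return leaked, unprotected


def simulate(bitmap_first):
    """Lose power between the two writes and report the resulting state."""
    bitmap = set()
    files = {"a": []}
    grow_file(bitmap, files["a"], [10, 11], bitmap_first, power_fails=True)
    return check(bitmap, files)


if __name__ == "__main__":
    print("bitmap first:", simulate(bitmap_first=True))
    print("inode first: ", simulate(bitmap_first=False))
```

With the bitmap written first, the interrupted update only leaks blocks (a lost resource). With the inode written first, the file holds blocks that an allocator would still consider free, which is the precondition for cross-linked files.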

Our recent investigations do show possible physical I/O errors (ECC errors) with loss of data. It's very difficult for the current QNX 4 filesystem, which overwrites metadata in-place, to prevent this situation.

Based on our investigation of multiple examples of corrupted filesystems, we have compiled this document with recommendations that should help limit catastrophic damage to the hard disk (i.e. not being able to mount it).

Implementing some of the strategies described below to limit catastrophic filesystem corruption, together with the associated recovery procedures, should be less risky to projects than introducing radical filesystem changes.

Recipe for creating hard drive corruption

Hard drive corruption occurs when a power failure (e.g. a crank scenario, or a dead battery and a stopped alternator, with no large capacitors available) happens while data is being physically written to a file or a directory on the disk, as opposed to being written to the driver cache or the drive cache.

Hard drive block corruption is a generic problem for drives that don't offer an atomic sector update guarantee. HDDs that do offer atomic sector updates either leave the original data unchanged or completely write the new block content. On HDDs that don't offer this capability, a write could be interrupted by an emergency head unload after only half a block has been written; the block then becomes unreadable because its ECC doesn't match its contents. This could affect any block and make it appear as a bad block I/O error. Our QNX 4 disk filesystem makes no guarantee in the presence of physical I/O errors of this type.
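A torn sector write of this kind can be modelled as follows. This is an illustrative sketch only: the Sector class is hypothetical, a CRC-32 checksum stands in for the drive's real ECC, and the 512-byte sector size is an assumption.

```python
import zlib

SECTOR_SIZE = 512  # assumed sector size for this sketch


class Sector:
    """A sector whose check code is only updated when a write completes."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.check = zlib.crc32(self.data)  # CRC-32 stands in for the drive's ECC

    def write(self, new_data, bytes_before_power_loss=None):
        """Write new_data; if power is lost early, only a prefix reaches the media."""
        written = len(new_data) if bytes_before_power_loss is None else bytes_before_power_loss
        self.data[:written] = new_data[:written]
        if written == len(new_data):
            self.check = zlib.crc32(self.data)  # updated only for a completed write

    def read(self):
        """Return the contents, or fail the way a bad block would."""
        if zlib.crc32(self.data) != self.check:
            raise OSError("bad block: check code does not match sector contents")
        return bytes(self.data)


if __name__ == "__main__":
    sector = Sector(b"A" * SECTOR_SIZE)
    sector.write(b"B" * SECTOR_SIZE, bytes_before_power_loss=SECTOR_SIZE // 2)
    try:
        sector.read()
    except OSError as err:
        print("read failed:", err)
    sector.write(b"C" * SECTOR_SIZE)  # a complete rewrite makes the sector readable again
    print("read ok after rewrite:", sector.read() == b"C" * SECTOR_SIZE)
```

The sector is neither the old content nor the new content after the interrupted write, so every read fails until the whole sector is rewritten. A drive with atomic sector updates would never expose this intermediate state.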

Various types of hard drive corruption could occur, depending on the scenario:

Corruption              Effects
File corruption         Loss of data in the file, or the inability to open the file. If this happens to data or configuration files, then some systems might not be able to restart themselves.
Directory corruption    Loss of files.
Root block corruption   Inability to mount the disk partition or boot from it.
.inode corruption       Loss of files, or loss of long filenames.
.bitmap corruption      Inability to grow or delete files.

How to limit the possible hard drive corruption

There are several ways to reduce the amount of hard drive corruption during a power failure. Avoiding writes to files on the hard drive as much as possible, or mounting a partition read-only, is obviously the best way to prevent any corruption. However, if writing to a hard disk can't be avoided, there are a few guidelines that will help reduce (but not completely eliminate) catastrophic corruption:

How to repair hard disk corruption

Our disk filesystems make no guarantee in the presence of physical bad blocks (which is what the power failure results in). This type of I/O error isn't handled at the driver or filesystem level, but it could trigger a notification from the block driver (devb-*) to a user application that would attempt to repair the error (by writing to the bad block) or, in the worst case, reformat the disk. This would require a high-level application to monitor these errors and repair them using application knowledge. The disk filesystems can't magically recover from any/all physical bad blocks.

The application layer should also determine whether the system was shut down correctly or not and take corrective measures as necessary. For example, it could start chkfsys, a useful recovery utility that checks the disk integrity:

# chkfsys -v -f -m /dev/hdxtxx

For more information, see its entry in the Utilities Reference.
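One possible way for the application layer to detect an unclean shutdown is a marker file that is created at startup and removed only during an orderly shutdown. The sketch below assumes this approach; the function names and the marker path are hypothetical, the chkfsys options are the ones from the command above, and /dev/hdxtxx remains a placeholder for the real device. The function only builds the command line; whether and how to run it is left to the application.

```python
import os
import tempfile

CHKFSYS_OPTIONS = ["-v", "-f", "-m"]  # same options as the chkfsys command above


def startup(marker_path, device):
    """Return the chkfsys command to run if the last shutdown was unclean, else None."""
    unclean = os.path.exists(marker_path)  # marker left behind: no clean shutdown happened
    # (Re)create the marker; it stays in place until clean_shutdown() removes it.
    with open(marker_path, "w") as marker:
        marker.write("running\n")
        marker.flush()
        os.fsync(marker.fileno())  # push the marker out of the OS cache
    return ["chkfsys", *CHKFSYS_OPTIONS, device] if unclean else None


def clean_shutdown(marker_path):
    """Remove the marker as the last step of an orderly shutdown."""
    os.remove(marker_path)


if __name__ == "__main__":
    marker = os.path.join(tempfile.mkdtemp(), "running.flag")  # hypothetical location
    print(startup(marker, "/dev/hdxtxx"))  # first boot: no marker yet
    # Simulate a power failure: clean_shutdown() is never called.
    print(startup(marker, "/dev/hdxtxx"))  # marker found: a check is requested
```

Note that the marker itself is a write to storage, so where it lives is a design decision for the system; this sketch does not address that.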

Power failures while writing

What happens when we switch the power off, and files are still open for write access? Do we get invalid files / bad blocks?

If the power is physically switched off without taking the proper precautions, the following could happen:

For more information, see the Backing Up and Recovering Data chapter of the QNX Neutrino User's Guide.